Glycopeptides not showing up when working with data from MSFragger (DDA MS1 filtering) cpavan  2023-01-18 12:31
 

Dear Skyline staff,

I’m trying to analyze a dataset with DDA MS1 filtering to compare the XICs of glycopeptides. The peptide search comes from MSFragger (glyco search). I can create a library with different peptides from the identified proteins, but I can't see the glycopeptides identified by MSFragger, which are what I am actually interested in.
I have tried to add the glycosylation in the peptide settings, but I still only see unmodified peptides (only carbamidomethylation and Met oxidation).
Any suggestions to solve that?

thank you so much in advance,

Carlos

 
 
Nick Shulman responded:  2023-01-18 12:54
I do not understand your question but if you send us your Skyline document we might be able to figure it out.
In Skyline you can use the menu item:
File > Share
to create a .zip file containing your Skyline document and supporting files including the spectral library that you probably created from your peptide search results.

If that .zip file is less than 50MB you can attach it to this support request. You can upload larger files here:
https://skyline.ms/files.url

-- Nick
 
cpavan responded:  2023-01-18 13:37
Dear Nick,

Thanks so much for your prompt response.
I’m a newbie with Skyline and my explanation was a little confusing.
I’m attaching the files you suggested. The .zip is Skyle Question.zip (by mistake I also uploaded an uncompressed file, which you can discard).
You will find the following files:
*Psm.tsv: this is the result from MSFragger that contains the glyco-PSMs as well as PSMs from several .raw files. This is the data I’m trying to analyze with Skyline, but as Skyline doesn’t support this file, I uploaded the .pepXML from one raw file just to try.
*200921_SAM_OC107_Elute1_Slice4.pepXML: the search result that I could load into Skyline
The rest of the files are related to Skyline.
Please let me know if you can find them.
Thanks in advance,

Carlos

PS: just to clarify a little bit: my goal is to get the XICs of glycopeptides, let's say PEPTIDE+HexNAc and PEPTIDE+HexNAcHex, to compare them. But in the Skyline library I can't see any glycopeptides, just peptides without modifications apart from Cys CAM or Met ox.
 
Nick Shulman responded:  2023-01-18 14:21
It sounds like you were hoping that there would be some peptides in your library that had the "+203.08" modification on them, but no peptides like that made it into the library.

There are a couple of things that might be going wrong:

1. There are PSMs with that modification on them, but all of those PSMs are "ambiguous".
By default, ambiguous spectra are not included in the library. An ambiguous spectrum is a spectrum that had more than one confident peptide spectrum match. Sometimes, if you have asked the peptide search engine to look for many modifications that are similar to each other, the peptide search engine will assign many different peptides to the same spectrum, because the peptide search engine can't tell which exact modification is on that peptide. With your data, it looks like you have told the peptide search engine that every amino acid could potentially have the +203.08 modification on it. For this reason, I imagine that for any spectrum that looks like it has a +203.08 modification on it, your peptide search engine has given you many hits representing each of the possible positions that the +203.08 might be applied to in the peptide sequence.
If you want to include matches from ambiguous spectra then you should check the checkbox that says "Include ambiguous matches".

2. The peptide spectrum matches had scores that were too poor to be included in the library.
When you are building the library, you can specify the "Score Threshold" to use, and PSMs that scored worse than that do not get included in the .blib file.
The PSMs in your pep.xml file are using the "X! Tandem expectation" score, and if you set the score threshold to 1, then all peptide spectrum matches will be included in the .blib file. Note that it is usually a bad idea to build a library with a score threshold of 1, since you end up with a lot of spectra which are very unlikely to be correct, but this is a good way to figure out whether the score threshold is the reason that you are not seeing the PSMs that you want.
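
To get a feel for which of these two cases applies before building the library, the pepXML can be tallied directly. The following is only a minimal Python sketch, not anything Skyline or BiblioSpec runs. `classify_spectra` and `local_name` are hypothetical helper names, and it assumes that a "confident" hit simply means an `expect` value at or below the chosen cutoff, and that a spectrum with more than one such hit counts as ambiguous.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}search_hit'."""
    return tag.rsplit("}", 1)[-1]


def classify_spectra(root, max_expect):
    """Count spectra with zero, one, or several hits passing the cutoff.

    Assumption: a hit passes when its 'expect' search_score is <= max_expect.
    """
    counts = Counter()
    for query in root.iter():
        if local_name(query.tag) != "spectrum_query":
            continue
        passing = 0
        for hit in query.iter():
            if local_name(hit.tag) != "search_hit":
                continue
            for score in hit:
                if (local_name(score.tag) == "search_score"
                        and score.get("name") == "expect"
                        and float(score.get("value")) <= max_expect):
                    passing += 1
        if passing == 0:
            counts["none"] += 1
        elif passing == 1:
            counts["unique"] += 1
        else:
            counts["ambiguous"] += 1
    return counts


# Tiny made-up example: one spectrum with two good hits, one with a weak hit.
SAMPLE = """<msms_run_summary>
  <spectrum_query spectrum="a.1.1.2">
    <search_result>
      <search_hit peptide="PEPTIDEK"><search_score name="expect" value="1e-5"/></search_hit>
      <search_hit peptide="PEPTLDEK"><search_score name="expect" value="2e-5"/></search_hit>
    </search_result>
  </spectrum_query>
  <spectrum_query spectrum="a.2.2.2">
    <search_result>
      <search_hit peptide="SAMPLEK"><search_score name="expect" value="0.5"/></search_hit>
    </search_result>
  </spectrum_query>
</msms_run_summary>"""

counts = classify_spectra(ET.fromstring(SAMPLE), max_expect=0.01)
print(dict(counts))
```

For a real file, `ET.parse(path).getroot()` would supply `root`. A large "ambiguous" count would point to the first case, and a large "none" count to the second.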

I can't build a library from your .pepXML file because I don't have the .mzML (or whatever) file that you searched, so I can't be sure what is going on, but it's probably one of these two things.
-- Nick
 
cpavan responded:  2023-01-25 09:41
Dear Nick,

Thank you for your comments last week. Unfortunately, I'm still stuck in the same place. I have tried checking "Include ambiguous matches" and nothing has changed.
In addition, I'm not sure where the field for setting the "Score Threshold" you mentioned is (is it in the first window when building the library?).
When I go to the settings and add Hex and HexNAcHex as peptide modifications, I get modified peptides that I don't have in my search results.
Is there any possibility to upload the .raw file so that you can try to build the library and figure out what is going wrong?

Thanks in advance for your help,
 
Nick Shulman responded:  2023-01-25 09:58
Yes, you can upload all of your files to that same spot:
https://skyline.ms/files.url

The place to set the "Score Threshold" is in a grid.
If you are using the "File > Import Peptide Search" wizard, then the grid shows up after you have told Skyline which peptide search results files you want to use. Skyline fills in the rows of the grid with the name of the file, the "Score Type" and the "Score Threshold". You can change the number in the Score Threshold column.

-- Nick
 
cpavan responded:  2023-01-25 11:43
Thanks, Nick

I have uploaded three files:
200921_SAM_OC107_Elute1_Slice4.RAW
200921_SAM_OC107_Elute1_Slice4.pepxml (from MSFragger)
psm.tsv (the other output from MSFragger, but I can't load that one; is it not compatible with Skyline?)

Unfortunately, even after changing the "Score Threshold" I could not get the glyco PSMs.

thanks,
 
Nick Shulman responded:  2023-01-25 13:31

I am not an expert on pep xml files, but it looks like the PSMs in the pep xml file are very different from the rows in the "psm.tsv" file.

For instance, here is something from line number 43819 from "psm.tsv" which looks like it is a peptide with the +203.079 modification:

Peptide Assigned Modifications
PAATKPATTKPMVK 12M(15.9949),1P(203.0794)

but here is what that looks like in the pep.xml file:

<spectrum_query start_scan="6166" uncalibrated_precursor_neutral_mass="1658.8953" assumed_charge="3" spectrum="200921_SAM_OC107_Elute1_Slice4.6166.6166.3" spectrumNativeID="controllerType=0 controllerNumber=1 scan=6166" end_scan="6166" index="2143" precursor_neutral_mass="1658.8948" retention_time_sec="1314.2492294311523">
<search_result>
<search_hit peptide="PAATKPATTKPMVK" massdiff="203.0830078125" calc_neutral_pep_mass="1455.8118" peptide_next_aa="M" num_missed_cleavages="2" num_tol_term="2" protein_descr="Collagen alpha-3(VI) chain OS=Homo sapiens OX=9606 GN=COL6A3 PE=1 SV=5" num_tot_proteins="1" tot_num_ions="52" hit_rank="1" num_matched_ions="9" protein="sp|P12111|CO6A3_HUMAN" peptide_prev_aa="K" is_rejected="0">
<modification_info modified_peptide="PAATKPATTKPM[147]VK">
<mod_aminoacid_mass mass="147.0354" position="12"/>
</modification_info>
<search_score name="hyperscore" value="18.976"/>
<search_score name="nextscore" value="0.000"/>
<search_score name="expect" value="6.993e-05"/>
<ptm_result localization="" best_score_with_ptm="13.720062" score_without_ptm="18.9769" localization_peptide="PAATKPATTKPMVK" second_best_score_with_ptm="12.419948" ptm_mass="203.07938"/>
</search_hit>
</search_result>
</spectrum_query>

Those aren't remotely similar so I am not sure what could be going wrong, but maybe someone else on this support board might have an idea.
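
One way to line the two files up is to pull the mass-related fields out of the pepXML hit and read them next to the psm.tsv row. Here is a minimal Python sketch using a shortened copy of the hit above. `summarize_hit` and `local_name` are hypothetical helpers, not part of Skyline or BiblioSpec.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the search_hit quoted above (most attributes trimmed).
HIT_XML = """<search_hit peptide="PAATKPATTKPMVK" massdiff="203.0830078125" calc_neutral_pep_mass="1455.8118">
<modification_info modified_peptide="PAATKPATTKPM[147]VK">
<mod_aminoacid_mass mass="147.0354" position="12"/>
</modification_info>
<ptm_result ptm_mass="203.07938"/>
</search_hit>"""


def local_name(tag):
    """Strip an XML namespace prefix, if the file uses one."""
    return tag.rsplit("}", 1)[-1]


def summarize_hit(search_hit):
    """Collect the peptide, the hit-level massdiff, and site-specific masses."""
    site_mods = []
    ptm_mass = None
    for element in search_hit.iter():
        name = local_name(element.tag)
        if name == "mod_aminoacid_mass":
            # (1-based position, mass of the modified residue)
            site_mods.append((int(element.get("position")),
                              float(element.get("mass"))))
        elif name == "ptm_result":
            ptm_mass = float(element.get("ptm_mass"))
    return {
        "peptide": search_hit.get("peptide"),
        "massdiff": float(search_hit.get("massdiff")),
        "site_mods": site_mods,
        "ptm_mass": ptm_mass,
    }


summary = summarize_hit(ET.fromstring(HIT_XML))
for key, value in summary.items():
    print(key, value)
```

Run on the hit above, the only site-specific entry is the 147.0354 at position 12 (the mass of the oxidized methionine residue, 131.0405 + 15.9949), while the 203.08 shows up only in `massdiff` and `ptm_mass`.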
-- Nick

 
cpavan responded:  2023-01-25 14:02

thanks, Nick,

So basically, Skyline doesn't support working with the psm.tsv file? This is actually where I take the PSMs from MSFragger.
thanks,

Carlos

 
Nick Shulman responded:  2023-01-25 14:33
This page shows which file types BiblioSpec can handle. It says that for MSFragger you need to use the pepxml files.
https://skyline.ms/wiki/home/software/BiblioSpec/page.view?name=BlibBuild

The developers of MSFragger have posted many questions on this support board, so I believe that Skyline usually works with MSFragger pepxml files.
I don't know what might be different about your search results compared to the search results that usually work.

If you would like I could email the msfragger people and ask them to take a look at this support request. Can you give us more information about how you created the files that you have sent us?
-- Nick
 
cpavan responded:  2023-01-26 08:09
Dear Nick,

Thank you so much for your help and yes, go ahead and ask the MSFragger developers about this issue, please.
As a summary: I took some raw files from an already published study (Malaker et al., 2022) to compare the MetaMorpheus and MSFragger output. I ran MSFragger with the O-glyco search workflow and the default glyco database. I have attached a param file from MSFragger.
With these results from MSFragger I was interested in working with Skyline to compare glycosite intensities, as described in Reiding et al., 2019.
Thank you,

Carlos
 
Nick Shulman responded:  2023-01-26 21:58

This is the answer that I got so far:

In the pep.xml file, the PSMs are from MSFragger and the glycosylation information is in the mass diff not in the variable modification. After Philosopher generates the psm.tsv, there is another tool, PTM-Shepherd, in FragPipe that modifies the psm.tsv to refine and append the glycosylations as a variable modification. I think that is why you see different modifications between pep.xml and psm.tsv files.
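
The numbers in the pepXML hit quoted earlier in this thread are consistent with that explanation: the difference between the observed precursor mass and the calculated peptide mass is what gets reported as the mass diff. Here is a small Python check with the three values copied from that hit. The helper name `ppm_error` and the choice to express the error in ppm of the precursor mass are my own.

```python
# Values copied from the spectrum_query / search_hit shown earlier.
precursor_neutral_mass = 1658.8948   # observed
calc_neutral_pep_mass = 1455.8118    # calculated for the peptide as identified
ptm_mass = 203.07938                 # from the ptm_result element

# The unexplained mass left over after the peptide is accounted for.
mass_offset = precursor_neutral_mass - calc_neutral_pep_mass


def ppm_error(observed_offset, expected_offset, precursor_mass):
    """Difference between two offsets, in ppm of the precursor mass."""
    return (observed_offset - expected_offset) / precursor_mass * 1e6


print(f"mass offset: {mass_offset:.4f}")
print(f"error vs ptm_mass: "
      f"{ppm_error(mass_offset, ptm_mass, precursor_neutral_mass):.1f} ppm")
```

The offset comes out at about 203.083, matching the reported massdiff, and lies within a few ppm of the 203.0794 that the psm.tsv row lists as an assigned modification.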

BiblioSpec supports a simple tab-separated-value format where you just supply the file name, spectrum number, charge and peptide sequence.
The format is documented on this page:
https://skyline.ms/wiki/home/software/BiblioSpec/page.view?name=BiblioSpec input and output file formats
The format is called "ssl" which stands for "spectrum sequence list".

In theory, it would be straightforward to convert the psm.tsv file that you have into the ssl format that BiblioSpec can handle. Unfortunately, BiblioSpec is expecting a column with a modified sequence with mass differences in square brackets after the amino acids. There is no column that looks like that in psm.tsv, but I wrote an Excel VBA function which generates that from the contents of the "Peptide" and "Assigned Modifications" columns.

I made a .ssl file from your psm.tsv file which I have attached. You can use this file to create a spectral library.
I made a library from just the Slice4 results since that was the only raw file I had.

It's likely that I made a mistake in my Excel macro, so, let me know if the modifications in the "Slice4.blib" library are on the wrong amino acid positions or anything like that.
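
For anyone who would rather script this step than use Excel, here is a rough Python sketch of the same idea. It is not the VBA macro itself. The function names are made up for illustration, and it rests on assumptions that should be checked against the BiblioSpec page linked above: that modifications are written as `[+mass]` after the residue, that N-term and C-term entries can be attached to the first and last residue, that several masses on one residue can be summed, and that the .ssl columns are named file, scan, charge and sequence.

```python
import re
from collections import defaultdict

# Matches entries such as "12M(15.9949)", "N-term(42.0106)" or "C-term(...)".
_MOD_RE = re.compile(
    r"^(?:(\d+)([A-Z])|(N-term|C-term))\(([-+]?\d+(?:\.\d+)?)\)$")

# Assumed .ssl header (column names to be verified against the BiblioSpec docs).
SSL_HEADER = "file\tscan\tcharge\tsequence"


def modified_sequence(peptide, assigned_mods):
    """Build e.g. 'PEPTM[+15.9949]IDE' from the psm.tsv 'Peptide' and
    'Assigned Modifications' values."""
    deltas = defaultdict(float)
    for entry in (assigned_mods or "").split(","):
        entry = entry.strip()
        if not entry:
            continue
        match = _MOD_RE.match(entry)
        if match is None:
            raise ValueError(f"Unrecognized modification entry: {entry!r}")
        position, residue, terminus, mass = match.groups()
        if terminus == "N-term":
            index = 0                      # assumption: attach to first residue
        elif terminus == "C-term":
            index = len(peptide) - 1       # assumption: attach to last residue
        else:
            index = int(position) - 1      # psm.tsv positions are 1-based
            if not 0 <= index < len(peptide) or peptide[index] != residue:
                raise ValueError(f"{entry!r} does not fit peptide {peptide!r}")
        deltas[index] += float(mass)       # assumption: sum masses per residue
    parts = []
    for index, residue in enumerate(peptide):
        parts.append(residue)
        if index in deltas:
            parts.append("[%+.4f]" % deltas[index])
    return "".join(parts)


def ssl_line(file_name, scan, charge, sequence):
    """Format one tab-separated .ssl row."""
    return "\t".join([file_name, str(scan), str(charge), sequence])


# Example with the PSM discussed earlier in this thread (scan 6166, charge 3).
seq = modified_sequence("PAATKPATTKPMVK", "12M(15.9949),1P(203.0794)")
print(SSL_HEADER)
print(ssl_line("200921_SAM_OC107_Elute1_Slice4.RAW", 6166, 3, seq))
```

The function raises an error when a listed residue does not match the peptide at that position, which is a cheap way to catch the kind of wrong-position mistake mentioned above.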
-- Nick

 
cpavan responded:  2023-01-27 13:47

Dear Nick,

Thank you very much for your help. I think that your macro is working OK; the few nonsensical PSMs come from the psm.tsv itself (probably from some wrong search settings).

I will work with the .ssl, can I use your macro?

Thank you,

Carlos