Retention times do not fit with scan times in ID picker

staab-weijnitz  2023-01-27 10:03
 

Hi all,

I want to use Skyline for MS1 filtering and quantification of collagen PTMs from crude samples, conceptually similar to what we did here:
https://doi.org/10.1016/j.mbplus.2019.04.002

Right now, I am trying to analyze crude cell ECM from cells where I knocked down a gene (n=2, four samples in total). This is basically a test run, to see whether we get enough coverage of collagens and other proteins of interest.

We use MyriMatch (2.2.19172) to search Thermo .raw files, obtain the results in .pepXML format, and check them in IDPicker (3.1). After the "fine PTM search", where we restrict the search to proteins present in the sample and include decoys in the FASTA file, I loaded a merged session of my four files in IDPicker and exported the spectral library (.sptxt file) from the merged idpDB file.

I imported this .sptxt file into Skyline ("Use existing library") and then loaded the four Thermo .raw files after conversion to .mzXML via MSConvert (3.0). This seemed to work.

However, when I cross-check the IDs and the retention times I get in Skyline, they are usually far off from the scan times for the same peptides in IDPicker.

Is there anything obvious I have been doing wrong? I would appreciate your help on this. Happy to upload files or give more information, of course, too, if needed.

Many thanks,
Claudia

 
 
Nick Shulman responded:  2023-01-27 14:48
The usual reason that I have seen for retention times being wrong is that scan numbers are being misinterpreted. For instance, the peptide search engine might have been searching a file which only contains MS2 spectra, and the spectrum numbers that it puts into the search results are indexes into the list of MS2 spectra, whereas some other tool thinks those numbers are indexes into the list of MS1 and MS2 spectra.
Other times, we have had bugs where minutes were interpreted as seconds or vice versa.
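
Both failure modes can be sketched in a few lines of Python. This is only an illustration: the spectrum list, the function names, and all values are made up and are not taken from Skyline, MyriMatch, or IDPicker.

```python
# Hypothetical run: (ms_level, retention_time_in_minutes) per spectrum,
# in acquisition order. All values are invented for illustration.
spectra = [(1, 10.00), (2, 10.01), (2, 10.02), (1, 10.05), (2, 10.06)]


def rt_from_ms2_index(spectra, ms2_index):
    """Treat the number as a 0-based index into the MS2 spectra only."""
    ms2_only = [rt for level, rt in spectra if level == 2]
    return ms2_only[ms2_index]


def rt_from_overall_index(spectra, index):
    """Treat the same number as a 0-based index into all spectra."""
    return spectra[index][1]


index = 2
rt_ms2 = rt_from_ms2_index(spectra, index)      # third MS2 spectrum
rt_all = rt_from_overall_index(spectra, index)  # third spectrum overall
print(rt_ms2, rt_all)  # the two interpretations give different times

# Unit mix-up: a time stored in seconds but read as minutes is off by 60x.
rt_seconds = 600.0
rt_minutes = rt_seconds / 60.0
```

If the retention times in two tools differ by a roughly constant factor of 60, the unit mix-up is the more likely explanation; if they differ irregularly, the indexing mismatch is worth checking first.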

If you send us your files, we might be able to figure out what is going wrong.

In Skyline you can use the menu item:
File > Share
to create a .zip file containing your Skyline document and supporting files including spectral libraries and extracted chromatograms.

It sounds like you would also need to send us a bunch of other files so we can see what is going on.
You can zip up all those files and upload them here:
https://skyline.ms/files.url

-- Nick
 
staab-weijnitz responded:  2023-01-30 00:31
Dear Nick,
many thanks for your fast response. I am uploading the Skyline document now, named "Staab-Weijnitz_retentiontimes_issue.zip".
What other files would you need, the results of the MyriMatch search, the IDpicker database, or the Thermo raw files or all of them?
Many thanks again,
Claudia
 
staab-weijnitz responded:  2023-01-30 05:04
Hi Nick, sorry the upload takes ages. I will write another response here when it is done, O.K.?
Thanks,
Claudia
 
staab-weijnitz responded:  2023-01-31 05:17
Hi Nick,
I am having trouble uploading the file because of unstable Wifi at the institute.
Can you try downloading the zip file from here?
https://1drv.ms/u/s!AtiPvz3_ZNwQgr5NKQGHLj-B3HKuJg?e=ScscbR
I hope this works!
Otherwise, I'll try via another cloud.
Best,
Claudia
 
Nick Shulman responded:  2023-01-31 09:00
Thank you for sending that file.
I see that you are using a .sptxt library. It looks like Skyline does not know how to read retention times from .sptxt files, so there ends up not being any retention time information at all.

It would be better if you could create a BiblioSpec (.blib) spectral library from your peptide search results.
Here is the web page which says which types of peptide search results you can use to build a BiblioSpec spectral library:
https://skyline.ms/wiki/home/software/BiblioSpec/page.view?name=BlibBuild

That page says that for IDPicker results, you should use the idpXML files.
Do you have idpXML files?

You can import your peptide search results into Skyline using the menu item:
File > Import > Peptide Search

-- Nick
 
staab-weijnitz responded:  2023-01-31 11:41
Hi there,

Thanks, that is already helpful. Unfortunately I do not have .idpXML files and do not know how to generate those.
I have .idpDB and .pepXML files. As spectral library, the only format I can export is .sptxt as far as I can see.

In this thread https://skyline.ms/issues/home/issues/details.view?issueId=378 it is stated that IDpicker may also export .blib format, but I do not see that option anywhere.

There have been some updates, so I am downloading the newest version of IDPicker now and will check again.
Update will follow later.

Thanks again,
Claudia
 
staab-weijnitz responded:  2023-01-31 12:35
Newest version did not help.
Still .idpDB, .pepXML, and .sptxt is all I have.

Any idea what could be done to get .idpXML files?

Thanks,
Claudia
 
Matt Chambers responded:  2023-01-31 12:57
Hi Claudia,

BiblioSpec's IDPicker importer was written for IDPicker 2.x when it used .idpXML files. IDPicker 3.x uses .idpDB files for much greater scalability. The IDPicker developers[1] never got around to writing a blib exporter for IDPicker or an idpDB importer for BiblioSpec. I think IDPicker's sptxt exporter was primarily intended for spectral library searching with tools like Pepitome or SpectraST, but if all that's missing for the sptxt to be useful in Skyline is fixing reading the scan times, then that's probably the easiest path forward. Do you agree Nick or are there other limitations that make the sptxt undesirable?

-Matt


1. That was mostly me but I haven't been actively developing it since 2019-ish.
 
staab-weijnitz responded:  2023-01-31 23:31
Hi both,
it would be amazing if something like that worked. Please let me know if there is anything I can do to help.
Many thanks,
Claudia

PS: BTW I am relieved to learn that I do not have to go to another support forum to get help on IDpicker for this issue. ;) 2019-ish sounds like it could have something to do with our work on the collagen PTMs in MB+ that I referred to above? :D
 
staab-weijnitz responded:  2023-02-03 11:48
Dear Nick,

In the meantime, I am trying to run my peptide search results through PeptideProphet (within TPP) to produce files that Skyline will accept. We had managed to do that before in 2018 (not me directly, though). However, now I get the message:
_____________

(MyriMatch)
WARNING!! The discriminant function for Myrimatch is not yet complete. It is presented here to help facilitate trial and discussion. Reliance on this code for publishable scientific results is not recommended.
WARNING: Myrimatch only support semi-parametric PeptideProphet modelling, which relies on a DECOY search.
init with MyriMatch Trypsin/P
____________________________

Don't get me wrong, I do not expect that you can help with MyriMatch and/or PeptideProphet, I just post it in case this rings a bell for you and it is an easy fix to generate files for Skyline through peptide prophet. I really need this to work, don't hesitate to let me know should you have any ideas of another workaround.

Thanks and best,
Claudia
 
Nick Shulman responded:  2023-02-03 14:56
I think Matt also wrote MyriMatch.

Those two warning messages look like they might just be telling you that the numbers might not be reliable. They do not look like they would cause the whole process to not work. Did you see any other messages? Maybe you could send us your complete log output.

Is there a reason that you are trying to use these very old peptide search engines? We might be able to recommend something newer that will be easier to get working.
-- Nick
 
staab-weijnitz responded:  2023-02-05 23:48
Dear Nick,

Thank you for your message.

There are two reasons: a) it is what we successfully used before in the paper I mentioned above, and b) MyriMatch has a motif search feature which is very useful for collagen PTMs, as prolyl and lysyl hydroxylations typically occur in specific sequence motifs (GXY repeats).

In a nutshell, I need to search the raw data for hydroxylations of Pro, where [Pro+15.994915] is in the Xaa or the Yaa position of the Gly-Xaa-Yaa repeats in collagen sequences. Similarly, I need to search for hydroxylated and glycosylated lysines in the Yaa position of the Gly-Xaa-Yaa repeats [Lys+15.994915; Lys+178.047738; Lys+340.100562].
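
As a rough illustration of the motif restriction described above, the following Python sketch lists candidate sites in a sequence. It is not MyriMatch's motif syntax or algorithm; the function name and the example sequence are hypothetical, and it makes the simplifying assumption that any Gly starts a triplet, without tracking the reading frame of the collagen repeat.

```python
# Mass shifts quoted in the post (Da), keyed by the modified residue.
MASS_SHIFTS = {
    "P": [15.994915],
    "K": [15.994915, 178.047738, 340.100562],
}


def gxy_candidate_sites(sequence):
    """Return sorted (position, residue, slot) tuples, positions 0-based.

    Pro is reported in the Xaa or Yaa slot, Lys only in the Yaa slot.
    Assumption: every Gly is treated as the start of a Gly-Xaa-Yaa triplet.
    """
    sites = set()
    for i, residue in enumerate(sequence):
        if residue != "G":
            continue
        # Xaa slot: the residue directly after Gly.
        if i + 1 < len(sequence) and sequence[i + 1] == "P":
            sites.add((i + 1, "P", "Xaa"))
        # Yaa slot: the second residue after Gly.
        if i + 2 < len(sequence) and sequence[i + 2] in ("P", "K"):
            sites.add((i + 2, sequence[i + 2], "Yaa"))
    return sorted(sites)


# Hypothetical collagen-like stretch, for illustration only.
sites = gxy_candidate_sites("GPPGEKGAP")
for position, residue, slot in sites:
    print(position, residue, slot, MASS_SHIFTS[residue])
```

The difference between the first two Lys shifts (178.047738 - 15.994915, about 162.0528 Da) is consistent with one hexose added on top of the hydroxylation.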

Either way, I am very happy to use another tool if this can be done with something that is more compatible with Skyline.

Many thanks,
Claudia
 
Matt Chambers responded:  2023-02-06 09:52
MyriMatch doesn't calculate an FDR score so it can't be directly used in a tool like BiblioSpec/Skyline which require some threshold to cut off results. As long as you did a target-decoy search, PeptideProphet should work to figure out those thresholds and add a probability (discriminant) score. BiblioSpec supports importing many kinds of workflows through PeptideProphet. Did running it through PP not work for you?
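
For readers unfamiliar with why a target-decoy search is needed for thresholding, here is a minimal Python sketch of the basic counting idea. It is not what PeptideProphet does (PeptideProphet fits a mixture model and reports probabilities); the function name, the scores, and the assumption that a higher score is better are all illustrative.

```python
def score_threshold_at_fdr(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    psms: list of (score, is_decoy) pairs, higher score = better match.
    The FDR at a cutoff is estimated as decoys / targets above the cutoff.
    Returns None if no cutoff qualifies. Tied scores are not treated specially.
    """
    targets = 0
    decoys = 0
    threshold = None
    for score, is_decoy in sorted(psms, key=lambda p: p[0], reverse=True):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            threshold = score
    return threshold


# Invented scores, for illustration only.
example = [(9.0, False), (8.0, False), (7.0, False),
           (6.0, True), (5.0, False), (4.0, True)]
print(score_threshold_at_fdr(example, max_fdr=0.25))
```

Without decoy hits in the search results there is nothing to count in the numerator, which is why both the crude and the PTM search need a decoy database.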
 
staab-weijnitz responded:  2023-02-06 10:25
Hi Matt,

I did not include the reverse sequences for each protein in the fasta file (UniProt database) when I did the crude search (allowing only fixed modifications like Cys acetamidylation). But I did include reverse sequences in the fine search (PTM search) from the subset of proteins identified in the crude search (subset FASTA exported from IDpicker).

Then after renaming the resulting pepXML files into pep.xml, I tried to run the files through PeptideProphet as follows:

(Folder where I have the pep.xml files and the program within TPP):\PeptideProphetParser.exe *.pep.xml set DECOY=rev set ACC set NONPARAM

But then I get this message and no results:
_____________
(MyriMatch)
WARNING!! The discriminant function for Myrimatch is not yet complete. It is presented here to help facilitate trial and discussion. Reliance on this code for publishable scientific results is not recommended.
WARNING: Myrimatch only support semi-parametric PeptideProphet modelling, which relies on a DECOY search.
init with MyriMatch Trypsin/P
____________

Should I have included reverse fasta sequences already in the crude search? I can rerun this tonight?

Many thanks,
Claudia
 
Matt Chambers responded:  2023-02-06 10:33
How did you get IDPicker to import the crude search without decoy hits? I'd suggest analyzing the crude and subset/PTM searches separately, so they'll both need decoys. Also that PP warning about semi-parametric seems incompatible with your NONPARAM setting, but my knowledge of running PP is almost certainly less than yours.
 
staab-weijnitz responded:  2023-02-06 10:47
O.K., I'll rerun both analyses with decoys and get back to you.
Thanks,
Claudia
 
staab-weijnitz responded:  2023-02-06 16:16
Hi Matt,
thank you for your suggestions, at least I got rid of one of the error messages.

Nevertheless, using:
>PeptideProphetParser.exe *.pep.xml set DECOY=rev set ACC set NONPARAM

I still get the following message:
__________
Using Decoy Label "rev".
Using non-parametric distributions
 (MyriMatch)
WARNING!! The discriminant function for Myrimatch is not yet complete. It is presented here to help facilitate trial and discussion. Reliance on this code for publishable scientific results is not recommended.
init with MyriMatch Trypsin/P
__________

It seems like Peptide Prophet is missing something to get started? I am lost. Will continue tomorrow.

Thanks and best,
Claudia
 
Matt Chambers responded:  2023-02-07 07:01

Well it's clearly a warning message rather than an error. What kind of output do you get?

 
staab-weijnitz responded:  2023-02-07 07:26

Good question. If only I knew. :D

There is no output as far as I can see, that's the thing. I also thought there must be something generated in spite of the warning.
I even tried "set FORCEDISTR" (or sth similar, don't remember the exact command now)
I was looking for output in and around all TPP folders on my computer but I don't see anything.

Well, I am not too familiar with using programs without graphical interface, so I feel it may very well be something very, very stupid that I don't get.
But the .pep.xml files that I tried to run through PeptideProphet remain entirely unchanged and still do not load into Skyline.

/Claudia

 
staab-weijnitz responded:  2023-02-07 13:13

Hi there,

I uploaded a zip file with my .pepXML files on your file sharing folder. It is called "MEJ8890A8MA4__PeptideProphetissue.zip".
Would be great if you could have a look whether there still is something wrong with the files.

Before I run PeptideProphet, I exchange the file extension, just as we described in https://doi.org/10.1016/j.mbplus.2019.04.002, from pepXML to pep.XML. Indeed, when I run the pepXML files, I get the error "fin: error opening <filename>"
When I run the same files with pep.XML extension, I get the following warning and no visible data output.

"WARNING!! The discriminant function for Myrimatch is not yet complete. It is presented here to help facilitate trial and discussion. Reliance on this code for publishable scientific results is not recommended.
init with MyriMatch Trypsin/P"

Many thanks for your help so far,
Claudia

 
staab-weijnitz responded:  2023-02-09 01:42

Hi Nick and Matt,

Overall, I see three options now:

(1) Based on what Matt posted earlier: Fixing reading the scan times from the .sptxt file exported from ID picker. Is this feasible?
(2) Figure out how to add a probability (discriminant) score via PeptideProphet...
(3) Use different search engines to start with.

I would be very grateful for guidance on what you believe is the best way to proceed.

Many thanks,
Claudia

 
Nick Shulman responded:  2023-02-09 10:37
I think you might have misunderstood what we were asking for when we asked for the "output".
I am not sure exactly what it looks like when you run PeptideProphet, but I imagine a bunch of text scrolls by on the screen, and that text included the message "WARNING!! The discriminant function for Myrimatch is not yet complete..."
All of that text with warnings, errors and messages is the "output". Was there a bunch of text near that which you could copy and send to us?

If there is not a bunch of text there that you can send to us, maybe you could send us a picture of what the screen looked like when you got the Myrimatch discriminant function warning.
-- Nick
 
staab-weijnitz responded:  2023-02-09 11:22
Aah O.K. - There is not more text than:
____________
Using Decoy Label "rev".
Forcing output of mixture model
Using non-parametric distributions
 (MyriMatch)
WARNING!! The discriminant function for Myrimatch is not yet complete. It is presented here to help facilitate trial and discussion. Reliance on this code for publishable scientific results is not recommended.
init with MyriMatch Trypsin/P
______________

See attached screenshot.
Thanks,
Claudia
 
staab-weijnitz responded:  2023-02-14 14:13
Dear Nick,
as stated above, it seems that there are three options now:

(1) Based on what Matt posted earlier: Fixing reading the scan times from the .sptxt file exported from ID picker. Is this feasible?
(2) Figure out how to add a probability (discriminant) score via PeptideProphet... (I posted on a support/discussion forum, but no reaction so far)
(3) Use different search engines to start with.

I would be very grateful for guidance on what you believe is the best way to proceed.

Many thanks,
Claudia
 
staab-weijnitz responded:  2023-02-20 09:11
Hi all,

FYI: it seems like I finally solved the issue!

(1) the original data files contained a space and several dashes (not generated by myself, I never thought about that)
-> I removed all spaces and dashes from the thermo .raw files and started all over again with the MyriMatch search, using decoy fasta files for both crude and fine (PTM) search
-> I also converted the renamed thermo .raw files into .mzXML files using MS Convert on the TPP Petunia interface

(2) I found out that I can run PeptideProphet from the TPP Petunia interface as well (ran it from command line before)
Renaming the .pepXML files to .pep.xml allowed me to run them through PeptideProphet using accurate mass and non-parametric modeling. Eventually I understood that this requires the converted .mzXML files to be in the same folder.
Finally, I got output files with the prefix "interact-" and the file extension .pep.xml

(3) I could load the latter files and the corresponding .mzXML files into Skyline to build a spectral library, and now the retention times fit very well with the scan times in IDPicker!
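
The two renaming steps in (1) and (2) can also be scripted. The following Python sketch is one possible way to do it, not part of TPP or Skyline; the function names and the example file name are hypothetical. It defaults to a dry run that only reports what would be renamed.

```python
from pathlib import Path


def clean_name(name):
    """Remove spaces and dashes from a file name, as in step (1)."""
    return name.replace(" ", "").replace("-", "")


def to_pep_xml(name):
    """Map 'sample.pepXML' to 'sample.pep.xml', as in step (2)."""
    suffix = ".pepXML"
    if name.endswith(suffix):
        return name[: -len(suffix)] + ".pep.xml"
    return name


def rename_in_folder(folder, transform, dry_run=True):
    """Apply transform to each file name in folder.

    Returns a list of (old_name, new_name) pairs. Files are only renamed
    when dry_run is False. Name collisions are not handled here.
    """
    planned = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        new_name = transform(path.name)
        if new_name != path.name:
            planned.append((path.name, new_name))
            if not dry_run:
                path.rename(path.with_name(new_name))
    return planned


# Hypothetical file name, for illustration only.
print(clean_name("sample 1-A.raw"))
print(to_pep_xml("sample1A.pepXML"))
```

Note that clean_name is meant for the original .raw files before the search; applying it later would also strip the dash from the "interact-" prefix of the PeptideProphet output.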

Does this make sense to you? I think it looks good!

Cheers,
Claudia