Ion mobility and mProphet scoring

hets

2021-08-11 19:46

Hello,
I am analyzing some FAIMS data to quantify heavy/light ratios, but I am uncertain of the order of operations needed to obtain mProphet q-values.
When I apply ion-mobility settings to select the best CVs before adding decoy peptides, the mProphet model training does not fail and the decoy and target distributions are clearly separated. However, if I apply the IM settings after adding decoys (followed by re-importing), it seems CVs are also picked for the decoy peptides and the model training fails.

Any suggestions on what to do?
Thank you!

Using DDA peptide search with the latest Skyline release

Nick Shulman responded:	2021-08-12 17:59
I do not know the answer to your question, but if you send us your Skyline document we can take a look. In Skyline you can use the menu item File > Share to create a .zip file containing your Skyline document and supporting files, including extracted chromatograms. If that .zip file is less than 50MB you can attach it to this support request. Otherwise, you can upload it here: https://skyline.ms/files.url
It might be helpful if you could send us two copies of your document: one where you applied the IM settings after adding the decoys, and one where you applied them before. I am not exactly sure what you mean by "Apply the IM settings".
There are certain things that you definitely should not do after adding the decoys (or, if you do them, you need to add the decoys again). When you use the "Add Decoys" menu item, Skyline gives you a list of peptides which have different masses, but share a lot of other characteristics, such as predicted retention time, with the target peptides that they were generated from. If you were to then do something to change the target peptides, and that same change was not applied equally to the decoys, then there would be a difference between the targets and decoys. If you were then to train a peak scoring model, the trained model might heavily weight a particular score because it happens to provide very good separation between the targets and the decoys, but that would not actually help the model distinguish between true and false peaks.
If we see your Skyline documents we will probably be able to tell you whether anything invalid has happened with the targets and decoys.
-- Nick
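
To illustrate the point above about a change that reaches the targets but not the decoys, here is a minimal Python sketch using simulated numbers. It is not Skyline's mProphet implementation: the `quality` and `leaky` scores, the helper names, and the assumption that half of the targets are real peaks are all hypothetical, chosen only to show the effect.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulated numbers are reproducible


def make_peaks(n, is_decoy):
    """Simulate n candidate peaks (hypothetical data, not Skyline output)."""
    peaks = []
    for i in range(n):
        # Assumption: half of the targets are real peaks; decoys never are.
        is_true = (not is_decoy) and (i % 2 == 0)
        # "quality" is a genuine score: higher on average for real peaks.
        quality = random.gauss(2.0 if is_true else 0.0, 1.0)
        # "leaky" is set by a processing step that touched only the targets.
        leaky = 0.0 if is_decoy else 1.0
        peaks.append({"true": is_true, "quality": quality, "leaky": leaky})
    return peaks


def mean_gap(key, group_a, group_b):
    """Difference between the mean score of two groups of peaks."""
    return (statistics.mean(p[key] for p in group_a)
            - statistics.mean(p[key] for p in group_b))


targets = make_peaks(200, is_decoy=False)
decoys = make_peaks(200, is_decoy=True)
true_targets = [p for p in targets if p["true"]]
false_targets = [p for p in targets if not p["true"]]

# The leaky score separates targets from decoys perfectly...
leaky_target_vs_decoy = mean_gap("leaky", targets, decoys)
# ...but says nothing about which target peaks are real.
leaky_true_vs_false = mean_gap("leaky", true_targets, false_targets)
# The genuine score does distinguish real peaks from false ones.
quality_true_vs_false = mean_gap("quality", true_targets, false_targets)

print("leaky, targets vs decoys:       ", leaky_target_vs_decoy)
print("leaky, true vs false targets:   ", leaky_true_vs_false)
print("quality, true vs false targets: ", round(quality_true_vs_false, 2))
```

A model trained to separate targets from decoys would be tempted to put most of its weight on a score like `leaky`, even though that score carries no information about peak correctness.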

hets responded:	2021-08-13 10:25
Thank you, I have uploaded files to the file share as "hets_081321":
1.1: mProphet works
1.2: looks like CVs are selected for decoys and model training fails
I've also just uploaded a "2.zip" file, which is a technical replicate of the '1' samples. This one seems to fail model training either way.

Nick Shulman responded:	2021-08-13 12:41
Thank you for sending those files. It looks like you have already done a peptide search on your raw file, and the results of that peptide search are stored in your .raw file. I believe that the peptide search that you have done is going to be much more accurate than any sort of peak scoring that Skyline would be able to come up with, because your peptide search engine was able to look at MS2 spectra (DDA?) whereas all Skyline has to look at are the MS1 chromatograms that you have extracted.
If you did want to train a model on this data, I believe you would have to remove all of the peptide search related information from the document, since otherwise it would end up biasing Skyline. That is, if Skyline were to look at the chromatograms near where there was a positive ID, Skyline would see that there is a peak there with the same mass as your target peptide, but not the same mass as your decoy peptide. But the only reason there was a positive ID there is that your peptide search engine was looking at the same thing. (I hope this makes sense.)
I have another question about your data. I see that you have a heavy labeled Cysteine. Is this a SILAC experiment where you have combined two comparable biological samples, one of which has been treated so that the Cysteines are heavy labeled? Or are those heavy labeled peptides things that you spiked into your samples?
If this is a SILAC experiment, then I would recommend that you go to "Settings > Peptide Settings > Modifications" and change "Internal Standard Type" to "None". When Skyline sees that you have an internal standard, Skyline only looks at the heavy (or whatever the standard is) chromatograms when doing peak detection. The idea is that you have spiked the heavy peptides in at a level that will be easy to detect in all of your samples, whereas some or all of your samples may have undetectable amounts of the light peptide.
In summary, I don't believe that you should train a peak picking model on DDA data. The peptide search results that you already have are based on better information than Skyline can see, and Skyline will never be able to do as good a job as your peptide search engine. I am not very familiar with ion mobility and compensation voltages in Skyline, so someone else might have some advice related to that. Hope this helps, -- Nick

hets responded:	2021-08-13 14:50
Thank you! Yes, these are cysteines labeled with heavy and light reactive handles, similar to SILAC. I am using Skyline for light-to-heavy quantification and I only need the final ratios for comparison purposes.
But, if I understand correctly, I would not need to include mProphet scores since these peaks are already identified by my search engine? (In that case I will only use the 'RatioLightToHeavy' results after incorporating the ion mobility filtering.)
It seems that setting the internal standard type to 'none' fixes the issues with both 1.2 and 2, however.