PRM FDR control

PRM FDR control jfoe  2018-12-13

Dear skyline team,

I have been experimenting with the advanced peak picking model based on mProphet.
I have an assay with heavy internal reference and I would like to have q values for each peptide.

Because our scheduling windows for the PRM acquisition are quite tight, I can't imagine scoring based on second-best peaks being of use.
When generating decoys, however, they get +10 precursor mass, so they would need another window during acquisition.

Do you think a workflow is possible, where one uses decoy transitions with the PRM data as is?
I think FDR in PRM is an important topic and I would like to put some effort into making this work.
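
For reference, the q values asked about here can be sketched with a plain target-decoy estimate. This is a minimal illustration only, not Skyline's or mProphet's actual implementation; the function name `q_values` and the assumption that every target and decoy peak group already has a single composite score (higher is better) are hypothetical.

```python
def q_values(target_scores, decoy_scores):
    """Return one q value per target score, in input order.

    Assumes higher scores are better and that decoys model false targets.
    """
    # Tag each score as target (True) or decoy (False); keep the target index.
    entries = [(s, True, i) for i, s in enumerate(target_scores)]
    entries += [(s, False, -1) for s in decoy_scores]
    # Best score first; on ties decoys sort first (False < True), which
    # keeps the estimate conservative.
    entries.sort(key=lambda e: (-e[0], e[1]))

    # Estimated FDR when the threshold is placed at each entry.
    fdr = []
    targets = decoys = 0
    for _, is_target, _ in entries:
        if is_target:
            targets += 1
        else:
            decoys += 1
        fdr.append(decoys / max(targets, 1))

    # q value = lowest FDR at this threshold or any more permissive one.
    result = [None] * len(target_scores)
    best = float("inf")
    for (_, is_target, idx), f in zip(reversed(entries), reversed(fdr)):
        best = min(best, f)
        if is_target:
            result[idx] = min(best, 1.0)
    return result
```

Calling `q_values([5.0, 4.0, 1.0], [2.0, 0.5])` gives 0 for the two targets that outscore every decoy and 1/3 for the target that falls below one decoy.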


Brendan MacLean responded:  2018-12-20

Hi Jonas,
I am not sure I fully trust mProphet scoring with super-tight scheduling anyway. I would probably start by proving to myself that I get relatively similar results with and without tight scheduling. We ran a test with DIA and very tight scheduling, and IDs below q value 0.01 dropped steadily as we widened the window. We decided to go with the wider window anyway, because we were not confident that the detections with super-narrow windows were valid.

Essentially, you are saying, I know what I am going to find right here and there it is. With a super tight scheduling window you are less likely to even get an untruncated peak in your scheduling range.

I guess we could consider what you suggest, but we'd need to exclude precursor-based scores from the models because without a shift, you are generally going to have exactly the same precursor peak. Not sure when I could promise to enable this. Probably not the answer you were hoping for, but we do have a lot going on. Sorry.


jfoe responded:  2018-12-21

Dear Brendan,

thank you very much for your response.
In fact, you can already edit any Skyline file to set decoy mass shifts to 0 and then just not include MS1 data with the assay.
In our data, we cannot make out anything in MS1 at all, so that is not an issue.
I would also assume, though, that this is not really a great way to get unbiased scores.

As it looks now, I will probably try some deviations from the mProphet approach on my own.

One thing Skyline does that is really problematic for me is that it always keeps the last AA fixed when creating decoys.
This is not suitable for the MHC epitope peptides that I am working on, where there are no set cleavage characteristics.
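
To illustrate the behaviour being asked for, here is a minimal sketch of shuffled-decoy generation in which keeping the last residue fixed is optional. It is not how Skyline generates decoys; the function `shuffle_decoy` and its parameters are hypothetical. Because shuffling preserves amino acid composition, the decoy has the same precursor mass as the target, which corresponds to the zero mass shift mentioned above.

```python
import random


def shuffle_decoy(sequence, keep_last=False, seed=None, max_tries=20):
    """Return a shuffled decoy of an unmodified peptide sequence.

    keep_last=True mimics fixing the C-terminal residue (e.g. for tryptic
    peptides); keep_last=False shuffles every position, which suits
    peptides without a defined cleavage rule.
    """
    rng = random.Random(seed)
    if keep_last:
        head, tail = sequence[:-1], sequence[-1:]
    else:
        head, tail = sequence, ""
    residues = list(head)
    for _ in range(max_tries):
        rng.shuffle(residues)
        candidate = "".join(residues) + tail
        if candidate != sequence:
            return candidate
    # Sequences such as homopolymers cannot be changed by shuffling.
    return "".join(residues) + tail
```

For example, `shuffle_decoy("ALNEKLVNL", seed=1)` returns a permutation of the same nine residues, while `shuffle_decoy("ALNEKLVNL", keep_last=True, seed=1)` always ends in L.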
I could take the time later to write up another post listing some things that limit the use of Skyline for our epitope peptides, if you are interested.


Brendan MacLean responded:  2018-12-21

Hi Joe,
That last point is pretty interesting. I could see our getting better about only keeping the last AA constant if it matches your cleavage settings. It also seems like it might be useful to have a "None" value for Peptide Settings - Digestion - Enzyme. That would somewhat limit your use of protein sequence information, but Skyline might still be able to make protein associations with a background proteome. Do you care about protein associations, or are you simply using peptide lists to get around cleavage?

We are always looking for ways to improve the software and it never hurts to know more about where we could improve, even if it will take us time to get the improvements made.

Thanks for your feedback. Please do clarify your thoughts on where we might improve for your use case.


jfoe responded:  2019-01-01

Dear Brendan,

We just use peptide lists and don't care about protein associations.
One thing though is that we really care about individual peptides even if they have some problematic characteristics.

It took me some time to reproduce a few things I encountered.


We are using heavy reference peptides for our assays.
Due to the irregularity of our peptides we would receive them with a single heavy AA at a specified location in the peptide.
Note the presence of a heavy L as well as a light L.
If I paste this into Skyline, I will not get any variable modifications (like M[+16]) applied.
This is in contrast to pasting PEMPLTIDLME.
It's clear to me why that is the case, but the result is that I have to generate all variable modifications externally and paste them like so:

A workflow for peptide import where Skyline just does this would be great, of course.
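
The external step described above can be sketched as follows. This is only an illustration: the function name `variable_mod_variants` is hypothetical, the bracket notation follows the M[+16] style used in this post, and the heavy-label string `L[+7]` in the usage note is an assumed example, not the peptide from the original message.

```python
import re
from itertools import product

# One residue, optionally followed by an existing bracketed modification.
TOKEN = re.compile(r"[A-Z](?:\[[^\]]*\])?")


def variable_mod_variants(sequence, residue="M", tag="[+16]"):
    """Enumerate every combination of a variable modification.

    Residues that already carry a bracketed modification (for example a
    heavy label) are left untouched.
    """
    tokens = TOKEN.findall(sequence)
    # Each unmodified target residue may appear with or without the tag.
    options = [(t, t + tag) if t == residue else (t,) for t in tokens]
    return ["".join(choice) for choice in product(*options)]
```

With the hypothetical input `"PEMPL[+7]TIDLME"`, this yields four strings: the unmodified form, the two singly oxidized forms, and the doubly oxidized form, each with the `L[+7]` label preserved.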


I also wanted to look into the collisional dissociation of our peptides in detail.
For this I activated neutral losses for water and ammonia.
When I then imported my peptides, I got countless duplicated transitions.
I have attached an example Skyline file for this (neutral_loss_issue).
In this file there are, for example, four copies of: L [y8 -51.1] - 891.4822+

Importing results with this would then yield, e.g.:

At 01:34:
Duplicate transition 'L - y8+' found for peak areas

Of course these transitions are not really unique in a strict sense, but it would be great if this were handled more gracefully.
The issue was created by just pasting the string "ALNEKLVNL" into the empty target list of the example file.
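
Until this is handled in the software, duplicates like these could be spotted in an exported transition list before importing results. The sketch below is a hypothetical helper, not part of Skyline; it assumes each transition is available as a (label, m/z) pair and treats entries with the same label and the same rounded m/z as duplicates.

```python
from collections import Counter


def duplicate_transitions(transitions, decimals=4):
    """Return {(label, rounded m/z): count} for entries seen more than once."""
    counts = Counter(
        (label, round(mz, decimals)) for label, mz in transitions
    )
    return {key: n for key, n in counts.items() if n > 1}
```

Four entries of `("y8 -51.1", 891.4822)` in the input would be reported as `{("y8 -51.1", 891.4822): 4}`.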

Also, if you open the example file, go to peptide settings, and disable the structural modification for water loss, you will trigger:

Unexpected Error
An item with the same key has already been added.

We are using a Q Exactive, and Thermo seems to encourage the use of normalized collision energy.
When I try to do collision energy optimization with normalized energies, Skyline imports the .raw file but does not recognize that the various spectra were acquired at different collision energies.

I am using Skyline
These are some of the things that I would love to have some help with.

Best wishes for a happy new year,