Removing proteins without unique peptides

Giovanna  2018-10-25

Dear Skyline team,
First of all, I apologize in advance if my questions are stupid; in my defence, this is the first time I have handled proteomics data.
I'll briefly explain the steps I've followed.

I built a spectral library by choosing the option "Import DDA peptide search" and importing an msms.txt file from MaxQuant.
As the last setting I chose "remove duplicate peptides", since I want to keep only proteins identified with unique peptides. I obtained 1230 proteins. At this point I could see the chromatograms of the precursor ions in my DDA runs, but not their corresponding MSMS transitions. Here comes the first question: why? I had already selected View -> Transitions -> All.
Then I tried to import my DIA runs from File -> Import -> Results, but I still couldn't see any transitions.
So I tried another path: Blank document -> Settings (where I entered the peptide and transition parameters I want for my DIA files; is that correct, or does it affect the subsequent import of the library?) -> View -> Spectral library (where I chose mine) -> Add all. The library's peptides appeared, but not the 1230 matched proteins I expected. To get them I had to import the FASTA again (File -> Import -> FASTA). At this point there were more than 8000 proteins. My explanation was that I hadn't filtered the duplicate proteins yet (but if I have to do that again, why did I have to supply a FASTA when building the library and choose which proteins to exclude, if those protein settings and filters are then lost?).
I imported my DIA runs through File -> Import -> Results. The number of proteins stayed the same (reasonable), but it was finally possible to select the precursors and transitions and see them in the chromatograms of the DIA runs. I assume I cannot find more proteins in the DIA runs than the ones contained in my library, so I wanted to clean up the protein list to keep only those with unique peptides, and I expected to obtain at most 1230 proteins. I selected Edit -> Refine -> Remove duplicate peptides, but then all the proteins in the left panel appeared without their lists of precursors and transitions, and their number (which was indeed reduced) was still greater than 1230. What did I do wrong or misunderstand?
Finally, a last question: idotp is the isotope dot product, I think, but what is dotp? Can I get it in a report and, above all, can I rely on these parameters to trust an identification without checking it manually? We haven't used standard peptides for iRT normalization yet, but we are going to.
Thanks for your patience and attention.
Best regards!


Nick Shulman responded:  2018-10-25
If you do "Import DDA Search", then Skyline gives you a document that is set up to extract MS1 chromatograms from DDA files.
If you change your mind, and decide that you want MS2 chromatograms, then you should do the following:
1. In "Settings > Transition Settings > Filter" add "y" and maybe "b" to the list of ion types.
2. In "Settings > Transition Settings > Full Scan" change "MS/MS filtering Acquisition Method" to "DIA" and change "Isolation Scheme" to probably "Results Only".

After doing this, Skyline will add the MS2 transitions to all of your precursors. You will see these new transitions in the Targets tree.
Then, do:
Edit > Manage Results > Reimport
or maybe
File > Import > Results
to tell Skyline to extract chromatograms for everything.

There's a menu item "Edit > Refine > Associate Proteins" which you can use to assign all of your peptides to proteins using a FASTA file. You can use this menu item if you have done "View > Spectral Library > Add All".

When you do "File > Import > FASTA" into a blank document, Skyline gives you every protein that is in the FASTA file, because Skyline does not know about your peptide search results.
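
To make the difference concrete, here is a minimal Python sketch of what associating library peptides with a FASTA, and then removing duplicate peptides, roughly amounts to. It is not Skyline's implementation: the function names, the toy sequences, and the plain substring matching are assumptions for illustration, and Skyline additionally applies its digestion and filter settings.

```python
from collections import defaultdict


def associate_proteins(fasta, library_peptides):
    """Map each protein to the library peptides found in its sequence.

    Proteins with no library peptide are dropped, unlike a plain FASTA
    import, which keeps every protein in the file.
    """
    matches = {}
    for name, sequence in fasta.items():
        found = [pep for pep in library_peptides if pep in sequence]
        if found:
            matches[name] = found
    return matches


def remove_duplicate_peptides(matches):
    """Drop peptides that occur in more than one protein, then drop
    proteins that are left with no peptides at all."""
    owners = defaultdict(set)
    for name, peptides in matches.items():
        for pep in peptides:
            owners[pep].add(name)
    unique = {}
    for name, peptides in matches.items():
        kept = [pep for pep in peptides if len(owners[pep]) == 1]
        if kept:
            unique[name] = kept
    return unique


# Toy data (hypothetical sequences, not taken from this thread).
fasta = {
    "PROT_A": "MKTAYIAKQRLSDEGK",
    "PROT_B": "MSSLSDEGKVVNPR",
    "PROT_C": "MGGHHWWAAR",
}
library_peptides = ["TAYIAK", "LSDEGK", "VVNPR"]

associated = associate_proteins(fasta, library_peptides)
unique_only = remove_duplicate_peptides(associated)
print(associated)
print(unique_only)
```

In this toy case PROT_C never appears because no library peptide maps to it, and the shared peptide LSDEGK is removed from both PROT_A and PROT_B while each keeps its unique peptide.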

There's a button on the Start Page in Skyline "Import DIA Peptide Search". It sounds like that's the button that you should have used, if you have some DDA runs, and some DIA runs, and you want to use the DDA peptide search results to decide which peptides and transitions to use, and you want to extract chromatograms from the DIA runs.

I recommend the DIA tutorial, since I think it covers your exact scenario:

-- Nick

Brendan MacLean responded:  2018-10-28
If you want to do larger scale "discovery" DIA with Skyline, then you probably want to look at webinars 14 & 15:

Here also are recordings from this summer's DIA/SWATH course at ETH of me walking participants through library building from DDA search results and using the latest improvements to the File > Import > Peptide Search > DIA wizard:

The full course videos are here:

And the accompanying data sets are here:

Giovanna responded:  2018-10-30
Thank you for the quick and thorough responses! I was definitely missing something.
I set the transition and peptide settings, built the library through peptide settings->library->build, then imported the FASTA, then imported the DIA files, and finally refined the data by removing duplicate peptides, and it works!
Three questions still remain:
1) If I want to look at the peptides' chromatograms in the DDA runs used for the library, I start from "import DDA peptide search" and create the library from there. Nick, you said that in order to see MSMS chromatograms I have to select b and y ions in the transition settings; I forgot to say that I had already done that. At this point (when I'm just looking at my DDA runs) MSMS chromatograms still don't appear, even if I reimport the files. Indeed, each peptide has an idotp value but not a dotp value.
I can see MSMS chromatograms when importing DIA runs instead.

2) Is it better to select centroided in full scan->MS1 / MSMS filtering, or to define a mass analyzer and a resolving power? In the latter case, is it OK to set the resolving power at 400 m/z if my acquisition range is 400-1200 m/z?
3) I've finally understood that dotp is "library dot product" in the report's features list, so I can get a report with both idotp and dotp values. I've noticed that low values of both parameters mean bad peptide matches, since they show the agreement of the peaks' intensities, RTs and shapes with the expected ones. Do you suggest a threshold value for them? Can I rely on these parameters alone to get correct identifications without manually checking thousands of peaks, or would I miss something?

Thank you again.

PS Brendan, it would be amazing to attend that ETH course, but I don't have funding. Anyway, I've become a huge fan of you all, you're so clever. Cheers

Nick Shulman responded:  2018-10-30
1. You should not ask Skyline to extract MS2 chromatograms from DDA data. The chromatograms will just be really long straight lines between the times where the precursor happened to get sampled. For DDA data, the only useful information that Skyline can give you is MS1 chromatograms.
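
A rough Python sketch of why this happens, under simplified assumptions: an MS2 chromatogram for a precursor can only get points from MS2 scans whose isolation window contains that precursor's m/z. The scan lists, window sizes, and the function name below are made up for illustration and do not reflect how Skyline stores or reads scans.

```python
def ms2_chromatogram_times(ms2_scans, precursor_mz):
    """Return the retention times of MS2 scans whose isolation window
    contains the precursor m/z. Only these scans can contribute points
    to an MS2 chromatogram for that precursor."""
    times = []
    for rt, window_low, window_high in ms2_scans:
        if window_low <= precursor_mz <= window_high:
            times.append(rt)
    return times


precursor_mz = 650.3  # hypothetical precursor

# DIA: the same set of wide windows is repeated every cycle (here every
# 0.05 min), so the window covering the precursor is sampled in every cycle.
dia_scans = []
for cycle in range(20):
    rt = 30.0 + cycle * 0.05
    for low in range(400, 1200, 25):
        dia_scans.append((rt, low, low + 25))

# DDA: narrow windows centred on whichever precursors happened to be picked,
# so this precursor is fragmented only occasionally.
dda_scans = [
    (30.10, 649.3, 651.3),  # our precursor was selected here
    (30.15, 512.0, 514.0),
    (30.20, 801.4, 803.4),
    (30.70, 649.3, 651.3),  # and again much later
    (30.75, 455.1, 457.1),
]

dia_times = ms2_chromatogram_times(dia_scans, precursor_mz)
dda_times = ms2_chromatogram_times(dda_scans, precursor_mz)
print(len(dia_times), "points from DIA")
print(len(dda_times), "points from DDA")
```

With only a couple of points spread over the elution time, the DDA trace is just a line drawn between them, whereas the DIA trace has one point per cycle.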

2. When you tell Skyline the resolution at a particular m/z, Skyline applies the correct formula in order to figure out what the resolution is at other m/z's. That's why Skyline asks you whether it's a TOF or Orbitrap, etc.: they all have a different way in which resolution scales with m/z.
You will probably get better results if you choose "Centroided", since that applies the vendor centroiding algorithm, which is usually better than what Skyline does, which is summing all of the profile points across an m/z channel whose width is determined by the resolution.
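
As an illustration of point 2, the sketch below uses the usual textbook scaling rules (resolving power roughly constant for TOF, proportional to 1/sqrt(m/z) for Orbitrap, and proportional to 1/(m/z) for FT-ICR) and the standard definition of resolving power as m/z divided by peak width. These rules, the function names, and the choice of channel width are assumptions for illustration; Skyline's exact formulas may differ.

```python
import math


def resolving_power_at(mz, reference_power, reference_mz, analyzer):
    """Estimate resolving power at `mz` from a value quoted at `reference_mz`.

    Textbook scaling rules (assumed here, not taken from Skyline's source):
      - "tof":      roughly constant across the m/z range
      - "orbitrap": proportional to 1 / sqrt(m/z)
      - "ft_icr":   proportional to 1 / (m/z)
    """
    if analyzer == "tof":
        return reference_power
    if analyzer == "orbitrap":
        return reference_power * math.sqrt(reference_mz / mz)
    if analyzer == "ft_icr":
        return reference_power * reference_mz / mz
    raise ValueError(f"unknown analyzer: {analyzer}")


def channel_width(mz, resolving_power):
    """Peak width at `mz`, taking resolving power as m/z divided by peak width."""
    return mz / resolving_power


def sum_profile_points(profile, target_mz, width):
    """Sum the intensities of the profile points that fall inside a
    channel of the given width centred on target_mz."""
    half = width / 2.0
    return sum(
        intensity
        for point_mz, intensity in profile
        if target_mz - half <= point_mz <= target_mz + half
    )


# Example: 60,000 quoted at m/z 400 on an Orbitrap, evaluated at the
# upper end of a 400-1200 m/z acquisition range.
power = resolving_power_at(1200.0, 60000.0, 400.0, "orbitrap")
width = channel_width(1200.0, power)
print(power, width)
```

The point of the sketch is that a value quoted at m/z 400 is only a reference: the same setting yields a different resolving power, and hence a different channel width, at every other m/z in the acquisition range.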

3. You might want to try adding decoys to your document and training an mProphet model. Then, Skyline will apply some weighting to the idotp and dotp values in order to choose peaks. Take a look at the Advanced Peak Picking tutorial:

-- Nick
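
For readers wondering what the dot-product scores from point 3 measure, here is a minimal Python sketch of a dot product between an observed intensity pattern and an expected one. For dotp the expected pattern is the library spectrum's fragment intensities; for idotp it is the predicted isotope distribution of the precursor. This uses a plain cosine similarity as an assumption; Skyline's actual scores may transform the intensities or the angle, so the numbers will not necessarily match what Skyline reports. The intensities are invented.

```python
import math


def dot_product_score(observed, expected):
    """Cosine similarity between two intensity vectors
    (1.0 means the same relative pattern)."""
    if len(observed) != len(expected):
        raise ValueError("vectors must have the same length")
    dot = sum(o * e for o, e in zip(observed, expected))
    norm = math.sqrt(sum(o * o for o in observed)) * math.sqrt(
        sum(e * e for e in expected)
    )
    return dot / norm if norm > 0 else 0.0


# Hypothetical fragment intensities for one peptide (four fragment ions).
library = [100.0, 60.0, 30.0, 10.0]
good_peak = [2000.0, 1200.0, 600.0, 200.0]  # same ratios, different scale
bad_peak = [50.0, 40.0, 900.0, 800.0]       # different ratios

good_score = dot_product_score(good_peak, library)
bad_score = dot_product_score(bad_peak, library)
print(good_score)
print(bad_score)
```

The score depends only on relative intensities, which is why it says nothing by itself about retention time or peak shape; a trained model such as the one suggested in point 3 combines it with other scores.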