Unusual behaviour in limiting number of peptides per protein Liyan Chen  2022-07-12 03:44
 

Hi Skyline developers,

I am using Skyline version 21.2.0.425 for DIA files and typically add all library peptides to the document. Recently we decided that our data analysis works just fine with a maximum of 10-20 peptides per protein, and that additional peptides only slow down file import. So I set up a blank document and updated these settings before adding library peptides to the document:
Peptide Settings>Library: Limit 20 peptides per protein
Peptide Settings>Filter: Auto select all matching peptides

I still get more than 20 peptides for some proteins whether I select "Pick peptides matching: Library" or "Pick peptides matching library and filter". The only way I can get the limit on peptides per protein enforced is to go to "Refine>Advanced" and tick "Auto-select peptides". This refinement takes some time to run. It also introduces sequences that repeat within a protein (but are still unique to that protein) as duplicate peptides, while the rest of the peptides in the top 20 ranks go missing.

In this example, GTYSTTVTGR is the rank 1 peptide in the protein APOA_Human. When no limit on peptides per protein is applied, only the first instance of this sequence is present in the document; ranks 2-20 are occupied by other peptide sequences, followed by many more peptides. At that point I checked for duplicate/repeated peptides in the document and found none. However, when the limit of 20 peptides is enforced using "Refine", each occurrence of GTYSTTVTGR in the protein is re-ranked and counted as a separate peptide.
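
To make the repeated-sequence situation concrete, here is a minimal Python sketch. It is not Skyline code and does not reproduce Skyline's digestion or ranking logic. It assumes a simplified trypsin rule (cleave after every K or R, ignoring the proline exception), and both the function name tryptic_peptides and the toy protein sequence are hypothetical. The toy sequence is not the real APOA_Human sequence.

```python
from collections import Counter


def tryptic_peptides(sequence):
    """Split a protein sequence after every K or R (simplified trypsin rule)."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal fragment that does not end in K or R.
        peptides.append(sequence[start:])
    return peptides


# Hypothetical toy protein in which GTYSTTVTGR occurs three times.
toy_protein = "MKGTYSTTVTGRLLDKGTYSTTVTGRAAEKGTYSTTVTGRPPA"

peptides = tryptic_peptides(toy_protein)
counts = Counter(peptides)

for sequence, n in counts.items():
    print(sequence, n)
```

In this toy case the digest yields the same sequence at three different positions, so any step that counts each position as its own peptide will use up three slots of a per-protein limit on a single sequence.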

How do I get Skyline to keep only one instance of GTYSTTVTGR as rank 1, with the other peptides ranked 2-20, in the document? Is there also a way to limit the peptides while the library peptides are being added to the document, instead of adding all of them and then removing the excess?

I'm uploading the document "20220712_template.sky.zip" on the filebox.

Best regards,
Liyan

 
 
Nick Shulman responded:  2022-07-12 08:40
For all of the Proteins, Peptides and Precursors in the Targets tree, Skyline keeps track of whether you made a change to the children that the item has, or whether Skyline is managing the children.
You can see what state a protein is in by clicking the inverted triangle which appears when you hover the mouse to the right of a protein name in the Targets tree.
In the child picker that appears, there is a magic wand icon. If that magic wand is selected, then Skyline is responsible for managing the children. If it is unselected, then Skyline will not choose different peptides when the filter options change.

If you do:
Refine > Advanced
and then choose "Auto select all peptides", then it will turn on the magic wand for all of the proteins in the document.

The reason that it takes so long to make a change like that is that Skyline is trying to figure out which peptides are unique. Things will go a lot faster for you if you go to "Settings > Peptide Settings > Digestion" and change "Enforce uniqueness by" to "None".

Yes, it definitely looks silly that Skyline gives that protein 19 copies of the same peptide because that sequence happens to appear that many times within the protein sequence. We have known for a long time that this behavior of duplicating repeated peptides was sub-optimal, but I don't think we have ever seen an example as extreme as this. I will ask around and see if we can fix this.
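
The outcome asked for in the original question (one copy of each sequence, then the top N by rank) can be sketched in Python as follows. This is only an illustration of the desired result under stated assumptions, not how Skyline ranks or filters peptides. The function name limit_peptides, the (rank, sequence) tuple layout, and the example ranks are all hypothetical.

```python
def limit_peptides(ranked_peptides, max_peptides):
    """Keep one copy of each sequence, then the best `max_peptides` by rank.

    `ranked_peptides` is a list of (rank, sequence) tuples; a lower rank
    number means a better peptide.
    """
    seen = set()
    unique = []
    for rank, sequence in sorted(ranked_peptides):
        if sequence not in seen:
            seen.add(sequence)
            unique.append((rank, sequence))
    return unique[:max_peptides]


# Hypothetical candidates for one protein; the repeated sequence appears
# once per occurrence in the protein, each time with the same rank.
candidates = [
    (1, "GTYSTTVTGR"),
    (2, "LLDK"),
    (1, "GTYSTTVTGR"),
    (3, "AAEK"),
    (1, "GTYSTTVTGR"),
    (4, "PPA"),
]

kept = limit_peptides(candidates, 3)
print(kept)
```

Removing duplicates before applying the limit is the key ordering here: if the limit were applied first, the three copies of the repeated sequence would fill all three slots.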

In Skyline-Daily, we have added new features related to assigning peptides to proteins. These features were implemented in order to address some problems that come up when the same peptide can be found in multiple proteins, but they might also help with your scenario here.
You can install Skyline-Daily from here:
https://skyline.ms/project/home/software/Skyline/daily/begin.view?

In the new Skyline-Daily, you might find that the "Refine > Associate Proteins" menu item does a better job of putting peptides where you want them to be.
When I try using the new Associate Proteins feature in Skyline-Daily on your Skyline document, I seem to end up with some proteins that have more than 20 peptides, so I am thinking there must be some bugs in the Limit Peptides Per Protein feature.

-- Nick