Duplicate/repeated peptides

sandra maass  2015-07-03 00:28
 
Hi,

I'm currently working on a project where the "Refine" options really help me a lot. I have a question about them: what exactly is the difference between duplicate and repeated peptides?

Thank you in advance
Sandra
 
 
Brendan MacLean responded:  2015-07-03 00:47
Hi Sandra,
Great question.

The Edit > Refine > Remove Duplicate Peptides option will remove all peptides that appear multiple times in your document. This can be of some help in limiting your document to only unique peptides, but of course this is not checking the background proteome for uniqueness. So, even with this, you should make extra effort to ensure uniqueness within the sample you are measuring, if that is what you want to achieve.

The Edit > Refine > Remove Repeated Peptides option will remove all but the first instance of any peptide in your document so that there will be only one instance of each peptide in the document, with no guarantees whatsoever of uniqueness in your sample.

If your document contains peptides that appear multiple times, then the first option should result in fewer peptides than the latter, since the latter leaves 1 copy of every peptide, while the former removes all copies.
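
To make the distinction concrete, here is a minimal Python sketch of the two behaviors as described above. It is an illustration only, not Skyline's code: the function names and the example peptide list are made up, and peptides are compared as plain sequence strings, which is an assumption.

```python
from collections import Counter


def remove_duplicate_peptides(peptides):
    """Drop every copy of any peptide that occurs more than once."""
    counts = Counter(peptides)
    return [p for p in peptides if counts[p] == 1]


def remove_repeated_peptides(peptides):
    """Keep only the first occurrence of each peptide, in document order."""
    seen = set()
    kept = []
    for p in peptides:
        if p not in seen:
            seen.add(p)
            kept.append(p)
    return kept


# Hypothetical document: peptide sequences in document order.
document = ["PEPTIDEK", "ELVISK", "PEPTIDEK", "SAMPLER", "ELVISK", "PEPTIDEK"]

print(remove_duplicate_peptides(document))  # ['SAMPLER']
print(remove_repeated_peptides(document))   # ['PEPTIDEK', 'ELVISK', 'SAMPLER']
```

In this example the duplicate option leaves 1 peptide and the repeated option leaves 3, in line with the first option always leaving fewer peptides when anything appears more than once.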

Thanks for posting your question to the support board. Good luck with your method design and refinement.

--Brendan
 
Ilker Sen responded:  2015-07-14 07:27
Hi Brendan,

I noticed that in 3.1 "Remove Repeated Peptides" behaves differently than in previous versions. Previously, when the same peptide appeared in different proteins, 1 instance of the peptide was kept in each of those proteins. In 3.1, it behaves more like the Remove Duplicate Peptides option: only 1 instance is left in the entire list. Just wondering if this is a bug or not?

Cheers,
Ilker
 
Brendan MacLean responded:  2015-07-14 07:40
Hi Ilker,
I think you are mistaken. Remove Repeated Peptides has always removed all but the first appearance of a peptide in the document, whereas Remove Duplicate Peptides has always removed all instances of any peptide that appears multiple times in the document.

If you can supply a document where it behaves differently in an older version, please do and then be specific about what you see changing, but I don't think we have changed anything in this area. Nor do I think the code for remove repeated peptides ever left more than a single copy of any peptide in the document.

--Brendan
 
Ilker Sen responded:  2015-07-14 08:28
I may be mistaken about the previous version. I attached screenshots of the current behavior, which is exactly what you describe above. Is there any way to filter peptides to get the desired outcome shown in the screenshot, i.e. leave only 1 copy of each peptide per protein?
 
Brendan MacLean responded:  2015-07-14 08:47
Hi Ilker,
There is no such function in Skyline, nor has there ever been. It is not usually such a big problem. Are you seeing this case a lot? In the screenshots you provide, you are actually showing peptide lists, and not proteins derived from FASTA sequences. If you are really creating your own peptide lists with your own names, as in the example, using Edit > Insert > Peptides, then you could simply use Excel and its Remove Duplicates function on your peptide and protein name columns, before you paste those into the Edit > Insert > Peptides form.
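
For anyone who prefers a script to Excel, the same clean-up can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the function name is hypothetical, and the rows are assumed to be tab-separated with the peptide sequence in the first column and the protein name in the second, so adjust the column order to match your own list.

```python
def dedupe_peptide_rows(text):
    """Keep the first occurrence of each (peptide, protein name) pair."""
    seen = set()
    kept = []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("\t")]
        if len(fields) < 2 or not fields[0]:
            continue  # skip blank or incomplete rows
        key = (fields[0], fields[1])  # assumed order: peptide, protein name
        if key not in seen:
            seen.add(key)
            kept.append("\t".join(key))
    return "\n".join(kept)


# Hypothetical peptide list with one row repeated within ProteinA.
rows = "\n".join([
    "PEPTIDEK\tProteinA",
    "ELVISK\tProteinA",
    "PEPTIDEK\tProteinA",
    "PEPTIDEK\tProteinB",
])

print(dedupe_peptide_rows(rows))
```

Because the key is the peptide and protein name together, a peptide shared by two proteins is kept once under each of them, and only repeats within the same protein are dropped.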

But maybe I am missing some broader case where actual FASTA sequence proteins contain multiple copies of peptides. No one has ever mentioned this before.

Thanks for your clarification.

--Brendan
 
Ilker Sen responded:  2015-07-14 09:03
I do see it often, although it is probably relevant to my specific application and not broadly applicable. For instance, we have two proteins in a sample that are homologous. Thus, we see many peptides that are shared between the proteins. I would like to retain the peptide under each of the protein entries in Skyline so that the protein coverage remains intact, and so that in the future, when we search for this particular protein in Panorama, we see all of its matching peptides (even though some of those may belong to another protein in that sample).

I should mention that this is not critical; I would see this as a "nice to have" feature.

Thanks for your input,
Ilker
 
Brendan MacLean responded:  2015-07-14 10:04
Hi Ilker,
To clarify, you are saying that you see many cases where a single protein sequence contains the same peptide multiple times? My understanding is that the feature you are describing would only remove the second instance of any peptide that appears multiple times in a protein. It would otherwise leave all peptides that appear only once in any given protein. In other words, it would be the Remove Repeated Peptides feature, but limited in scope to each protein rather than applying document-wide (perhaps "Remove Repeated within Protein Peptides").

Do I have that right?

--Brendan
 
Ilker Sen responded:  2015-07-14 10:52
Yes, that's exactly right.
 
k valgepea responded:  2017-08-30 21:01
Hi Brendan!

It seems that the "Remove Duplicate Peptides" function does not work for some reason in my Skyline as it does not remove any peptides while "Remove Repeated Peptides" removes 200. Is that somehow possible?

Thanks!
 
clichti responded:  2017-08-31 09:17
That sounds impossible to me. I would expect that "Remove Duplicate Peptides" would always remove more peptides than "Remove Repeated Peptides" (up to twice as many), since the former is supposed to remove all occurrences of peptides that appear more than once in the document, while the latter is supposed to leave the first occurrence and remove only subsequent occurrences.

If you could post your document either to this thread, if it is smallish, or to

http://skyline.ms/files.url

we would be happy to take a closer look to understand the cause.

Thanks for reporting it to the Skyline support board.

--Brendan
 
k valgepea responded:  2017-08-31 17:53
Thanks for the quick reply!

I found that the reason for what I saw is probably a software bug. Everything makes sense when you use Edit > Refine > Remove Repeated Peptides or Edit > Refine > Remove Duplicate Peptides, but no peptides are removed when you go through Edit > Refine > Advanced and check "Remove duplicate peptides", which is what I had used (for some reason I had not noticed that you can do it straight from Refine). So there is probably a bug there: ticking the box and clicking OK does not do anything (the repeated option works there as well). I have attached a screenshot of the Skyline version I am using.

Cheers!
 
Brendan MacLean responded:  2017-08-31 19:22
Ha! Nice one. I was a little dubious that something so simple could make it by us for so long, but you are absolutely right.

Thankfully, the fix is also a one-line change. As it turns out, even from Edit > Refine > Advanced you can remove duplicate peptides by checking both checkboxes. It is only when you check "Remove duplicate peptides" alone that nothing happens.

Glad you found one of the workarounds. Thanks for reporting it so clearly. We will fix it in an upcoming release.

--Brendan