Issue 939: Off-by-one in Enzyme.GetMatches() dealing with protein C-terminus

Status:closed
Assigned To:Guest
Type:Defect
Area:Skyline
Priority:3
Milestone:23.1
Opened:2023-02-19 15:23 by Brendan MacLean
Changed:2023-02-19 15:54 by Brendan MacLean
Resolved:2023-02-19 15:54 by Brendan MacLean
Resolution:By Design
Closed:2023-02-19 15:54 by Brendan MacLean
2023-02-19 15:23 Brendan MacLean
Title»Off-by-one in Enzyme.GetMatches() dealing with protein C-terminus
Assigned To»Brian Pratt
Type»Defect
Area»Skyline
Priority»3
Milestone»23.1
If you set Peptide Settings - Filter - Exclude N-terminal AAs to 0, and then paste the following FASTA into Skyline:

>sp|P01880|IGHD_HUMAN Immunoglobulin heavy constant delta OS=Homo sapiens OX=9606 GN=IGHD PE=1 SV=3
RSLEVSYVTDHGPMK

You get the following in the Targets view:
>sp|P01880|IGHD_HUMAN
    R.SLEVSYVTDHGPMK.- [2, 15]

Instead of the expected:

>sp|P01880|IGHD_HUMAN
    R.SLEVSYVTDHGPM.K [2, 14]

Skyline does not allow cleaving of the C-terminal AA from the protein because of this condition in Enzyme.GetMatches():

if (_cleavageC != null && startat < len - 1 && // Never matches the end

This was discovered because it is in conflict with DIA-NN results. I am posting it as a clear issue report in case there is some reason not to change this to "startat < len" and remove the comment. You can see that the N-terminal AA can be cleaved off.
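
To make the effect of the two bounds concrete, here is a minimal Python sketch of C-terminal cleavage. It is not Skyline's C# code: the function names, the `allow_end` flag, and the trypsin-like rule "cut after K or R" are assumptions made for illustration, and no peptide length filters or missed cleavages are modeled. `allow_end=False` mimics `startat < len - 1`, and `allow_end=True` mimics the proposed `startat < len`.

```python
def c_terminal_cleavage_sites(sequence, cleave_after="KR", allow_end=False):
    """Return 0-based indices of residues after which the enzyme cuts.

    allow_end=False mirrors a bound of len - 1 (the last residue never matches);
    allow_end=True mirrors a bound of len (the last residue may match).
    """
    limit = len(sequence) if allow_end else len(sequence) - 1
    return [i for i in range(limit) if sequence[i] in cleave_after]


def digest(sequence, cleave_after="KR", allow_end=False):
    """Split a sequence at its cleavage sites.

    Returns (begin, end, peptide) tuples with 1-based inclusive positions,
    matching the [2, 15] style used in the Targets view.
    """
    peptides = []
    start = 0
    for site in c_terminal_cleavage_sites(sequence, cleave_after, allow_end):
        peptides.append((start + 1, site + 1, sequence[start:site + 1]))
        start = site + 1
    # Whatever remains after the last cut runs to the protein C-terminus.
    if start < len(sequence):
        peptides.append((start + 1, len(sequence), sequence[start:]))
    return peptides


if __name__ == "__main__":
    protein = "RSLEVSYVTDHGPMK"
    for allow_end in (False, True):
        print(allow_end, digest(protein, allow_end=allow_end))
```

In this simplified model both bounds yield the same peptides for the example protein, R [1, 1] and SLEVSYVTDHGPMK [2, 15]. A match on the final K only marks the protein's own C-terminus, so it does not create a peptide ending at M.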

The code for an N-terminal protease does not have the same issue (thankfully). If I define an Enzyme that cleaves at SM and I allow 1 missed cleavage, I can get:

>sp|P01880|IGHD_HUMAN
    R.SLEVSYVTDHGPM.K [2, 14] (missed 1)

2023-02-19 15:54 Brendan MacLean
resolve as By Design
Status: open»resolved
Assigned To: Brian Pratt»Brendan MacLean
Doh! That is not right. It seems like DIA-NN must be at fault. I got flipped around.

2023-02-19 15:54 Brendan MacLean
close
Status: resolved»closed
Assigned To: Brendan MacLean»Guest