int32 overflow during mProphet Model Training Failure on 3.7.1.11357

wbarshop  2017-09-11 13:37
 
Hello Skyline team,

After updating to the newest Skyline-Daily release, I'm having trouble training mProphet peak picking models on some of my DIA data. The resultant error only leaves me more puzzled, but in the end looks to be an int32 overflow.

First, the error:
[2017/09/11 13:14:51] Calculating peak group scores
...
... (truncated for space)
[2017/09/11 13:16:28] 100%
[2017/09/11 13:16:31] Error: Failed to create scoring model.
[2017/09/11 13:16:31] System.IO.InvalidDataException: Insufficient target peaks (240199 with 281448 decoys) detected at 15% FDR to continue training.
   at pwiz.Skyline.Model.Results.Scoring.MProphetPeakScoringModel.CalculateWeights(String documentPath, ScoredGroupPeaksSet targetTransitionGroups, ScoredGroupPeaksSet decoyTransitionGroups, Boolean includeSecondBest, Boolean nonParametricPValues, Double qValueCutoff, Double[] weights, Double& decoyMean, Double& decoyStdev, Boolean& colinearWarning) in c:\proj\pwiz_x64\pwiz_tools\Skyline\Model\Results\Scoring\MProphetScoringModel.cs:line 405
   at pwiz.Skyline.Model.Results.Scoring.MProphetPeakScoringModel.<>c__DisplayClass7.<Train>b__4(MProphetPeakScoringModel im) in c:\proj\pwiz_x64\pwiz_tools\Skyline\Model\Results\Scoring\MProphetScoringModel.cs:line 255
   at pwiz.Common.SystemUtil.Immutable.ChangeProp[TIm](TIm immutable, SetLambda`1 set) in c:\proj\pwiz_x64\pwiz_tools\Shared\Common\SystemUtil\Immutable.cs:line 201
   at pwiz.Skyline.Model.Results.Scoring.MProphetPeakScoringModel.Train(IList`1 targetsIn, IList`1 decoysIn, LinearModelParams initParameters, Nullable`1 iterations, Boolean includeSecondBest, Boolean preTrain, IProgressMonitor progressMonitor, String documentPath) in c:\proj\pwiz_x64\pwiz_tools\Skyline\Model\Results\Scoring\MProphetScoringModel.cs:line 196
   at pwiz.Skyline.CommandLine.CreateScoringModel(String modelName, Boolean decoys, Boolean secondBest, Boolean log, Nullable`1 modelIterationCount) in c:\proj\pwiz_x64\pwiz_tools\Skyline\CommandLine.cs:line 1461

=========================================================================
=========================================================================

After seeing this, I was curious what the sanity check in pwiz.Skyline.Model.Results.Scoring.MProphetPeakScoringModel.CalculateWeights actually does.

So, I checked it out on the SourceForge repo (maybe out of date? Not sure if this has the most recent -Daily updates):

// Better to let a really poor model through for the user to see than to give an error message here
if (truePeaks.Count*10*1000 < decoyPeaks.Count) // Targets must be at least 0.01% of decoys (still rejects zero)
    throw new InvalidDataException(string.Format(Resources.MProphetPeakScoringModel_CalculateWeights_Insufficient_target_peaks___0__with__1__decoys__detected_at__2___FDR_to_continue_training_, truePeaks.Count, decoyPeaks.Count, qValueCutoff*100));
if (decoyPeaks.Count*1000 < truePeaks.Count) // Decoys must be at least 0.1% of targets
    throw new InvalidDataException(string.Format(Resources.MProphetPeakScoringModel_CalculateWeights_Insufficient_decoy_peaks___0__with__1__targets__to_continue_training_, decoyPeaks.Count, truePeaks.Count));




From this, it seems that it must be the case that truePeaks.Count*10*1000 is evaluating to be less than decoyPeaks.Count -- curious as the output of the same line from the error at runtime shows the values of truePeaks.Count and decoyPeaks.Count to be 240199 and 281448, respectively.


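Certainly 240199 is not less than 1/10000th of 281448. This looks like an integer overflow: 240199*10*1000 is 2,401,990,000, which is greater than the int32 max of 2,147,483,647. In unchecked 32-bit arithmetic that product wraps around to -1,892,977,296, which is less than 281448, so the check fires.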
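
To make the arithmetic concrete, here is a minimal sketch that reproduces the wraparound with the two counts from the error message. It is not Skyline code: the class and method names are made up for illustration, and it assumes the check runs in C#'s default unchecked context (the `unchecked` keyword just makes that explicit).

```csharp
using System;

public static class OverflowDemo
{
    // Hypothetical helper: mirrors the left-hand side of the check,
    // truePeaks.Count*10*1000, evaluated entirely in 32-bit int arithmetic.
    public static int ScaledTargets(int truePeakCount)
    {
        return unchecked(truePeakCount * 10 * 1000);
    }

    public static void Main()
    {
        int truePeaks = 240199;   // target count from the error message
        int decoyPeaks = 281448;  // decoy count from the error message

        int scaled = ScaledTargets(truePeaks);

        // 2,401,990,000 does not fit in an int, so it wraps by 2^32.
        Console.WriteLine(scaled);               // -1892977296
        Console.WriteLine(scaled < decoyPeaks);  // True, so the exception is thrown
    }
}
```

Any target count above 214,748 (int.MaxValue / 10000) overflows the same multiplication, although the wrapped value is not always small enough to trip the check.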

Would be happy to hear your thoughts!

Cheers,
William
 
 
Brendan MacLean responded:  2017-09-11 14:49
Hi William,
Nice diagnosis. I suspect converting to double before applying the math should work better:

if (((double)truePeaks.Count)*10*1000 < decoyPeaks.Count) // Targets must be at least 0.01% of decoys (still rejects zero)
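
As a quick way to try that suggestion outside Skyline, the sketch below wraps the cast version of the comparison in a small stand-alone method and feeds it the counts from the error message. The class and method names are hypothetical and only the comparison itself comes from the line above.

```csharp
using System;

public static class ThresholdCheckDemo
{
    // Hypothetical stand-in for the first sanity check, using the suggested cast.
    // Casting to double first means the multiplication is done in floating point,
    // where products of this size are represented exactly and cannot wrap.
    public static bool TooFewTargets(int truePeakCount, int decoyPeakCount)
    {
        return ((double)truePeakCount) * 10 * 1000 < decoyPeakCount;
    }

    public static void Main()
    {
        // Counts from the error message: no longer rejected.
        Console.WriteLine(TooFewTargets(240199, 281448)); // False

        // Zero targets are still rejected, as the original comment intends.
        Console.WriteLine(TooFewTargets(0, 281448));      // True

        // Boundary around 0.01% of the decoys: 280000 < 281448 <= 290000.
        Console.WriteLine(TooFewTargets(28, 281448));     // True
        Console.WriteLine(TooFewTargets(29, 281448));     // False
    }
}
```

Casting to `long` instead of `double` would also avoid the overflow for any pair of int counts; which one the released fix uses is not stated here.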

I will try to get a fix released in a Skyline-daily soon. Until then, probably best to roll back to the prior release.

Thanks for working this out and posting the issue here.

--Brendan