Recovery of Skyline "Import Results" process after system crash

tryphoncosinus  2023-03-21 14:16
 

Hello,

I am running a DIA search on a large batch using a high-end server. All went fine, but at a certain point during the result import of pepXML/mzML files, the system ran out of memory and a crash occurred.

Is there a way in Skyline to recover a session so that it continues the import process after the last imported result file?

Thank you.

 
 
Nick Shulman responded:  2023-03-21 14:37
When you do "File > Import > Peptide Search", that is approximately the same as:
1. Go to:
Settings > Peptide Settings > Library
and press the "Build" button and build a spectral library
2. Use the menu item:
File > Import > FASTA
to add peptides to your document
3. Use the menu item:
File > Import > Results
to tell Skyline to extract chromatograms from your raw files.

My guess would be that it is not worth trying to recover the partial progress that Skyline made with your previous attempt. Typically, before the computer completely runs out of memory, Skyline slows down an enormous amount so that tasks take thousands of times longer to complete than they would on a computer that was not low on memory. If you can address the problem which caused Skyline to run out of memory, you would be able to complete the whole process of importing all the results in less time than it would take to figure out how to pick up where you left off.

One avoidable thing that can cause Skyline to run out of memory is asking Skyline to import from too many files at once.
When you do "File > Import > Results", or on the "Extract Chromatograms" page of the "Import Peptide Search" wizard, there is a dropdown at the bottom of the dialog: "Files to import simultaneously".
The choices for how many files to import simultaneously are "One at a time", "Several" or "Many".
If you ran out of memory with that set to "Many", things might go much faster if you change that to "Several" or "One at a time".
-- Nick
 
tryphoncosinus responded:  2023-04-03 05:49
I loaded my sky project and I set the required parameters in peptide settings. I did the above steps to generate a blib file.

Also I created a more powerful Windows 11 VM with 92 cores and 750 GB RAM. This VM is stable.

First try.
I used "Import DIA peptide search" to load my data (blib, mzML and FASTA), setting all required parameters with "Files to import simultaneously" set to Many. Around 260 mzML files were imported before the system crashed. Because the system crashed, I had no chance to save my sky project.

Second try.
I did the same thing again with "Files to import simultaneously" set to Several. This time, it took around 5 hours before the "Importing Results" window appeared, which seemed strange. I did not realize immediately that the number of mzML files listed in this window did not match the total number I had selected. Around 90 mzML files were imported, and then I decided to stop the process, thinking I had left out many files by mistake.

Third try.
I deleted all tmp files. I did the same thing again with "Files to import simultaneously" set to Several, taking care to include all the mzML files I wanted to import. This time, it took around 6 hours before the "Importing Results" window appeared. When it did, I checked the number of mzML files in the list on the left, and it was lower again than in the second try.

1) Against all odds, it seems that Skyline remembers which mzML files were already imported. I am confused about what happened and about the validity of the computation across these tries, specifically because you mentioned "it is not worth trying to recover the partial progress that Skyline made". Do you have an explanation?
2) How can I restart the computation so that it includes the whole collection of mzML files I want to analyze?

Because of the processing time, I would prefer option 1), but its validity should be confirmed.

I thank you for your help.
 
Nick Shulman responded:  2023-04-03 10:36
When Skyline finishes extracting chromatograms from a mass spec data file, Skyline saves the chromatograms in a file whose name starts with the name of the mass spec data file and ends with the filename extension ".skyd".
When Skyline finishes extracting chromatograms from all of the mass spec data files, Skyline combines all of the individual .skyd files into a single .skyd file and deletes all of the individual .skyd files.

When you tell Skyline to extract chromatograms, Skyline always checks for the existence of those individual .skyd files, and, if Skyline sees them, Skyline will skip ahead to extract chromatograms from only those files which did not already have .skyd files.
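To see ahead of time which files a restarted import would likely skip, the check described above can be approximated outside Skyline. The Python sketch below is only an illustration: the function name `split_by_skyd`, the two folder arguments and the `*.mzML` pattern are hypothetical, and it assumes that "starts with the name of the mass spec data file" can be tested against the data file's name without its extension. Skyline's real matching logic may differ.

```python
from pathlib import Path


def split_by_skyd(data_dir, skyd_dir, pattern="*.mzML"):
    """Split data files into (done, pending) based on existing .skyd files.

    Assumption: a data file counts as "done" when skyd_dir contains a file
    whose name starts with the data file's stem and ends with ".skyd".
    This mirrors the description in the thread, not Skyline's actual code.
    """
    skyd_names = [
        p.name
        for p in Path(skyd_dir).iterdir()
        if p.is_file() and p.name.lower().endswith(".skyd")
    ]
    done, pending = [], []
    for data_file in sorted(Path(data_dir).glob(pattern)):
        # Plain prefix test on the stem, e.g. "run01" for "run01.mzML".
        if any(name.startswith(data_file.stem) for name in skyd_names):
            done.append(data_file)
        else:
            pending.append(data_file)
    return done, pending
```

Calling `split_by_skyd(mzml_folder, document_folder)` and printing the length of each list gives a quick count to compare with the list in the "Importing Results" window. Because this is a plain prefix test, names such as `run1` and `run10` can be confused with each other, so treat the result as a hint rather than proof.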

-- Nick
 
tryphoncosinus responded:  2023-04-20 15:25
Indeed, Skyline continued the "Importing Results" process up to the end.

At the end, an error window appeared: "Error performing inpage operation". I am not sure what it means.

After saving the project, I got a sky file weighing 56 GB. I can see the graphical results.

Now it is time to export all the analysis results: File > Export > Report
Sadly, after a long computing time, I got a 2-byte csv file that contains ... nothing.
I tried many times and I am not able to get a populated csv table.

I have no idea what to do to get the csv file with the expected data.

Thank you for your help.
 
Nick Shulman responded:  2023-04-20 16:28
As far as we know, there is no limit to the size of a .csv file that you can export as a report from Skyline.
Most text editors would not be able to handle a text file of that size, so I often recommend a program called "emeditor" for looking at really big text files:
https://www.emeditor.com/
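For a quick check that needs no editor at all, a report can also be streamed row by row, so that even a very large file never has to fit in memory. The Python sketch below is a generic illustration and not part of Skyline: the function name `summarize_report` is hypothetical, and it assumes the report is a comma-separated, UTF-8 encoded file whose first row is the header.

```python
import csv
import os


def summarize_report(path, preview_rows=5):
    """Stream a CSV report and return (size_bytes, header, row_count, preview).

    Assumptions: comma-separated, UTF-8 (with or without a BOM), and the
    first row is the header. Only preview_rows data rows are kept in memory.
    """
    size_bytes = os.path.getsize(path)
    header, preview, row_count = None, [], 0
    with open(path, newline="", encoding="utf-8-sig") as handle:
        for row in csv.reader(handle):
            if header is None:
                header = row  # first row is treated as the column names
                continue
            row_count += 1
            if len(preview) < preview_rows:
                preview.append(row)
    return size_bytes, header, row_count, preview
```

A result with a size of a few bytes, an empty or missing header and zero data rows would confirm that the export itself produced nothing, as opposed to the file merely being too large to display.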

If you would like to learn more about making your own custom reports, you should look at the Custom Reports tutorial:
https://skyline.ms/wiki/home/software/Skyline/page.view?name=tutorial_custom_reports

If you would like to compare groups of your samples to each other (e.g. healthy vs diseased), you should look at the Group Comparison tutorial:
https://skyline.ms/wiki/home/software/Skyline/page.view?name=tutorial_grouped

I do not know exactly what "error performing inpage operation" means, but it sounds like it could be caused by either a hard drive or a network error.
-- Nick