Failure to Import Large Spectral Library File Due to Insufficient Memory?

inewman1  2023-04-11 17:43
 

Hello,
I'm trying to import a fairly large spectral library (~10 gigabytes, with slightly more than 20 million precursor peptides) that was generated in silico and formatted as a .blib file, just like the kind Skyline's built-in Prosit API generates. When I load this library, Skyline successfully builds the .slc binary file but then throws the error shown in the attached screenshot. For context, Skyline works flawlessly on a simple toy library containing a few thousand peptides, which, coupled with the error message, leads me to believe this is some sort of memory issue. Is there any way to overcome this, whether by changing Skyline's settings or by somehow (further) compressing the .blib file?

P.S.
The spectra are already compressed using zlib as per the .blib file format specifications:

https://raw.githubusercontent.com/ProteoWizard/pwiz/master/pwiz_tools/BiblioSpec/tests/reference/tables.check
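
For readers who want to check how peaks are stored, here is a minimal Python sketch of decoding one m/z blob. It assumes the m/z values are 64-bit little-endian doubles and that a blob whose length already equals numPeaks * 8 bytes was stored uncompressed; both points should be confirmed against the specification linked above. The function name `decode_mz_blob` is hypothetical.

```python
import struct
import zlib


def decode_mz_blob(blob, num_peaks):
    """Return the m/z values stored in one peak blob as a list of floats.

    Assumption: values are little-endian 64-bit doubles, and a blob that
    already has the expected raw size was stored without compression.
    """
    expected_size = num_peaks * 8
    if len(blob) != expected_size:
        # Not the raw size, so treat the blob as zlib-compressed.
        blob = zlib.decompress(blob)
    if len(blob) != expected_size:
        raise ValueError(
            "expected %d bytes, got %d" % (expected_size, len(blob))
        )
    return list(struct.unpack("<%dd" % num_peaks, blob))
```

Because the spectra are already zlib-compressed, compressing the .blib file a second time is unlikely to shrink it much.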

 
 
Nick Shulman responded:  2023-04-11 18:32
Can you send us that file "lib.blib"?
You can upload the file here:
https://skyline.ms/files.url

I think that error might be happening because Skyline is trying to put more than a few hundred million items into the same list in memory.
This limit is not really related to the amount of memory in the computer. In C#, the programming language that Skyline is written in, lists are not allowed to contain more than a few hundred million entries, so in the places where there might be more entries than that, we need to make sure Skyline spreads them across multiple lists.
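
The idea of spreading entries across multiple lists can be sketched as follows. This is only an illustration in Python, not Skyline's actual C# code; the class name `ChunkedList` and its `chunk_size` parameter are hypothetical.

```python
class ChunkedList:
    """Append-only sequence that stores its items in fixed-size chunks.

    No single underlying list ever holds more than chunk_size items,
    so the total count can exceed any per-list limit.
    """

    def __init__(self, chunk_size):
        if chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        self.chunk_size = chunk_size
        self.chunks = []
        self.count = 0

    def append(self, item):
        # Start a new chunk when there is none yet or the last one is full.
        if not self.chunks or len(self.chunks[-1]) == self.chunk_size:
            self.chunks.append([])
        self.chunks[-1].append(item)
        self.count += 1

    def __len__(self):
        return self.count

    def __getitem__(self, index):
        if not 0 <= index < self.count:
            raise IndexError(index)
        # Map the flat index to (chunk number, position inside the chunk).
        chunk, offset = divmod(index, self.chunk_size)
        return self.chunks[chunk][offset]
```

Callers still index the collection with a single flat integer; only the storage is split.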

After we see your library it will probably be straightforward for us to fix this.
It is always very informative for us to see very large datasets like this and we usually find many more things that can be improved in Skyline.
-- Nick
 
inewman1 responded:  2023-04-12 09:51
Hello Nick,
Thanks for the rapid response. I went ahead and uploaded a copy of the lib.blib file (renamed to blib-lib.blib) earlier this morning so hopefully it's now accessible to you.
Thanks,
Ian
 
Nick Shulman responded:  2023-04-12 09:53
Thank you for uploading that .blib file. I will try to figure out what is going on.

I see that your library has 20 million rows in the "RefSpectra" table. I would have expected that number of entries to easily fit inside of a C# list, so I am not sure what exactly is causing this failure.
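
Since a .blib file is a SQLite database, the row count can be checked without opening Skyline. A minimal Python sketch, where the helper name `count_ref_spectra` is hypothetical:

```python
import sqlite3


def count_ref_spectra(blib_path):
    """Return the number of rows in the RefSpectra table of a .blib file."""
    conn = sqlite3.connect(blib_path)
    try:
        (row_count,) = conn.execute(
            "SELECT COUNT(*) FROM RefSpectra"
        ).fetchone()
        return row_count
    finally:
        conn.close()
```

For example, `count_ref_spectra("blib-lib.blib")` would report the number of library entries in the uploaded file.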

By the way, I see that many of the modified peptide sequences in the library contain "(UniMod:121)".
Skyline expects modifications in library sequences to be written as mass offsets inside square brackets. So, since "(UniMod:121)" is the GlyGly modification, it should be replaced by "[+114.042927]".
-- Nick
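
The sequence rewrite suggested above could be scripted along these lines. This is a sketch, not a Skyline tool: the function name `to_mass_offsets` and the lookup table are hypothetical, the table holds only the GlyGly entry (Unimod accession 121, monoisotopic mass 114.042927 Da), and the number of decimal places Skyline needs is an assumption.

```python
import re

# Hypothetical lookup table: Unimod accession -> monoisotopic mass delta (Da).
# Only GlyGly is listed; extend it for other modifications in the library.
UNIMOD_MASS_DELTAS = {121: 114.042927}

# Matches "(UniMod:121)" and case variants such as "(Unimod:121)".
_UNIMOD_TAG = re.compile(r"\(UniMod:(\d+)\)", re.IGNORECASE)


def to_mass_offsets(mod_seq, deltas=UNIMOD_MASS_DELTAS):
    """Rewrite "(UniMod:N)" tags as signed mass offsets in square brackets."""

    def replace_tag(match):
        accession = int(match.group(1))
        if accession not in deltas:
            raise KeyError("no mass delta known for UniMod:%d" % accession)
        # The "+" format flag writes an explicit sign, e.g. "+114.042927".
        return "[{:+.6f}]".format(deltas[accession])

    return _UNIMOD_TAG.sub(replace_tag, mod_seq)
```

The function would be applied to each modified sequence stored in the library before the file is re-imported; unmodified sequences pass through unchanged.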