Importing out of memory

Importing out of memory lihaikuo  2018-04-17
Hi, I am trying to import results from more than 200 raw MS data files into Skyline, but the import uses more than 30 GB of RAM, especially while the last 40 or 50 raw files are being imported. It is also very slow. (At the beginning of the import, I get a .skyd file in 20 minutes, but by the 170th raw file it takes more than 5 hours.)
Our computer currently has 32 GB of RAM, so it cannot run well.

I am using the latest version of Skyline, and I do not want to separate my files into groups for import.
I want to know how other users solve such problems.

Brendan MacLean responded:  2018-04-17
Hi Haikuo,
For an import this large, you are likely best off using SkylineRunner, which avoids consuming extra memory until the final step of joining all of the individual .skyd files for your 200 raw files. That is where memory consumption will become an issue, but each individual raw data file should import just fine producing a separate .skyd file with little increase in memory use.

Also, when using SkylineRunner, Skyline does not need to maintain the user interface or any undo information, which also reduces memory consumption.

Yes, when you start to approach your memory limit everything is going to slow down because the operating system actually starts swapping memory to disk. The only solution to that is to find a way to use less memory or to find a computer with more memory.

You can also use the --memstamp argument to SkylineRunner to make it output memory use with its logging information, giving us more visibility into your memory consumption over time if you send us the log.
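As a rough illustration of the per-file approach described above, the following Python sketch assembles one SkylineRunner command line per raw file. It is only a sketch under stated assumptions: the document name, the raw file names, and the helpers `build_import_command` and `build_all_commands` are hypothetical. The `--in`, `--save`, and `--memstamp` arguments are the ones mentioned in this thread; `--import-file` is assumed here to be the argument that imports a single results file, so check the Skyline command-line documentation before relying on it.

```python
from typing import Iterable, List


def build_import_command(runner: str, document: str, raw_file: str) -> List[str]:
    """Build the argument list for importing one raw file into a document."""
    return [
        runner,
        f"--in={document}",            # Skyline document to open
        f"--import-file={raw_file}",   # assumed flag for importing one results file
        "--memstamp",                  # include memory use in the log output
        "--save",                      # save the document after the import
    ]


def build_all_commands(runner: str, document: str,
                       raw_files: Iterable[str]) -> List[List[str]]:
    """Build one command per raw file, in sorted order for reproducible logs."""
    return [build_import_command(runner, document, raw) for raw in sorted(raw_files)]


if __name__ == "__main__":
    # Hypothetical names, used only to show the shape of the commands.
    commands = build_all_commands(
        "SkylineRunner.exe", "study.sky", ["run_002.raw", "run_001.raw"]
    )
    for command in commands:
        print(" ".join(command))
```

The sketch only prints the commands. Each list could be passed to `subprocess.run` and its output redirected to a log file, which would give a memory-stamped log that can be sent to the support board.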

What kind of experiment is this? How many transitions does your document contain? I would guess DIA and over 100,000 transitions, given the experiments I have done myself, but if you have only a few thousand transitions, then something else may be wrong, and we would be very interested in understanding why memory consumption is so high. What kind of raw data files are they? (Thermo? SCIEX? Waters? etc.)

You can find helpful starting scripts for processing large data sets with SkylineRunner in resources for the Skyline Tutorial Webinars on processing large-scale DIA:

Thanks for reporting your issue to the Skyline support board.

lihaikuo responded:  2018-04-17
Thanks a lot for your quick reply.

SkylineRunner is working and now it looks good.
Which version of Skyline does SkylineRunner run? I have both Skyline and Skyline-daily on my computer, as well as some old versions of Skyline; I did not uninstall them when I downloaded new versions.

Yes, I have more than 120,000 transitions. I have also added decoys, so the total number of transitions is over 250,000.
The raw data files are from a Thermo Orbitrap Fusion. The average size of each raw file is around 800 MB.

Nick Shulman responded:  2018-04-18
SkylineRunner.exe always runs the regular Skyline (which is currently Skyline 4.1).
If you want to run Skyline-Daily, then you need to download a different file called "SkylineDailyRunner.exe" (which you can download from the Skyline-Daily page).

SkylineRunner.exe finds Skyline by looking through your Start Menu in Windows.

You probably only have one version of Skyline and one version of Skyline-Daily on your machine. When you download a new version of Skyline, it usually replaces the previous version. It is actually quite difficult to install more than one version of Skyline on your computer.

lihaikuo responded:  2018-04-18
Thanks for your reply.

Actually, I do have two Skyline versions on my computer: 3.7 and 4.1.
When I open a .sky file, it opens in version 3.7, as shown in the attachment.
Every time I want to use version 4.1, I first have to find Skyline.exe in the Start Menu and then open the .sky file.

So how can I check whether my SkylineRunner is running version 4.1?

Nick Shulman responded:  2018-04-18
SkylineRunner.exe actually looks on your Start Menu to figure out where Skyline has been installed.

In your Start Menu, you should have a folder called "Skyline".
SkylineRunner.exe is going to run the first program called "Skyline".

I am not sure what the easiest way is to figure out which version of Skyline is being executed by SkylineRunner.
If you tell SkylineRunner to save a .sky file, then you can open the .sky file in a text editor such as Notepad, and the first lines will tell you which version of Skyline created the .sky file.

Brendan MacLean responded:  2018-04-18
Just use SkylineRunner to open a file (--in) and save it (--save). Then open the .sky file in a text editor and look at what version of Skyline saved the file.
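The text-editor check described above can also be done programmatically. The following is a minimal Python sketch, assuming the saved .sky file is XML whose root element carries the version attributes; the helper name `read_sky_header` and the file name in the usage example are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def read_sky_header(path: str) -> Dict[str, str]:
    """Return the attributes of the root element of a .sky file.

    Only the start of the file is parsed, so large documents are not
    loaded into memory in full.
    """
    with open(path, "rb") as handle:
        # "start" events fire as soon as an opening tag is seen, so the
        # first event is the root element with its attributes.
        for _event, element in ET.iterparse(handle, events=("start",)):
            return dict(element.attrib)
    raise ValueError(f"no XML elements found in {path}")


if __name__ == "__main__":
    # Hypothetical file name: a document saved by SkylineRunner with --save.
    header = read_sky_header("study.sky")
    for name, value in header.items():
        print(f"{name} = {value}")
```

This prints every attribute found on the root element rather than interpreting them, since how the attribute values map to a Skyline release is not something the sketch tries to decide.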

lihaikuo responded:  2018-04-18
Thank you very much.

I opened the .sky file in Notepad, and the first 3 lines show:

<?xml version="1.0" encoding="utf-8"?>
<srm_settings format_version="3.73" software_version="Skyline (64-bit)">
  <settings_summary name="Default">

So I am confused about which version it is.

Nick Shulman responded:  2018-04-18
That's Skyline version 4.1.

software_version="Skyline (64-bit)"

If it were Skyline-Daily, the third number in the version number would be a "1" instead of a "0" (e.g.

lihaikuo responded:  2018-04-24
Thanks a lot.

After importing DIA data, we usually perform the steps below before we reintegrate and train:

Advanced--result--min peak found ratio=0.01--remove empty peptides and empty proteins.
Advanced--document--min transition precursor=6--remove empty peptides and empty proteins.

Can these steps be run with SkylineRunner? Each of the steps above is still very slow.

Brendan MacLean responded:  2018-12-27
Kind of late on the reply, I know, but, yes, we are going to add refinement to the command-line interface. You can track this issue.

We expect to be able to release a Skyline-daily with this functionality in January.

Thanks for your feedback.