Build spectral libraries using a large number of DDA files

benoit fatou  2018-12-04
 

Hi all,

I am trying to analyze DIA data and then extract peak areas for peptides/proteins of interest.
I would like to use a spectral library I generated from a large number of DDA files (about 1300 files).
When I was building it in Skyline, I received an error message saying I had reached the maximum number of spectrum files, which is 500.

I am contacting you to ask whether it is possible to increase the number of DDA files Skyline can handle for spectral library generation.

Thank you very much for your help !

Best regards,
Benoit

 
 
Nick Shulman responded:  2018-12-04

We certainly could increase that limit.

I am not sure exactly why that limitation exists.
I think it might have been added in 2013 so that if you were using BiblioSpec's own spectrum sequence list format, you would get an error if you accidentally mixed up your column headers.

One way to work around this limitation is to create separate .blib files for subsets of your search results, each containing fewer than 500 files. Then you can merge all of those .blib files together.

In Skyline, when you are building a spectral library, you can specify as input either peptide search result files, or you can specify other .blib files that you have already built, and they will all get merged together.
-- Nick
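
To illustrate the batching step in the workaround above, here is a minimal sketch, assuming one search result file per DDA run. The file names, the `part_NN.blib` naming, and the `split_into_batches` helper are hypothetical and not part of Skyline or BiblioSpec; the script only plans the batches and does not build anything.

```python
from typing import List, Sequence


def split_into_batches(paths: Sequence[str], max_per_batch: int = 499) -> List[List[str]]:
    """Split paths into consecutive batches of at most max_per_batch items."""
    if max_per_batch < 1:
        raise ValueError("max_per_batch must be at least 1")
    return [list(paths[i:i + max_per_batch]) for i in range(0, len(paths), max_per_batch)]


# Hypothetical file names standing in for about 1300 search result files.
search_results = [f"run_{i:04d}.pep.xml" for i in range(1300)]
batches = split_into_batches(search_results)

for n, batch in enumerate(batches, start=1):
    # Each batch would be built into its own partial library (name is hypothetical);
    # the partial .blib files are then given together as input to a final build.
    print(f"part_{n:02d}.blib <- {len(batch)} files")
```

With 1300 inputs and a batch size of 499, this yields three batches, each safely under the 500-file limit.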

 
Brendan MacLean responded:  2018-12-04

It was actually added to protect against some cases we saw that ended up causing BiblioSpec to mistakenly interpret every spectrum as coming from a different file. For instance, pepXML frequently has the format:

<spectrum_query spectrum="example.00214.00214.1"...

Where "example" is the basename of the searched file (e.g. example.mzXML)

The trailing numbers are interpreted as start-scan.end-scan.charge-state. If BiblioSpec encounters this format in a place where it expects to find only the basename or filename of the source file, it will end up thinking every spectrum match comes from a different file. I think that was actually true for the original pepXML implementation, where the spectra might have come from DTA files, one per spectrum.

This had undesirable consequences in the case where we saw it. They took a while to notice, and then more time to trace to their cause. So we implemented what we thought was a reasonable upper limit to help us flag the issue in the future.
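
The failure mode described above can be sketched roughly as follows. This is not BiblioSpec's actual code; the regular expression, the helper names, and the `MAX_SOURCE_FILES` constant are assumptions made for illustration, based only on the `basename.start-scan.end-scan.charge-state` layout mentioned in this post.

```python
import re
from typing import Iterable, Set

# Matches "<basename>.<start-scan>.<end-scan>.<charge-state>" (assumed layout).
SPECTRUM_ID = re.compile(r"^(?P<base>.+)\.(?P<start>\d+)\.(?P<end>\d+)\.(?P<charge>\d+)$")

MAX_SOURCE_FILES = 500  # the limit discussed in this thread


def source_basename(spectrum_id: str) -> str:
    """Return the basename part of a spectrum ID, or the ID unchanged if it does not match."""
    m = SPECTRUM_ID.match(spectrum_id)
    return m.group("base") if m else spectrum_id


def distinct_sources(ids: Iterable[str], strip_scan_suffix: bool) -> Set[str]:
    """Collect the set of source names, optionally stripping the scan/charge suffix."""
    return {source_basename(s) if strip_scan_suffix else s for s in ids}


# 1000 spectra that all come from the same searched file, "example".
ids = [f"example.{scan:05d}.{scan:05d}.1" for scan in range(1, 1001)]

naive = distinct_sources(ids, strip_scan_suffix=False)   # whole ID taken as a file name
parsed = distinct_sources(ids, strip_scan_suffix=True)   # suffix stripped first

for label, sources in (("naive", naive), ("parsed", parsed)):
    flagged = len(sources) > MAX_SOURCE_FILES
    print(f"{label}: {len(sources)} source file(s), over limit: {flagged}")
```

When the whole ID is taken as a file name, every spectrum looks like its own source file and the count quickly passes the limit, which is the kind of situation the check was meant to flag.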

It should be relatively easy to raise that limit to 2000, which will hopefully give us at least another 5 years before we see either issue again.

Thanks for posting to the Skyline support board. Good luck with your very large library.

--Brendan

 
benoit fatou responded:  2018-12-04

Thanks Nick and Brendan for your answers.

If I understand correctly, I need to create sub-libraries using BiblioSpec, each from at most 500 DDA files, and then merge them in Skyline to generate the final spectral library.
However, I was wondering how long it will take you to raise the file limit.

Thank you very much,
Best,
Benoit

 
benoit fatou responded:  2018-12-12

Hi all,

I am following up on Nick's answer about my problem generating a spectral library with BiblioSpec from a large number of DDA files.
I tried to build a library from the first 500 raw files with my MaxQuant output, but it did not work because the maximum limit of 500 spectrum source files was exceeded. I could split the MaxQuant search into batches of 500 DDA files, but that will take some time.
I was wondering if there is a faster way to solve this problem, for example by increasing the maximum number of files.

Thanks,
Best,
Benoit

 
Brendan MacLean responded:  2018-12-12

Hi Benoit,
We will try to get a fix out in Skyline-daily by early next week. I am at Duke University teaching this week. Sorry for the delay.

--Brendan

 
benoit fatou responded:  2018-12-12

Hi Brendan,

Thank you very much for your help.
Let me know when it is available.

Best,
Benoit

 
Brendan MacLean responded:  2018-12-16

A change to increase the limit from 500 to 2000 will be in the next Skyline-daily, early next week.

https://github.com/ProteoWizard/pwiz/pull/383

Thanks for your feedback.

--Brendan