DIA-NN .speclib support Tobi  2020-05-27
 

Dear Skyline Team,

could you please consider implementing support for DIA-NN's .speclib spectral libraries? It's a highly convenient tool for predicted libraries and much faster than Prosit.

https://github.com/vdemichev/DiaNN

With best regards,
tobi

 
 
Brian Pratt responded:  2020-05-28

Hi Tobi,

I had a quick read through the DiaNN documentation, but I don't see any information on what the .speclib format looks like. It sounds like they can emit various other formats, though, so perhaps the problem is already solved if Skyline already deals with one of those. If you can provide an example of a .speclib file, that would be helpful in assessing this.

Thanks,

Brian Pratt

 
Tobi responded:  2020-05-28

Dear Brian,

thanks for the fast reply. Please find attached a small spectral library on Pierce Retention time standard peptides (unlabeled) in .speclib and .tsv.

I know Skyline can import .tsv, but that is not an option for large libraries because of RAM usage (a target list is forcibly created as a side process). For that reason, support for .speclib in the same way as .blib would be awesome. DIA-NN has great potential for DIA, especially with Skyline on the side for visualization.

In terms of size, speed, and adjustability, DIA-NN might be preferable to Prosit for predicted libraries, but it can analyse only DIA raw data.

Feel free to let me know if I can provide you with anything else.

Best,
tobi

 
Brian Pratt responded:  2020-05-28

Hi Tobi,

I was hoping that .speclib would be implemented as SQLite, but it's some other db format, so no luck there.
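For anyone who wants to repeat that check on their own library files, here is a minimal sketch. It relies only on the documented fact that every SQLite 3 database file begins with the 16-byte string "SQLite format 3" followed by a NUL byte; the function name and the example path are illustrative, not part of Skyline or DIA-NN.

```python
# First 16 bytes of every SQLite 3 database file.
SQLITE_MAGIC = b"SQLite format 3\x00"


def looks_like_sqlite(path):
    """Return True if the file at `path` starts with the SQLite 3 header."""
    with open(path, "rb") as handle:
        return handle.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC


# Example call (hypothetical path):
# looks_like_sqlite("pierce_rt_standard.speclib")
```

A .blib file should pass this check, since it is SQLite based, while a custom binary format will not.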

I think I read that they can export the .msp spectral library format. Have you tried that? Skyline reads that directly.

Best,
Brian

 
Brendan MacLean responded:  2020-05-28

You might also try either building a .blib directly from the .tsv file (which we implemented in 20.1) or using File > Import > Assay Library with that .tsv file, if you just want to target everything in it. File > Import > Assay Library has received a lot of testing and benefited from many iterations of bug fixing from working with the Aebersold lab and most recently the Guo lab (former Aebersold lab member).

Unless we can get more information on the .speclib format, it seems it must be some custom binary format we are not aware of. Probably best to use one of the available interop formats.

--Brendan

 
Tobi responded:  2020-05-28

Hi Brian,

DIA-NN has experimental support for reading .msp, but exporting libraries as .speclib and .tsv seems to be the only option, based on the documentation and the developer's comments.

It is also said that it should be technically possible to convert from .speclib or .tsv to .msp, but the implementation would require some work, so I do not expect this to happen soon from DIA-NN's side.

The .speclib seems roughly 12 times more space-efficient than a Prosit .msp (including fragment filtering). In addition, DIA-NN's own format for raw file conversion (.raw, .mzML, .wiff to .dia) can reduce centroided mzMLs to half their size. The file formats and related processing speeds alone are surprisingly good and worth a look.

Best,
tobi

 
Brendan MacLean responded:  2020-05-28

It looks like Matt Chambers has been silently working on this since I emailed him about handling it in BlibBuild.exe yesterday.

He just submitted a pull request on GitHub:

https://github.com/ProteoWizard/pwiz/pull/1097

So, it seems likely that there will be support in a Skyline-daily coming soon. You will need to go through the library build interface to convert from .speclib to .blib, our own SQLite based binary format, which we believe is relatively fast and space-efficient, and more approachable (because of the SQLite use) to other developers, than an entirely custom binary format (which we use for our .skyd files and .slc - library cache - files).

Keep an eye out for it in the Skyline-daily release notes. Thanks for the feedback.

--Brendan

 
Tobi responded:  2020-05-29

Hi Brendan,

thanks for the feedback, sounds great. It's always nice, and sometimes a big deal, to have everything at least somehow convertible.

Thank you, looking forward to it.

Best,
tobi

 
matt.chambers42 responded:  2020-05-29

DiaNN is, like you said, highly convenient to run, so it makes sense to support it, especially since I was able to just copy his C++ code into BiblioSpec. However, FYI, from my admittedly-not-very-thorough testing, it seems DIA-Umpire -> pseudo-DDA spectra -> DDA search produces a better library than DiaNN. Sometime this year we will have that pipeline implemented in a way that makes it even more highly convenient to use from within Skyline. :)

 
Tobi responded:  2020-06-01

Dear Matt,

thank you very much for your helpful comment. Please don't benchmark DIANN too seriously right now, as libraries and search still contain extra nonsense precursors. This will hopefully be fixed soon, further increasing its speed and sensitivity.

Best,
tobi

 
Vadim Demichev responded:  2020-06-05

Matt, thank you so much for implementing .speclib support in ProteoWizard!
Brendan, would be very cool if .speclib could be imported in Skyline, thanks! I know a number of people are using Skyline for visualising things in conjunction with DIA-NN. SQLite-based input/output is definitely planned (I use SQLite to read diaPASEF data anyway), but not in the near future.

Matt, in terms of performance, I would think something must have been wrong with the settings (deep learning not used?). In LFQbench, for example, DIA-NN's library-free mode now yields 100k+ precursors per run (I think it's ~3x higher than the original DIA-Umpire + DDA engine result). This is mainly due to the fact that it is peptide-centric (PECAN idea) and to the use of deep learning.

Tobi, working on that :) I am thinking about feeding the peptide characteristics into neural networks, so that they automatically take into account things like the fact that charge 1 peptides with lysines are very unlikely to be detected. If that does not work, will probably add an option to just eliminate them completely, as you suggested. This would definitely be an improvement, but I would not expect any drastic effects (FDR might decrease by ~10%-20%).

Vadim

 
Tobi responded:  2020-06-08

Thank you for implementing .speclib in Skyline daily,

however, it does not seem to work, at least not via Settings / Peptide Settings: neither adding nor building a library accepts or lists .speclib files.

There is a small test file in the third post from the top.

Could you please have a quick look at that or let me know if I am doing something wrong?

Best,
tobi

 
Brian Pratt responded:  2020-06-08

Hi Tobi,

You'll need to use Matt's work on BlibBuild to convert speclib to BiblioSpec .blib format, then give that to Skyline. There are only a few formats that Skyline reads directly.

I can see where it would be good if Skyline had logic to try BlibBuild automatically when asked to read a spectral library whose format is not recognized. I'll add a feature request for that.

Best Regards,

Brian Pratt

 
Tobi responded:  2020-06-08

Dear Brian,

it seems I was misled by the recent patch notes for Skyline-daily 20.1.1.155: "Spectral library building for the DiaNN specLib format."

Thank you for adding a request for direct support. I would assume the current way is not really accessible for non-coders? Direct support would be great, especially since both Skyline and DIA-NN are software with GUIs designed specifically for uncomplicated and easy access.

Thank you and with best wishes,
tobi

 
Brian Pratt responded:  2020-06-08

You should be able to proceed with the tools you have; it's not very complicated.

https://skyline.ms/wiki/home/software/BiblioSpec/page.view?name=BlibBuild
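
As a rough illustration of that command-line step, here is a minimal sketch that assembles and runs the conversion. It assumes BlibBuild is on the PATH and follows its usual calling convention of input file(s) followed by the output library name; the file names are placeholders, and any additional options should be taken from the page linked above rather than from this sketch.

```python
import shutil
import subprocess


def blibbuild_command(speclib_path, blib_path, exe="BlibBuild"):
    """Return the argument list: BlibBuild <input file> <output .blib>."""
    return [exe, speclib_path, blib_path]


def convert_speclib(speclib_path, blib_path, exe="BlibBuild"):
    """Run BlibBuild, raising if it is not installed or if it fails."""
    if shutil.which(exe) is None:
        raise FileNotFoundError(f"{exe} was not found on the PATH")
    subprocess.run(blibbuild_command(speclib_path, blib_path, exe), check=True)


# Placeholder file names, for illustration only:
# convert_speclib("my_library.speclib", "my_library.blib")
```

The resulting .blib can then be added in Skyline like any other BiblioSpec library.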

 
matt.chambers42 responded:  2020-06-08

Normally, instead of adding it as a library, you'd import it as a "Peptide Search" (File -> Import -> Peptide Search). But because I forgot to add ".speclib" to the valid file extensions for library search result files within Skyline itself, at the moment you'd have to convert from speclib to blib by running BlibBuild from the command-line.

Brian meant that if opening a library directly failed because it's not a supported format, Skyline would fall back to trying to import it with BlibBuild (as a "peptide search").

Brian/Brendan, shouldn't setting the BuildLibraryDlg to "All Files" disable the IsValidInputFile() validation of the file extensions? The "All Files" option seems completely superfluous otherwise.

 
Brendan MacLean responded:  2020-06-08

We don't really want to just call BlibBuild with whatever file the user wants to supply, but All Files allows the user to see the files in a folder, and maybe understand better why something they expect to see isn't showing up.

Probably, Matt, you should add a big comment in BlibBuild that reminds a developer to add a newly supported file format to Skyline, or figure out a way to either test that this is the case or build the list Skyline uses during the BiblioSpec build... or something.

For the time being we don't add new formats that often, and we remain vulnerable to the mistake you made. It will be fixed in the next Skyline-daily, which can be relatively soon.

 
Tobi responded:  2020-06-09

Dear all,

thank you very much for the extensive effort. Looking forward to the direct support of .speclib since Skyline and DIA-NN are a really sweet combination and will be used as such by lots of people.

With best wishes,
tobi

 
f capuano responded:  2021-05-18

Hi All,

Very useful thread.

As a follow-up to this post, I have a question about the library generated by DIA-NN 1.7.15 and its use in combination with Skyline.

I want to use a library generated using DIA-NN to confirm the IDs of my target peptides. In order to do that, I have searched 4 data files using DIA-NN and imported the library .tsv file into Skyline as an assay library. Then I imported, as results, the same files used for generating the library, to extract the target peptide peaks.

I observe that the RTs for my target peptides in the library differ from the RTs of the peaks selected by Skyline. Comparing the transition ranking and ppm, the peaks selected by Skyline seem to be correct. How do I reconcile the difference in RT? Is it possible that I am using a library that is predicted rather than experimental?

Unfortunately I have no HL that I can use to pin down RT for my target peptides.

Any advice highly appreciated.

Floriana

 
Vadim Demichev responded:  2021-05-18

Hi Floriana,

DIA-NN saves 'predicted iRT' retention times in the library. These are not the experimental times, but experimental times translated to some reference scale. To save experimental times, please use the '--out-measured-rt' option. Another workaround is to map library entries to the entries in DIA-NN's main report, and get RT values from there. Another option is to look at peptide spectra using the --vis command in DIA-NN (please see the readme file), which basically exports chromatograms for the selected peptides and their fragments.

Best,
Vadim

 
Brendan MacLean responded:  2021-05-18

Hi Vadim,
Great to have the lead author of the DIA-NN paper on this thread. One thing we have done with other tools like EncyclopeDIA and OpenSWATH is to use the actual integration boundaries they came up with and not just the detection RT (presumably the apex or central retention time of the integrated peak). If there were a way to get these from DIA-NN, we could do the same for it. Skyline now has a checkbox in its library setup form for whether integration boundaries in the library should be applied to the data in Skyline.

It is obviously extremely helpful to have the measured RT at the center or apex of the peak, but having the integrated range can give a researcher an even clearer idea of how the tool came up with the peak areas it did.

Thanks for considering the ideal integration between DIA-NN and Skyline for visualization of your results.

--Brendan

 
Vadim Demichev responded:  2021-05-18

Hi Brendan,

Yes, in recent versions there are RT.Start and RT.Stop columns in the main report which correspond to the boundaries DIA-NN uses for quant.

Best,
Vadim

 
Brendan MacLean responded:  2021-05-18

Great news. I will make sure Matt starts storing those in the appropriate location in our BLIB format libraries so that they can be used by Skyline.

Thanks for posting to this support board.

--Brendan

 
Vadim Demichev responded:  2021-05-18

Oh, sorry, I misunderstood: these are only saved in the main report, not in the spectral library. I could also save them in the library; I guess it might be a good idea anyway.

Vadim

 
Brendan MacLean responded:  2021-05-18

Well, let us know when and where to find them, and we will make sure DIA-NN users seeking Skyline visualization support benefit from their availability.

Thanks again.

 
matt.chambers42 responded:  2021-05-18

Yes I was just about to ask about it being in the library. We do currently read the .speclib file. Is the main report actually suitable to use as a replacement? Is that different than the TSV version of the speclib file?

 
Vadim Demichev responded:  2021-05-18

Many thanks Brendan! I will most likely implement saving the RT boundaries in the next version.

 
Vadim Demichev responded:  2021-05-18

Hi Matt, no, the main report does not contain spectra. But it can be used to supplement the information stored in the .tsv library: just need to match the pairs [file name - precursor name] between the report and the library (easy to do in R).
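
The same matching can be sketched in Python with only the standard library. This is an illustration under stated assumptions: RT.Start and RT.Stop are the report columns mentioned earlier in this thread, while the other column names (FileName, transition_group_id, File.Name, Precursor.Id, RT) are assumed and should be checked against the actual files. The peptide names and numbers below are made up for the example.

```python
import csv
import io


def read_tsv(text):
    """Parse tab-separated text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def attach_report_rt(library_rows, report_rows,
                     lib_key=("FileName", "transition_group_id"),
                     rep_key=("File.Name", "Precursor.Id"),
                     rt_fields=("RT", "RT.Start", "RT.Stop")):
    """Copy RT fields from the report onto library rows with the same
    (file name, precursor) pair; unmatched rows get empty strings."""
    by_key = {tuple(row[c] for c in rep_key): row for row in report_rows}
    merged = []
    for row in library_rows:
        hit = by_key.get(tuple(row[c] for c in lib_key))
        out = dict(row)
        for field in rt_fields:
            out[field] = hit[field] if hit is not None else ""
        merged.append(out)
    return merged


# Made-up example data; column names other than RT.Start/RT.Stop are assumptions.
LIBRARY_TSV = (
    "FileName\ttransition_group_id\tProductMz\n"
    "run1.raw\tPEPTIDEK2\t500.25\n"
    "run1.raw\tSAMPLER2\t610.30\n"
)
REPORT_TSV = (
    "File.Name\tPrecursor.Id\tRT\tRT.Start\tRT.Stop\n"
    "run1.raw\tPEPTIDEK2\t23.4\t23.1\t23.8\n"
)

annotated = attach_report_rt(read_tsv(LIBRARY_TSV), read_tsv(REPORT_TSV))
```

Library rows without a matching report entry keep empty RT fields, so they are easy to spot and filter afterwards.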

 
f capuano responded:  2021-05-20

Hi Vadim,

Thank you for suggesting the experimental RT export; it works as a temporary fix. Exciting to see the ID mark aligning with the peak selected by Skyline :)

Thanks to all of you for looking into this. It will be very useful to have all the library info available in a file to upload directly in Skyline.

 
Juan C. Rojas E. responded:  2021-09-24

Hi all,

I would like to revive this thread with the following issue: I was trying to create a spectral library within Skyline (Settings -> Peptide Settings -> Library -> Build...) with the .speclib output of DIANN and I got an error about the .speclib version that is supported (image attached).

From the support threads my understanding is that creating a spectral library in .blib Skyline format from a DIANN .speclib format should be possible now? I can try to find a workaround with other outputs of DIANN, but I was wondering if this is a version incompatibility issue or am I doing something wrong?

I am using Skyline-daily 21.1.1.223 and DIANN 1.8

Sincerely,
JC

 
Brian Pratt responded:  2021-09-24

That sounds like DIANN has changed something in their .speclib format, and BiblioSpec is being appropriately cautious about it. Can you provide an example file?

Best regards,
Brian Pratt

 
Juan C. Rojas E. responded:  2021-09-24

Sure thing. I just uploaded it to the File Sharing dropbox with the name of "ubi_ptm_fig4_spectral_library.speclib".

Would you need anything else or is that enough?

JC

 
Brian Pratt responded:  2021-09-24

That's probably sufficient - we'll look into this, thanks.

Brian
 
matt.chambers42 responded:  2021-09-24

Please share the speclib.tsv as well. From what I can tell (https://github.com/vdemichev/DiaNN/commit/f2c8a78f46520b763c99d8be6cb98504726310f6#commitcomment-55711150) DiaNN has gone closed-source since I added support for reading its proprietary speclib binary format. Since I just copied from DiaNN's C++ source code to read that format, this may be a problem for maintaining that reader with a reasonable amount of effort. IIRC I chose the speclib format because it had spectral peaks and the tsv did not, but I'm not sure that's still the case. If it is, then I may need support from Vadim to tell me which fields are new/changed in the new speclib version(s) so I don't have to invest a lot of time reverse engineering it myself.

 
Juan C. Rojas E. responded:  2021-09-24

Hi Matt,

I just uploaded the corresponding "ubi_ptm_fig4_spectral_library.tsv". That is what I am currently using to create a parser to import an assay library.

Thanks a lot for looking into this.
JC