DIA-NN .speclib support

DIA-NN .speclib support Tobi  2020-05-27
 

Dear Skyline Team,

could you please consider implementing support for DIA-NN's .speclib spectral libraries? It's a highly convenient tool for predicted libraries and much faster than Prosit.

https://github.com/vdemichev/DiaNN

With best regards,
tobi

 
 
Brian Pratt responded:  2020-05-28

Hi Tobi,

I had a quick read through the DiaNN documentation, but I don't see any information on what that .speclib format looks like. It sounds like they can emit various other formats, though, so perhaps the problem is already solved if Skyline already deals with one of those. If you can provide an example of a .speclib file, that would be helpful in assessing this.

Thanks,

Brian Pratt

 
Tobi responded:  2020-05-28

Dear Brian,

thanks for the fast reply. Please find attached a small spectral library on Pierce Retention time standard peptides (unlabeled) in .speclib and .tsv.

I know Skyline can somehow import .tsv, but it's not an option for large libraries due to RAM usage (forced target list creation as a side process). For that reason, support for .speclib in the same way as .blib would be awesome. DIA-NN has great potential for DIA, especially with Skyline on the side for visualization.

In terms of size, speed, and adjustability, DIA-NN might be preferable to Prosit for predicted libraries, but it can analyse only DIA raw data.

Feel free to let me know if I can provide you with anything else.

Best,
tobi

 
Brian Pratt responded:  2020-05-28

Hi Tobi,

I was hoping that .speclib would be implemented as SQLite, but it's some other db format, so no luck there.
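For anyone who wants to repeat that check on their own files, here is a minimal sketch. It relies only on the documented fact that every SQLite 3 database file starts with the 16-byte string "SQLite format 3" followed by a NUL byte. The function name `looks_like_sqlite` is made up for this example and is not part of Skyline or DIA-NN.

```python
# The fixed 16-byte header that begins every SQLite 3 database file.
SQLITE_MAGIC = b"SQLite format 3\x00"


def looks_like_sqlite(path):
    """Return True if the file at `path` starts with the SQLite 3 header."""
    with open(path, "rb") as handle:
        return handle.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC
```

A .blib file should pass this check, while a file in a custom binary format will not.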

I think I read that they can export the .msp spectral library format; have you tried that? Skyline reads that directly.

Best,
Brian

 
Brendan MacLean responded:  2020-05-28

You might also try either building a .blib directly from the .tsv file (which we implemented in 20.1) or using File > Import > Assay Library with that .tsv file, if you just want to target everything in it. File > Import > Assay Library has received a lot of testing and benefited from many iterations of bug fixing from working with the Aebersold lab and most recently the Guo lab (former Aebersold lab member).

Unless we can get more information on the .speclib format, it seems it must be some custom binary format we are not aware of. Probably best to use one of the available interop formats.

--Brendan

 
Tobi responded:  2020-05-28

Hi Brian,

DIA-NN has experimental support for reading .msp, but exporting libraries as .speclib and .tsv seems to be the only option, based on the documentation and developer comments.

It is also said that it should be technically possible to convert from .speclib or .tsv to .msp, but the implementation would require some work, so I do not expect this to happen soon from DIA-NN's side.

The .speclib seems roughly 12 times more space-efficient than a Prosit .msp (including fragment filtering). In addition, DIA-NN's own format for raw file conversion (.raw, .mzML, .wiff to .dia) can reduce centroided mzML files to half their size. The file formats and related processing speeds alone are surprisingly good and worth a look.

Best,
tobi

 
Brendan MacLean responded:  2020-05-28

It looks like Matt Chambers has been silently working on this since I emailed him about handling it in BlibBuild.exe yesterday.

He just submitted a pull request on GitHub:

https://github.com/ProteoWizard/pwiz/pull/1097

So, it seems likely that there will be support in a Skyline-daily coming soon. You will need to go through the library build interface to convert from .speclib to .blib, our own SQLite based binary format, which we believe is relatively fast and space-efficient, and more approachable (because of the SQLite use) to other developers, than an entirely custom binary format (which we use for our .skyd files and .slc - library cache - files).
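To illustrate the "more approachable" point: because a .blib is an ordinary SQLite database, it can be opened with any SQLite client. The sketch below uses only Python's standard `sqlite3` module and the generic `sqlite_master` catalog, so it assumes nothing about the .blib schema itself. The function name `list_tables` is made up for this example.

```python
import os
import sqlite3


def list_tables(blib_path):
    """Return the sorted table names found in a SQLite-based library file."""
    # sqlite3.connect() would silently create a missing file, so check first.
    if not os.path.isfile(blib_path):
        raise FileNotFoundError(blib_path)
    con = sqlite3.connect(blib_path)
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        con.close()
    return [name for (name,) in rows]
```

Calling `list_tables("my_library.blib")` (a placeholder file name) would show which tables a given library contains before you write any queries against it.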

Keep an eye out for it in the Skyline-daily release notes. Thanks for the feedback.

--Brendan

 
Tobi responded:  2020-05-29

Hi Brendan,

thanks for the feedback, sounds great. It's always nice, and sometimes a big deal, to have everything at least somehow convertible.

Thank you, looking forward to it.

Best,
tobi

 
matt.chambers42 responded:  2020-05-29

DiaNN is, like you said, highly convenient to run, so it makes sense to support it, especially since I was able to just copy his C++ code into BiblioSpec. However, FYI, from my admittedly-not-very-thorough testing, it seems DIA-Umpire -> pseudo-DDA spectra -> DDA search produces a better library than DiaNN. Sometime this year we will have that pipeline implemented in a way that makes it even more highly convenient to use from within Skyline. :)

 
Tobi responded:  2020-06-01

Dear Matt,

thank you very much for your helpful comment. Please don't benchmark DIA-NN too seriously right now, as libraries and search still contain extra nonsense precursors. This will hopefully be fixed soon, further increasing its speed and sensitivity.

Best,
tobi

 
Vadim Demichev responded:  2020-06-05

Matt, thank you so much for implementing .speclib support in ProteoWizard!
Brendan, would be very cool if .speclib could be imported in Skyline, thanks! I know a number of people are using Skyline for visualising things in conjunction with DIA-NN. SQLite-based input/output is definitely planned (I use SQLite to read diaPASEF data anyway), but not in the near future.

Matt, in terms of performance, I would think something must have been wrong with the settings (deep learning not used?). In LFQbench, for example, DIA-NN's library-free mode now yields 100k+ precursors per run (I think it's ~3x higher than the original DIA-Umpire + DDA engine result). This is mainly due to the fact that it is peptide-centric (the PECAN idea) and uses deep learning.

Tobi, working on that :) I am thinking about feeding the peptide characteristics into neural networks, so that they automatically take into account things like the fact that charge 1 peptides with lysines are very unlikely to be detected. If that does not work, will probably add an option to just eliminate them completely, as you suggested. This would definitely be an improvement, but I would not expect any drastic effects (FDR might decrease by ~10%-20%).

Vadim

 
Tobi responded:  2020-06-08

Thank you for implementing .speclib in Skyline-daily.

However, it does not seem to work, at least not via Settings > Peptide Settings: neither Add nor Build library accepts or lists .speclib files.

There is a small test file in the third post from the top.

Could you please have a quick look at that, or let me know if I am doing something wrong?

Best,
tobi

 
Brian Pratt responded:  2020-06-08

Hi Tobi,

You'll need to use Matt's work on BlibBuild to convert speclib to BiblioSpec .blib format, then give that to Skyline. There are only a few formats that Skyline reads directly.

I can see where it would be good if Skyline had logic to try BlibBuild automatically when asked to read a spectral library whose format is not recognized. I'll add a feature request for that.

Best Regards,

Brian Pratt

 
Tobi responded:  2020-06-08

Dear Brian,

it seems I was misled by the recent patch notes for Skyline-daily 20.1.1.155: "Spectral library building for the DiaNN specLib format."

Thank you for adding a request for direct support. I would assume the current way is not really accessible for non-coders? Direct support would be great, especially since both Skyline and DIA-NN are GUI software designed for uncomplicated and easy access.

Thank you and with best wishes,
tobi

 
Brian Pratt responded:  2020-06-08

You should be able to proceed with the tools you have; it's not very complicated.

https://skyline.ms/wiki/home/software/BiblioSpec/page.view?name=BlibBuild

 
matt.chambers42 responded:  2020-06-08

Normally, instead of adding it as a library, you'd import it as a "Peptide Search" (File -> Import -> Peptide Search). But because I forgot to add ".speclib" to the valid file extensions for library search result files within Skyline itself, at the moment you'd have to convert from speclib to blib by running BlibBuild from the command-line.
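For those who would rather script the conversion, here is a minimal sketch that wraps the command-line call. It assumes that `BlibBuild` is on your PATH and that the basic invocation is `BlibBuild <input file> <output .blib>`. Please check the BlibBuild documentation for the exact usage and options of your version. The helper names `blibbuild_command` and `convert_speclib` are made up for this example.

```python
import subprocess
from pathlib import Path


def blibbuild_command(speclib, blib=None, exe="BlibBuild"):
    """Build the argument list for converting a .speclib file to a .blib file.

    Assumption: the tool is invoked as `BlibBuild <input> <output.blib>`.
    """
    speclib = Path(speclib)
    if speclib.suffix.lower() != ".speclib":
        raise ValueError(f"expected a .speclib file, got: {speclib}")
    # Default output: same name as the input, with a .blib extension.
    output = Path(blib) if blib is not None else speclib.with_suffix(".blib")
    return [exe, str(speclib), str(output)]


def convert_speclib(speclib, blib=None, exe="BlibBuild"):
    """Run the conversion and return the path of the resulting .blib file."""
    command = blibbuild_command(speclib, blib, exe)
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
    return command[-1]
```

The resulting .blib can then be added in Skyline like any other BiblioSpec library.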

Brian meant that if opening a library directly failed because it's not a supported format, it would fall back to trying to import it with BlibBuild (as a "peptide search").

Brian/Brendan, shouldn't setting the BuildLibraryDlg to "All Files" disable the IsValidInputFile() validation of the file extensions? The "All Files" option seems completely superfluous otherwise.

 
Brendan MacLean responded:  2020-06-08

We don't really want to just call BlibBuild with whatever file the user wants to supply, but All Files allows the user to see the files in a folder, and maybe understand better why something they expect to see isn't showing up.

Probably, Matt, you should add a big comment in BlibBuild that reminds a developer to add a newly supported file format to Skyline, or figure out a way to either test that this is the case or build the list Skyline uses during the BiblioSpec build... or something.
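As a rough illustration of the "test that this is the case" idea, the sketch below compares two extension lists and reports what is missing from the second. It is purely hypothetical: the function name and the example lists are invented here, and the real lists live in the BiblioSpec and Skyline sources rather than in Python.

```python
def missing_extensions(builder_extensions, ui_extensions):
    """Return extensions the builder accepts but the UI list does not contain.

    Comparison is case-insensitive, and a leading dot is added if absent.
    """
    def normalize(extensions):
        return {
            ext.lower() if ext.startswith(".") else "." + ext.lower()
            for ext in extensions
        }

    return sorted(normalize(builder_extensions) - normalize(ui_extensions))


# Hypothetical lists, for illustration only.
builder = [".speclib", ".pep.xml", ".sqt"]
ui = [".PEP.XML", "sqt"]
print(missing_extensions(builder, ui))
```

A test that asserts this list is empty would fail whenever a format is added on the builder side but not on the UI side.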

For the time being we don't add new formats that often, and we remain vulnerable to the mistake you made. It will be fixed in the next Skyline-daily, which can be relatively soon.

 
Tobi responded:  2020-06-09

Dear all,

thank you very much for the extensive effort. Looking forward to the direct support of .speclib since Skyline and DIA-NN are a really sweet combination and will be used as such by lots of people.

With best wishes,
tobi