import of parquet spectral library from DIANN 2.0 and above into skyline dkueltz  2025-04-15 22:52
 

Since version 2.0, DIA-NN no longer generates speclib files for its spectral libraries but instead only parquet files. I used to be able to import the speclib libraries into Skyline, but since they are no longer generated by DIA-NN I wonder how I can import the parquet libraries into Skyline. The direct link from DIA-NN to Skyline (the Skyline button in DIA-NN) never worked for me, but importing the DIA-NN generated spectral libraries worked well until DIA-NN version 2.0, when the speclib libraries stopped being generated. The DIA-NN developers told me that Skyline will soon have parquet import capability - is this already a function that is available in Skyline Daily? If so, where can I find it?
Thanks much,
Dietmar

 
 
Mike MacCoss responded:  2025-04-15 23:30

Hi Dietmar,
Yes, Skyline Daily version 25.1 supports import of DIA-NN parquet files. If you start Skyline Daily, it should prompt you to upgrade.

There have also been significant performance improvements to this import. The easiest way to do this is to use File > Import > Peptide Search. Point the Spectral Library UI to the parquet.skyline.speclib file. By default DIA-NN calls this report-lib.parquet.skyline.speclib. This file gets written if you have MBR turned on or use --reanalyse from the command line.

[Screenshot: DIANN-Import UI]

Hope this helps,
Mike
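
For anyone scripting this step, here is a minimal sketch of how one might check that the file to point Skyline at actually exists. It assumes the naming pattern seen in this thread (a main report named X.parquet goes with X-lib.parquet.skyline.speclib); the helper names are hypothetical and not part of Skyline or DIA-NN.

```python
from pathlib import Path
from typing import Optional

# Suffix taken from the file names quoted in this thread; treat it as an
# assumption about DIA-NN's naming, not a documented guarantee.
SPECLIB_SUFFIX = "-lib.parquet.skyline.speclib"


def expected_speclib(report: Path) -> Path:
    """Return the speclib path implied by the report's file name."""
    # Path.stem drops only the final ".parquet" extension.
    return report.with_name(report.stem + SPECLIB_SUFFIX)


def speclib_for_import(report: Path) -> Optional[Path]:
    """Return the expected speclib if it exists on disk, else None."""
    candidate = expected_speclib(report)
    return candidate if candidate.is_file() else None
```

If speclib_for_import returns None, the likely explanation from this thread is that the search ran without MBR or on a single file.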

 
dkueltz responded:  2025-04-17 10:15

Thanks much, Mike, for the helpful reply! It seems that the initial version 2.0 of DIA-NN no longer generated the speclib files that version 1.92 produced, only the parquet files. I just downloaded DIA-NN version 2.1, and in that version the speclib files are generated again. At any rate, it works well now. Thanks for all your work on Skyline and Panorama - these are key tools for our proteomics workflows, along with DIA-NN.

 
Todd responded:  2025-05-23 08:25

Hello,

This thread is relevant to an issue I've had with library building support with DIA-NN 2.1. As Dietmar suggested, now that DIA-NN 2.1 creates the speclib format again, Skyline 25.1 recognizes this library file, and I can build the Blib library. However, the peptide RT values annotated in the Skyline library are on the iRT scale; they are taken directly from the DIA library and not from the main report, which contains values for both RT and iRT. Should Skyline libraries be using iRT or RT values?

When I use this library to analyze files, the RT annotation for the library matches is absent. In this experiment, I didn't set up an RT predictor because I didn't have many targets. Is one required for proper peak picking when the library has iRT values?

Best,
Todd

 
Nick Shulman responded:  2025-05-23 08:42
Todd,

Can you send us your Skyline document?
In Skyline, you can use the menu item:
File > Share
to create a .zip file containing your Skyline document and supporting files including extracted chromatograms and spectral libraries.

Files which are less than 50MB can be attached to these support requests. You can always upload larger files here:
https://skyline.ms/files.url

In a .blib spectral library, retention time values can be found in the "RefSpectra" table as well as the "RetentionTimes" table. I would hope that the values in the "RetentionTimes" table would be actual retention time values corresponding to the files that they are supposed to be from. The retention time values in the "RefSpectra" table are not used by Skyline and could be in an arbitrary scale and would not cause any problems for Skyline.
-- Nick
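
Since a .blib file is an SQLite database, the two tables mentioned above can be inspected directly. The following is a rough sketch using Python's standard sqlite3 module; it deliberately assumes nothing about column names and simply reports whatever columns and first rows each table holds. The function names are made up for illustration.

```python
import sqlite3
from pathlib import Path


def peek_table(conn, table, limit=5):
    """Return (column_names, first_rows) for one table of an open .blib."""
    # Table names cannot be bound as SQL parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    cur = conn.execute(f"SELECT * FROM {quoted} LIMIT ?", (limit,))
    columns = [d[0] for d in cur.description]
    return columns, cur.fetchall()


def summarize_blib(path, tables=("RefSpectra", "RetentionTimes")):
    """Open a .blib read-only and peek at the tables named in this thread."""
    # mode=ro avoids creating an empty database if the path is mistyped.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        return {t: peek_table(conn, t) for t in tables}
    finally:
        conn.close()
```

Calling summarize_blib on the .blib that Skyline built should show at a glance whether the values in "RetentionTimes" look like minutes from the run or like an arbitrary scale.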
 
Todd responded:  2025-05-23 09:09
Hi Nick,
Thank you for the quick reply. Please see attached Skyline document.
Todd
 
Nick Shulman responded:  2025-05-23 09:28
Todd,

I see that you used "File > Import > Assay Library" to import a TSV file into Skyline.
The usual way to get DIA-NN results into Skyline would be to use the "File > Import > Peptide Search" menu item and then point Skyline at a .speclib file.
When you import DIA-NN search results that way, Skyline uses the peak boundaries that DIA-NN chose.

I am not sure exactly what you get when you do "File > Import > Assay Library" (I'm not as familiar with that feature in Skyline).
-- Nick
 
Todd responded:  2025-05-23 09:55
Hi Nick,

Thanks for catching that, I forgot I had to use that function. I used it because I got an error when trying to build the Blib library from the speclib file. See below. I'm not sure which of the ERRORs is the root cause, but I can confirm that the peptide AAAGELQEDSGLC(UniMod:4)VLAR2 is present in the parquet that is being referenced. I'm not sure if it matters, but the first ERROR mentions "precursorId", while the column in my parquet is "Precursor.Id".

If you have any thoughts on these errors, that would be helpful.

Todd
---------------------------
Skyline-daily
---------------------------
ERROR: could not find precursorId 'AAAGELQEDSGLC(UniMod:4)VLAR2' in speclib; is 'report_HumanExp_jck.parquet' the correct report TSV file?
ERROR: reading file report_HumanExp_jck-lib.parquet.skyline.speclib
ERROR: boost::filesystem::remove: The process cannot access the file because it is being used by another process [system:32]: "E:\Data\_CHDI\20250418_hCSF_diapasef\hCSF_DIANN21_DNARHfiltered_wizard.redundant.blib"

Command-line: C:\Users\enoni\AppData\Local\Apps\2.0\R8KJE96M.0AL\5P814CJB.DXZ\skyl..tion_9286511f3362df93_0019.0000_b3f34f3a54bf06ec\BlibBuild -s -A -H -o -c 0.95 -i hCSF_DIANN21_DNARHfiltered_wizard -S "E:\Data\_CHDI\20250418_hCSF_diapasef\hCSF_DIANN21_DNARHfiltered_wizard.redundant202505231141.stdin.txt" "E:\Data\_CHDI\20250418_hCSF_diapasef\hCSF_DIANN21_DNARHfiltered_wizard.redundant.blib"
Working directory: E:\Data\_CHDI\20250418_hCSF_diapasef
Exit code: 1
---------------------------
OK More Info
---------------------------
Skyline-daily (64-bit) 25.0.9.131 (23f06989d4)

System.IO.IOException: ERROR: could not find precursorId 'AAAGELQEDSGLC(UniMod:4)VLAR2' in speclib; is 'report_HumanExp_jck.parquet' the correct report TSV file?
ERROR: reading file report_HumanExp_jck-lib.parquet.skyline.speclib
ERROR: boost::filesystem::remove: The process cannot access the file because it is being used by another process [system:32]: "E:\Data\_CHDI\20250418_hCSF_diapasef\hCSF_DIANN21_DNARHfiltered_wizard.redundant.blib"

Command-line: C:\Users\enoni\AppData\Local\Apps\2.0\R8KJE96M.0AL\5P814CJB.DXZ\skyl..tion_9286511f3362df93_0019.0000_b3f34f3a54bf06ec\BlibBuild -s -A -H -o -c 0.95 -i hCSF_DIANN21_DNARHfiltered_wizard -S "E:\Data\_CHDI\20250418_hCSF_diapasef\hCSF_DIANN21_DNARHfiltered_wizard.redundant202505231141.stdin.txt" "E:\Data\_CHDI\20250418_hCSF_diapasef\hCSF_DIANN21_DNARHfiltered_wizard.redundant.blib"
Working directory: E:\Data\_CHDI\20250418_hCSF_diapasef
Exit code: 1 ---> System.IO.IOException: ERROR: could not find precursorId 'AAAGELQEDSGLC(UniMod:4)VLAR2' in speclib; is 'report_HumanExp_jck.parquet' the correct report TSV file?
ERROR: reading file report_HumanExp_jck-lib.parquet.skyline.speclib
ERROR: boost::filesystem::remove: The process cannot access the file because it is being used by another process [system:32]: "E:\Data\_CHDI\20250418_hCSF_diapasef\hCSF_DIANN21_DNARHfiltered_wizard.redundant.blib"

Command-line: C:\Users\enoni\AppData\Local\Apps\2.0\R8KJE96M.0AL\5P814CJB.DXZ\skyl..tion_9286511f3362df93_0019.0000_b3f34f3a54bf06ec\BlibBuild -s -A -H -o -c 0.95 -i hCSF_DIANN21_DNARHfiltered_wizard -S "E:\Data\_CHDI\20250418_hCSF_diapasef\hCSF_DIANN21_DNARHfiltered_wizard.redundant202505231141.stdin.txt" "E:\Data\_CHDI\20250418_hCSF_diapasef\hCSF_DIANN21_DNARHfiltered_wizard.redundant.blib"
Working directory: E:\Data\_CHDI\20250418_hCSF_diapasef
Exit code: 1

Output:
Reading results from report_HumanExp_jck-lib.parquet.skyline.speclib.
Read 6167 entries from speclib.
Reading report headers.
Reading 6171 rows from report.
ERROR: could not find precursorId 'AAAGELQEDSGLC(UniMod:4)VLAR2' in speclib; is 'report_HumanExp_jck.parquet' the correct report TSV file?
ERROR: reading file report_HumanExp_jck-lib.parquet.skyline.speclib
100%
ERROR: boost::filesystem::remove: The process cannot access the file because it is being used by another process [system:32]: "E:\Data\_CHDI\20250418_hCSF_diapasef\hCSF_DIANN21_DNARHfiltered_wizard.redundant.blib"

 ---> System.IO.IOException: ERROR: could not find precursorId 'AAAGELQEDSGLC(UniMod:4)VLAR2' in speclib; is 'report_HumanExp_jck.parquet' the correct report TSV file?
ERROR: reading file report_HumanExp_jck-lib.parquet.skyline.speclib
ERROR: boost::filesystem::remove: The process cannot access the file because it is being used by another process [system:32]: "E:\Data\_CHDI\20250418_hCSF_diapasef\hCSF_DIANN21_DNARHfiltered_wizard.redundant.blib"

Command-line: C:\Users\enoni\AppData\Local\Apps\2.0\R8KJE96M.0AL\5P814CJB.DXZ\skyl..tion_9286511f3362df93_0019.0000_b3f34f3a54bf06ec\BlibBuild -s -A -H -o -c 0.95 -i hCSF_DIANN21_DNARHfiltered_wizard -S "E:\Data\_CHDI\20250418_hCSF_diapasef\hCSF_DIANN21_DNARHfiltered_wizard.redundant202505231141.stdin.txt" "E:\Data\_CHDI\20250418_hCSF_diapasef\hCSF_DIANN21_DNARHfiltered_wizard.redundant.blib"
Working directory: E:\Data\_CHDI\20250418_hCSF_diapasef
Exit code: 1
   at pwiz.Common.SystemUtil.ProcessRunner.Run(ProcessStartInfo psi, String stdin, IProgressMonitor progress, IProgressStatus& status, TextWriter writer, ProcessPriorityClass priorityClass, Boolean forceTempfilesCleanup, Func`3 outputAndExitCodeAreGoodFunc, Boolean updateProgressPercentage) in C:\proj\skyline_25_1\pwiz_tools\Shared\CommonUtil\SystemUtil\ProcessRunner.cs:line 206
   --- End of inner exception stack trace ---
   --- End of inner exception stack trace ---
   at pwiz.Common.SystemUtil.ProcessRunner.ThrowExceptionWithOutput(Exception exception, String output) in C:\proj\skyline_25_1\pwiz_tools\Shared\CommonUtil\SystemUtil\ProcessRunner.cs:line 266
   at pwiz.Common.SystemUtil.ProcessRunner.Run(ProcessStartInfo psi, String stdin, IProgressMonitor progress, IProgressStatus& status, TextWriter writer, ProcessPriorityClass priorityClass, Boolean forceTempfilesCleanup, Func`3 outputAndExitCodeAreGoodFunc, Boolean updateProgressPercentage) in C:\proj\skyline_25_1\pwiz_tools\Shared\CommonUtil\SystemUtil\ProcessRunner.cs:line 248
   at pwiz.BiblioSpec.BlibBuild.BuildLibrary(LibraryBuildAction libraryBuildAction, IProgressMonitor progressMonitor, IProgressStatus& status, String& commandArgs, String& messageLog, String[]& ambiguous) in C:\proj\skyline_25_1\pwiz_tools\Shared\BiblioSpec\BlibBuild.cs:line 493
   at pwiz.Skyline.Model.Lib.BiblioSpecLiteBuilder.BuildLibrary(IProgressMonitor progress) in C:\proj\skyline_25_1\pwiz_tools\Skyline\Model\Lib\BiblioSpecLiteBuilder.cs:line 152
---------------------------
 
Nick Shulman responded:  2025-05-23 10:11
That error "is 'report_HumanExp_jck.parquet' the correct report TSV file?" is usually caused by BiblioSpec finding the wrong .tsv (or I guess BiblioSpec supports .parquet now too) file to go with the .speclib that you chose.
DIA-NN makes it very easy to put the results from multiple peptide searches into the same folder.
When BiblioSpec builds a spectral library from a .speclib file, BiblioSpec gets information about m/z's and intensities from the .speclib file, but BiblioSpec needs to find information about scores and retention times from a report TSV file.
BiblioSpec has heuristics for finding the correct TSV file to go with the .speclib file, but if BiblioSpec chooses the wrong TSV file, that TSV file might be missing information for some of the peptides in the .speclib and you will get the error you are seeing.

You might be able to get this to work by deleting the file "report_HumanExp_jck.parquet", in which case BiblioSpec would have to use a different file which might be the correct one.
If that does not work you might need to run your DIA-NN search again and make sure the output folder does not contain results from any other searches.
-- Nick
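
One quick way to see whether a folder holds more than one possible report is to list the files that share a report-like extension. This sketch does not reproduce BiblioSpec's heuristics, which are not described here; it only shows which files could be candidates. The helper names and the extension set are assumptions for illustration.

```python
from pathlib import Path, PurePath

# Extensions mentioned in this thread as possible report formats.
REPORT_EXTENSIONS = {".tsv", ".parquet"}


def report_candidates(filenames):
    """Return, sorted, the names whose last extension looks like a report."""
    return sorted(
        name for name in filenames
        if PurePath(name).suffix.lower() in REPORT_EXTENSIONS
    )


def scan_folder(folder):
    """Apply report_candidates to the files directly inside `folder`."""
    return report_candidates(
        p.name for p in Path(folder).iterdir() if p.is_file()
    )
```

Note that a parquet library such as report-lib.parquet also ends in .parquet, so it shows up in the list next to the main report. More than the expected entries in the result is a hint that outputs from several searches share the folder.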
 
Todd responded:  2025-05-23 11:07
Hi Nick,

Thanks for the suggestions, but still no luck.

I moved all files from the analysis into one directory, and that generated the same error message.
I deleted the parquet files, and that produced a different error saying that no associated report file could be found for the speclib.
I also re-ran the analysis. This is when I remembered that since I was analyzing a single file, DIANN does not automatically generate the "..skyline.speclib" file, only the parquet library. I generated the speclib file in a separate step by loading the report-lib.parquet into DIANN without any data, and it generated the speclib. Perhaps this is at the core of the issue? Do you have any thoughts, or should I inquire on the DIA-NN support forum?

Todd
 
Nick Shulman responded:  2025-05-23 11:14
Can you zip up all the files you have and upload them here:
https://skyline.ms/files.url

This is the correct forum to be asking questions about importing DIA-NN results into Skyline.
-- Nick
 
Todd responded:  2025-05-23 11:21
I have uploaded a file containing the skyline share zip and all DIA-NN outputs, hCSF_DIANN21_DNARHfiltered_wizard.zip
Todd
 
Mike MacCoss responded:  2025-05-23 12:07
DIA-NN will only generate the Skyline.speclib file when you have multiple files and "MBR" checked.

Mike
 
Matt Chambers responded:  2025-05-23 12:19
One issue is that it really does seem like a bug that DIANN doesn't generate a .speclib file when it's given a single input file.

Read 6167 entries from speclib.
Reading report headers.
Reading 6171 rows from report.

The row count difference is suspicious. The speclib should always have more rows than the report, I think. The error is saying it can't find that precursorId in the speclib, not that it can't find it in the parquet.

But when I run the files you uploaded I get a different error and row counts:

Reading results from c:\test\issues\diann2-parquet\report_HumanExp_jck-lib.parquet.skyline.speclib.
Read 6213 entries from speclib.
Reading report headers.
Reading 6171 rows from report.
ERROR: could not find precursorId 'AIALDPR2' in speclib; is 'report_HumanExp_jck.parquet' the correct report TSV file?
ERROR: reading file c:\test\issues\diann2-parquet\report_HumanExp_jck-lib.parquet.skyline.speclib

And indeed, that precursor is not in the speclib, strongly suggesting that the parquet did not come from that speclib. Can you try analyzing 2 files so you get a speclib by default and see if the issue goes away?
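
The check described above amounts to a set difference between the precursor IDs in the report and those in the library. A minimal sketch follows, assuming the two ID lists have already been extracted; how to read the IDs out of a .speclib is left open here, and the function name is hypothetical.

```python
def missing_precursors(report_ids, library_ids):
    """Return report precursor IDs with no entry in the library, sorted."""
    return sorted(set(report_ids) - set(library_ids))


# Assumption: with pandas and pyarrow installed, the report IDs could be
# loaded from the "Precursor.Id" column mentioned earlier in this thread:
#   pandas.read_parquet("report_HumanExp_jck.parquet",
#                       columns=["Precursor.Id"])["Precursor.Id"]
```

With a report containing 'AIALDPR2' and 'AAAGELQEDSGLC(UniMod:4)VLAR2' and a library containing only the latter, the function returns ['AIALDPR2']. Any non-empty result means the report and the library did not come from the same search.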
 
Mike MacCoss responded:  2025-05-23 13:42
I believe it is intentional. Vadim mentioned that you can only make quantitative comparisons if there are multiple runs. If you have multiple runs, then we should be using the two-step search in DIANN to increase the detection of peptides between runs. In DIANN this requires turning on "MBR", which essentially reduces the library size and leaks information from the search of one replicate to another. So DIANN only outputs the skyline.speclib file if there are multiple runs and either "--reanalyse" is used on the command line or "MBR" is checked in the GUI.
 
Todd responded:  2025-05-23 14:11
To Matt's comment:
Yes, I can confirm that when I generate the skyline.speclib in a workflow with 2+ files (MBR enabled), then it can be used to create a Skyline library with no errors.

To Mike's comment:
This makes sense to me too for quantitative comparisons.

In the rare case of a single file that we want to evaluate quickly, I read on the DIANN forums about converting the parquet lib to a skyline.speclib by loading the parquet library into DIANN without queuing the raw data and enabling "generate spectral library". This does create a skyline.speclib file, but from our testing here, something is not quite right about it, as it is missing precursors that are in the main report. Vadim is working on improved handling in DIA-NN v2.2, so I can mention this to him and see if there are plans to support this rare use case.

Thanks for everyone's troubleshooting and comments.

Todd