Dear Skyline team,
I extracted the RefSpectraPeaks table from a .blib file in Python using the code below and then decoded the peakMZ and peakIntensity columns:
import pandas as pd
import numpy as np
import sqlite3
import zlib
def decode_peaks(binary_data, dtype=np.float64):
    """
    Decode zlib-compressed peak data from Skyline .blib files

    Parameters:
        binary_data: bytes object from peakMZ or peakIntensity column
        dtype: numpy dtype (default: np.float64)

    Returns:
        numpy array of values
    """
    if binary_data is None or len(binary_data) == 0:
        return np.array([])

    # Decompress the zlib-compressed data
    decompressed = zlib.decompress(binary_data)

    # Try float64 first, then float32 if that fails
    try:
        values = np.frombuffer(decompressed, dtype=np.float64)
    except ValueError:
        # If float64 doesn't work, try float32
        values = np.frombuffer(decompressed, dtype=np.float32)
    return values

with sqlite3.connect('input/LIT_GPF_survey_newAlign_MMCC_boundaries_opttrans_nochick.blib') as conn:
    ref_spectra_df = pd.read_sql_query('SELECT * FROM RefSpectra', conn)
    ref_spectra_peaks_df = pd.read_sql_query('SELECT * FROM RefSpectraPeaks', conn)

# Decode the columns
ref_spectra_peaks_df['peakMZ'] = ref_spectra_peaks_df['peakMZ'].apply(decode_peaks)
ref_spectra_peaks_df['peakIntensity'] = ref_spectra_peaks_df['peakIntensity'].apply(decode_peaks)

# Record the length of every m/z and intensity array
len_recorder = []
for _, row in ref_spectra_peaks_df.iterrows():
    len_recorder.append([len(row['peakMZ']), len(row['peakIntensity'])])
The first entries of len_recorder look like this (each pair is [m/z length, intensity length]):
[[775, 775],
[1320, 660],
[724, 362],
[814, 407],
[1083, 1083],
[1170, 585],
[954, 477],
[1022, 511],
[1310, 655],
[1285, 1285],
[900, 450],
[820, 410],
[1273, 1273],
[1381, 1381],
[1134, 567],
[595, 595],
[708, 354],
[1352, 676],
[980, 490],
[830, 415],
[1327, 1327],
[1181, 1181],
[1131, 1131],
[1319, 1319],
[1087, 1087],
[821, 821],
[929, 929], ...]
Some of the m/z and intensity arrays have the same length, while in other cases the m/z array is exactly twice as long as the corresponding intensity array. The .blib file is too big to attach. It is in this project: https://panoramaweb.org/stellar-biofluid-prm.url, under /SkylineFiles/LIT_GPF_survey_newAlign_MMCC_boundaries_opttrans_nochick_2024-06-02_15-26-11/LIT GPF survey newAlign MMCC boundaries opttrans nochick.blib.
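One pattern in the rows listed above: every pair with matching lengths has an odd length, and every pair where the m/z array is twice as long has an even m/z length. That would be consistent with peakMZ holding 64-bit floats and peakIntensity holding 32-bit floats, so that decoding an intensity blob as float64 only fails (and falls back to float32) when the peak count is odd. This is an assumption about the storage layout, not something confirmed here. The sketch below reproduces the pattern on synthetic data only; `make_blobs`, `decode_try_float64_first`, and `decode_explicit` are hypothetical helpers written for this illustration, not part of Skyline or BiblioSpec.

```python
import zlib
import numpy as np

def decode_try_float64_first(blob):
    """Mimic the decoder in this post: float64 first, float32 on ValueError."""
    raw = zlib.decompress(blob)
    try:
        return np.frombuffer(raw, dtype=np.float64)
    except ValueError:
        # Only reached when the byte count is not a multiple of 8
        return np.frombuffer(raw, dtype=np.float32)

def decode_explicit(blob, dtype):
    """Decode with a fixed dtype so the element size is never guessed."""
    return np.frombuffer(zlib.decompress(blob), dtype=dtype)

def make_blobs(num_peaks):
    """Synthetic spectrum. ASSUMPTION: float64 m/z, float32 intensity."""
    mz = np.linspace(100.0, 1500.0, num_peaks).astype(np.float64)
    intensity = np.linspace(1.0, 1000.0, num_peaks).astype(np.float32)
    return zlib.compress(mz.tobytes()), zlib.compress(intensity.tobytes())

lengths_guessing = []
lengths_explicit = []
for n in (775, 1320):  # one odd and one even peak count
    mz_blob, int_blob = make_blobs(n)
    lengths_guessing.append([
        len(decode_try_float64_first(mz_blob)),
        len(decode_try_float64_first(int_blob)),
    ])
    lengths_explicit.append([
        len(decode_explicit(mz_blob, np.float64)),
        len(decode_explicit(int_blob, np.float32)),
    ])

print(lengths_guessing)  # [[775, 775], [1320, 660]]
print(lengths_explicit)  # [[775, 775], [1320, 1320]]
```

If the assumption holds for the real file, passing `np.float32` explicitly for peakIntensity and `np.float64` for peakMZ should give equal lengths for every row. Those lengths could then be compared against the numPeaks column of RefSpectra as a sanity check.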
Thank you very much for your help!