Title: Improve performance of BlibBuild spectral library building from pepXML/mzXML
Assigned To: kaipot@u.washington.edu
Type: Todo
Area: BiblioSpec
Priority: 2
Milestone: 3.2
Building spectral libraries from pepXML/mzXML currently takes far longer than it should for DIA-Umpire results searched with the TPP (an important use case). The cause appears to be the way we retrieve spectra from the mzXML file: we enumerate the matched spectra and fetch each one from the file with random-access spectrum reading. It should be much faster to read all spectra in the file sequentially, loading the ones for which we have a match and skipping the rest. Sequential reading is always much faster on spinning media and is often faster even on SSD, because random access rarely allows buffered reading and instead must seek to and load each spectrum individually.
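To illustrate the proposed access pattern, here is a minimal Python sketch of a single sequential pass that keeps only matched scans. It is not BlibBuild code (BiblioSpec is C++), and the function name `read_matched_scans` and the simplified sample document are hypothetical; the sketch assumes only that each spectrum is a `scan` element with a `num` attribute, as in mzXML.

```python
import io
import xml.etree.ElementTree as ET


def read_matched_scans(source, matched_scan_numbers):
    """Stream the file once, keeping only scans whose number has a match.

    Hypothetical helper for illustration; returns {scan number: attributes}.
    """
    wanted = set(matched_scan_numbers)
    found = {}
    for _event, elem in ET.iterparse(source, events=("end",)):
        # Drop any XML namespace so the tag compares as plain "scan".
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag != "scan":
            continue
        num = int(elem.get("num"))
        if num in wanted:
            # Copy the attributes before the element is cleared below.
            found[num] = dict(elem.attrib)
        # Release memory for this scan whether or not it was kept.
        elem.clear()
        if len(found) == len(wanted):
            break  # every matched scan has been seen; stop reading
    return found


if __name__ == "__main__":
    # Simplified stand-in for an mzXML file (assumption: real files carry
    # a namespace, peak data, and an index, all omitted here).
    sample = (
        b'<mzXML><msRun>'
        b'<scan num="1" msLevel="1" peaksCount="3">'
        b'<scan num="2" msLevel="2" peaksCount="5"/>'
        b'</scan>'
        b'<scan num="3" msLevel="2" peaksCount="7"/>'
        b'</msRun></mzXML>'
    )
    print(read_matched_scans(io.BytesIO(sample), {2, 3}))
```

The matched scan numbers go into a set so each membership check is constant time, and the file is read front to back exactly once, which lets the underlying reader use buffered I/O instead of seeking per spectrum.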