Title | | » | Investigate more efficient sorting of three-array IMS scans - possibly the locally sorted "sawtooth" nature of the mz could be exploited |
Assigned To | | » | Brian Pratt |
Type | | » | Todo |
Area | | » | Skyline |
Priority | | » | 3 |
The {mz, intensity, IM} sorting step shows up pretty hot in profiling. The fact that the mzs are locally sorted seems like useful knowledge that the standard sort algorithms aren't likely to exploit, and it occurs to me that an initial pass that runs through the scan looking for the sawtooth might be a performance boost
A toy example: for mz values from the concatenated mz runs
100, 200, 300, 399
101, 199, 305, 401
99, 105,
98, 300
do an initial ordering of indexes based on walking the mz array watching for the end of a rising mz run (a "sawtooth") and splitting each sawtooth at midrange with the lower half indexes copied to one array, and the others copied to a different array. Then the two arrays that merge at the end, and we sort that.
That gives an effective starting order
100, 200, 101, 199, 99, 98, 300, 399, 305, 401, 105, 300
which would involve less moving around in the sort
Possibly doing this with more than two segments would be fruitful - maybe examine the first 10% of the mz values to determine mz range and then split on quartiles, for example
create 4 integer arrays same size as mz length
create 4 double arrays same size as mz length
find mz range in mz buffer
for each mz
copy value to appropriate quartile double array
copy index to appropriate quartile integer array
combine quartile arrays into single index array and mz array
Array.Sort(mz array, index array)
rearrange {mz, intensity, IM) based on sorted index array