Issue 950: Investigate more efficient sorting of three-array IMS scans - possibly the locally sorted "sawtooth" nature of the mz could be exploited

issues
Status:closed
Assigned To:Guest
Type:Todo
Area:Skyline
Priority:3
Milestone:4.3
Opened:2023-04-03 10:36 by Brian Pratt
Changed:2023-04-03 13:34 by Brian Pratt
Resolved:2023-04-03 13:34 by Brian Pratt
Resolution:Won't Fix
Closed:2023-04-03 13:34 by Brian Pratt
2023-04-03 10:36 Brian Pratt
Title»Investigate more efficient sorting of three-array IMS scans - possibly the locally sorted "sawtooth" nature of the mz could be exploited
Assigned To»Brian Pratt
Type»Todo
Area»Skyline
Priority»3
The {mz, intensity, IM} sorting step shows up pretty hot in profiling. The fact that the mzs are locally sorted seems like useful knowledge that the standard sort algorithms aren't likely to exploit, and it occurs to me that an initial pass that runs through the scan looking for the sawtooth might be a performance boost

A toy example: for mz values from the concatenated mz runs
100, 200, 300, 399
101, 199, 305, 401
 99, 105,
 98, 300

do an initial ordering of indexes based on walking the mz array watching for the end of a rising mz run (a "sawtooth") and splitting each sawtooth at midrange with the lower half indexes copied to one array, and the others copied to a different array. Then the two arrays that merge at the end, and we sort that.

That gives an effective starting order
100, 200, 101, 199, 99, 98, 300, 399, 305, 401, 105, 300
which would involve less moving around in the sort

Possibly doing this with more than two segments would be fruitful - maybe examine the first 10% of the mz values to determine mz range and then split on quartiles, for example
create 4 integer arrays same size as mz length
create 4 double arrays same size as mz length
find mz range in mz buffer
for each mz
   copy value to appropriate quartile double array
   copy index to appropriate quartile integer array
combine quartile arrays into single index array and mz array
Array.Sort(mz array, index array)
rearrange {mz, intensity, IM) based on sorted index array

2023-04-03 13:34 Brian Pratt
resolve as Won't Fix
Statusopen»resolved
I tinkered with this a bit, the native sort seems to be as efficient as possible.

2023-04-03 13:34 Brian Pratt
close
Statusresolved»closed
Assigned ToBrian Pratt»Guest