Joachim Rosskopf, Korbinian Paul-Yuan, Martin B. Plenio, and Jens Michaelis
Analyzing the physical and chemical properties of single DNA-based molecular machines such as polymerases and helicases requires tracking stepping motion on the length scale of base pairs. Although high-resolution instruments capable of reaching that limit have been developed, individual steps are often hidden by experimental noise, which complicates data processing. Here we present an effective two-step algorithm that detects steps in a high-bandwidth signal by minimizing an energy-based model (energy-based step finder, EBS). First, an efficient convex denoising scheme is applied, which allows the signal to be compressed to tuples of amplitudes and plateau lengths. Second, a combinatorial clustering algorithm formulated on a graph assigns steps to the tuple data while accounting for prior information. The performance of the algorithm was tested on Poissonian stepping data simulated from published kinetic data of RNA polymerase II (pol II). Comparison with existing step-finding methods shows that EBS is superior in speed while providing competitive step-detection results, especially in challenging situations. Moreover, we demonstrate that EBS can detect backtracked intervals in experimental pol II data as well as the stepping behavior of the Phi29 DNA packaging motor.
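To make the compression idea concrete, the following minimal Python sketch generates a noise-free Poissonian staircase and reduces a piecewise-constant signal to tuples of amplitude and plateau length. It is an illustration under stated assumptions, not the EBS implementation: the convex denoising and graph-based clustering steps are omitted, and the function names, the tolerance `tol`, and all parameter values are hypothetical.

```python
import random


def simulate_poisson_steps(n_samples, rate, step_size=1.0, dt=1.0, seed=0):
    """Noise-free staircase whose steps arrive as a Poisson process.

    Dwell times between steps are drawn from an exponential distribution
    with the given rate (an assumed, illustrative kinetic model).
    """
    rng = random.Random(seed)
    signal = []
    level = 0.0
    next_step = rng.expovariate(rate)  # time of the first step
    for i in range(n_samples):
        t = i * dt
        # Apply every step that has occurred up to time t.
        while t >= next_step:
            level += step_size
            next_step += rng.expovariate(rate)
        signal.append(level)
    return signal


def compress_plateaus(signal, tol=1e-9):
    """Compress a piecewise-constant signal to (amplitude, length) tuples.

    Consecutive samples within tol of the current plateau amplitude are
    merged into that plateau; otherwise a new plateau starts.
    """
    tuples = []
    for x in signal:
        if tuples and abs(x - tuples[-1][0]) <= tol:
            amplitude, length = tuples[-1]
            tuples[-1] = (amplitude, length + 1)
        else:
            tuples.append((x, 1))
    return tuples


def expand_plateaus(tuples):
    """Inverse of compress_plateaus: rebuild the sampled signal."""
    return [amplitude for amplitude, length in tuples for _ in range(length)]


if __name__ == "__main__":
    staircase = simulate_poisson_steps(n_samples=1000, rate=0.01)
    plateaus = compress_plateaus(staircase)
    print(len(staircase), "samples compressed to", len(plateaus), "plateaus")
```

For a denoised signal with few plateaus, the tuple representation is much shorter than the sampled trace, so a subsequent clustering stage would operate on far fewer elements than the raw data.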