Skip to content

SODA 2023

March 1, 2023


Traces of strings, plus ways of tracing accepted papers

Anindya De was at Northwestern University and is now at the University of Pennsylvania—see here.

He was advised by two of the top advisors ever there were: Luca Trevisan and Umesh Vazirani.

Traces

I recently ran across a great paper by Anindya titled Approximate Trace Reconstruction from a Single Trace. It is co-authored with Xi Chen (Columbia University), Chin Ho Lee (Harvard University), and Rocco Servedio and Sandip Sinha (Columbia University). Notice that we did not put an Oxford comma between Servedio and Sinha as they are both from Columbia. The paper appeared at SODA 2023 this January.

Here are pointers to the almost 200 papers that were in the conference. I put this together before discovering the site conference-publishing.com, which as mentioned in my STOC 2023 post generates paper lists with links for a host of conferences. So I did all the following links myself. Do scroll past the list to the bottom to read a little more about traces which Ken and I put together.

  1. Small Shadows of Lattice Polytopes

  2. Fair allocation of a multiset of indivisible items

  3. Hierarchies of Minion Tests for PCSPs through Tensors

  4. Approximate Graph Colouring and Crystals

  5. The Price of Stability for First Price Auction

  6. Spencer’s theorem in nearly input-sparsity time

  7. Spatial Mixing and the random-cluster dynamics on lattices

  8. Nonlinear codes exceeding the Gilbert-Varshamov and Tsfasman-Vladut-Zink bounds

  9. A Near-Linear Time Sampler for the Ising Model with External Field

  10. Concentration of polynomial random matrices via Efron-Stein inequalities

  11. Quantum Speed-ups for String Synchronizing Sets, Longest Common Substring, and kmismatch Matching

  12. Halving by a Thousand Cuts or Punctures

  13. On the number incidences when avoiding an induced biclique in geometric settings

  14. Subexponential mixing for partition chains on grid-like graphs

  15. Weisfeilera Leman and Graph Spectra

  16. Stronger 3SUM-Indexing Lower Bounds

  17. Tight Bounds for Monotone Minimal Perfect Hashing

  18. Almost Consistent Systems of Linear Equations

  19. Testing and Learning Quantum Juntas Nearly Optimally

  20. Testing Convex Truncation

  21. Player-optimal Stable Regret for Bandit Learning in Matching Markets

  22. Competitive Information Design for Pandoras Box

  23. Breaking the O(n)-Barrier in the Construction of Compressed Suffix Arrays and Suffix Trees

  24. Short Synchronizing Words for Random Automata

  25. Packing cycles in planar and bounded-genus graphs

  26. Improved Distributed Algorithms for the Lovasz Local Lemma and Edge Coloring

  27. Optimal Fully Dynamic {k}-Center Clustering for Adaptive and Oblivious Adversaries

  28. A logic-based algorithmic meta-theorem for mim-width

  29. Tiny Pointers

  30. Streaming complexity of CSPs with randomly ordered constraints

  31. Computing Square Colorings on Bounded-Treewidth and Planar Graphs

  32. Near-Linear Time Approximations for Cut Problems via Fair Cuts

  33. Stronger Privacy Amplification by Shuffling for Rényi and Approximate Differential Privacy

  34. Minimizing Completion Times for Stochastic Jobs via Batched Free Times

  35. Optimal Pricing Schemes for an Impatient Buyer

  36. Online Prediction in Sub-linear Space

  37. Fast Discrepancy Minimization with Hereditary Guarantees

  38. Exact Flow Sparsification Requires Unbounded Size

  39. Curve Simplification and Clustering under Frechet Distance

  40. Almost Tight Bounds for Online Facility Location in the Random-Order Model

  41. The complete classification for quantified equality constraints

  42. Small subgraphs with large average degree

  43. Mean estimation when you have the source code; or, quantum Monte Carlo methods

  44. Online Min-Max Paging

  45. Gap-ETH-Tight Approximation Schemes for Red-Green-Blue Separation and BicoloredEuclidean Travelling Salesman Tours

  46. Map matching queries on realistic input graphs under the Frachet distance

  47. Private Query Release via the Johnson Lindenstrauss Transform

  48. Efficient decoding up to a constant fraction of the code length for asymptotically goodquantum codes

  49. Passing the Limits of Pure Local Search for weighted k-Set Packing

  50. Improved Bounds for Sampling Solutions of Random CNF Formulas

  51. Pricing Query Complexity of Revenue Maximization

  52. Flow-augmentation III: complexity dichotomy for Boolean CSPs parameterized by thenumber of unsatisfied constraints

  53. Parameter tractability of Directed Multicut with three terminal pairs parameterizedby the size of the cutset: twin-width meets flow-augmentation

  54. Approximate Trace Reconstruction from a Single Trace

  55. Dynamic Algorithms for Packing-Covering LPs via Multiplicative Weight Updates

  56. Sharp threshold sequence and universality for Ising perceptron models

  57. Maintaining Expander Decompositions via Sparse Cuts

  58. Near Optimal Analysis of the Cube versus Cube Test

  59. Approximating Knapsack and Partition via Dense Subset Sums

  60. Online and Bandit Algorithms for Norms with Gradient-Stable Approximations

  61. On complex roots of the independence polynomial

  62. Simplex Range Searching Revisited: How to Shave Logs in Multi-Level Data Structures

  63. Improved Pattern-Avoidance Bounds for Greedy BSTs via Matrix Decomposition

  64. Moser-Tardos Algorithm: Beyond Shearer’s Bound

  65. Deterministic counting Lovasz local lemma beyond linear programming

  66. Conflict-free hypergraph matchings

  67. A Subquadratic {n^\epsilon}-approximation for the Continuous Frachet Distance

  68. Interdependent Public Projects

  69. A Nearly Time-Optimal Distributed Approximation of Minimum Cost {k}-Edge-Connected Spanning Subgraph

  70. A tight quasi-polynomial bound for Global Label Min-Cut

  71. Polynomial formulations as a barrier for reduction-based hardness proofs

  72. Faster Algorithm for Turn-based Stochastic Games with Bounded Treewidth

  73. Instability of backoff protocols with arbitrary arrival rates

  74. On the orbit closure intersection problems for matrix tuples under conjugation and leftright actions

  75. Constant Approximating Parameterized k-SetCover is W[2]-hard

  76. Online Lewis Weight Sampling

  77. Toeplitz Low-Rank Approximation with Sublinear Query Complexity

  78. Kernelization for Graph Packing Problems via Rainbow Matching

  79. Improved Integrality Gap in Max-Min Allocation: or Topology at the North Pole

  80. Generalized Unrelated Machine Scheduling Problem

  81. Tight Complexity Bounds for Counting Generalized Dominating Sets in Bounded-Treewidth Graphs

  82. An Improved Approximation for Maximum Weighted k-Set Packing

  83. Excluding Single-Crossing Matching Minors in Bipartite Graphs

  84. Byzantine Agreement with Optimal Resilience via Statistical Fraud Detection

  85. Finding Triangles and Other Small Subgraphs in Geometric Intersection Graphs

  86. Simple, deterministic, fast (but weak) approximations to edit distance and Dyck edit distance

  87. From algorithms to connectivity and back: finding a giant component in random k-SAT

  88. Sparse graphs with bounded induced cycle packing number have logarithmic treewidth

  89. Shrunk subspaces via operator Sinkhorn iteration

  90. Improved Approximation Algorithms for Unrelated Machines

  91. Model-Checking for First-Order Logic with Disjoint Paths Predicates in Proper MinorClosed Graph Classes

  92. A Distanced Matching Game, Decremental APSP in Expanders, and Faster DeterministicAlgorithms for Graph Cut Problems

  93. Super-resolution and Robust Sparse Continuous Fourier Transform in Any Constant Dimension: Nearly Linear Time and Sample Complexity

  94. A Polynomial Time Algorithm for Finding a Minimum 4-Partition of a Submodular Function

  95. Positivity of the symmetric group characters is PH-hard

  96. Parallel Exact Shortest Paths in Almost Linear Work and Square Root Depth

  97. On the Integrality Gap of MFN Relaxation for the Capacitated Facility Location Problem

  98. Weak Bisimulation Finiteness of Pushdown Systems With Deterministic Transitions-ExpTime-Complete

  99. Beating Greedy Matching in Sublinear Time

  100. Integrality Gaps for Random Integer Programs via Discrepancy

  101. Improved Approximation for Two-Edge-Connectivity

  102. Zigzagging through acyclic orientations of chordal graphs and hypergraphs

  103. Shortest Cycles With Monotone Submodular Costs

  104. Traversing the FFT Computation Tree for Dimension-Independent Sparse FourierTransforms

  105. Fixed-Parameter Tractability of Maximum Colored Path and Beyond

  106. A half-integral Erdős-Pósa theorem for directed odd cycles

  107. Improved Bi-point Rounding Algorithms and a Golden Barrier for {k}-Median

  108. Approximate Distance Oracles for Planar Graphs with Subpolynomial Error Dependency

  109. A Framework for Approximation Schemes on Disk Graphs

  110. Cubic Goldreich-Levin

  111. Closing the Gap Between Directed Hopsets and Shortcut Sets

  112. “Who is Next in Line?” On the Significance of Knowing the Arrival Order in Bayesian Online Settings

  113. Smaller Low-Depth Circuits for Kronecker Powers

  114. Fast algorithms for solving the Hamilton Cycle problem with high probability

  115. On Minimizing Tardy Processing Time, Max-Min Skewed Convolution, and TrianglarStructured ILPs

  116. Maximal k-Edge-Connected Subgraphs in Weighted Graphs via Local Random Contraction

  117. A simple and sharper proof of the hypergraph Moore bound

  118. A Sublinear-Time Quantum Algorithm for Approximating Partition Functions

  119. On Problems Related to Unbounded SubsetSum: A Unified Combinatorial Approach

  120. Query Complexity of the Metric Steiner Tree Problem

  121. Sublinear-Time Algorithms for Max Cut, Max E2Lin{(q)}, and Unique Label Cover on Expanders

  122. Near-Linear Sample Complexity for {L_p} Polynomial Regression

  123. On (Random-order) Online Contention Resolution Schemes for the Matching Polytope of (Bipartite) Graphs

  124. Optimal Algorithms for Linear Algebra in the Current Matrix Multiplication Time

  125. Algebraic Algorithms for Fractional Linear Matroid Parity via Non-commutative Rank

  126. Almost Tight Error Bounds on Differentially Private Continual Counting

  127. Secretary Problems: The Power of a Single Sample

  128. Robust Voting Rules from Algorithmic Robust Statistics

  129. Faster and Unified Algorithms for Diameter Reducing Shortcuts and Minimum Chain Cover

  130. Improved girth approximation in weighted undirected graphs

  131. Online Sorting and Translational Packing of Convex Polygons

  132. Fully Dynamic Exact Edge Connectivity in Sublinear Time

  133. Algorithmizing the Multiplicity Schwartz-Zippel Lemma

  134. A Nearly Tight Analysis of Greedy k-means++

  135. A polynomial-time algorithm for 1/2-well-supported Nash equilibria in bimatrix games

  136. Bidder subset selection problem in auction design

  137. Massively Parallel Computation on Embedded Planar Graphs

  138. Balanced Allocations: The Power of Memory with Heterogeneous Bins

  139. Simple Mechanisms for Agents with Non-linear Utilities

  140. The Power of Clairvoyance for Multi-Level Aggregation and Set Cover with Delay

  141. Steiner Connectivity Augmentation and Splitting-off in Poly-logarithmic Maximum Flows

  142. Discrepancy minimization via regularization

  143. Equivalence Test for Read-Once Arithmetic Formulas

  144. Time-Space Tradeoffs for Element Distinctness and Set Intersection via Pseudorandomness

  145. Local Distributed Rounding: Generalized to MIS, Matching, Set Cover, and Beyond

  146. Parameterized Approximation Scheme for Biclique-free Max k-Weight SAT and Max Coverage

  147. Improved Distributed Network Decomposition, Hitting Sets, and Spanners, via Derandomization

  148. Breaching the 2 LMP Approximation Barrier for Facility Location with Applications to kMedian

  149. Query Complexity of Inversion Minimization on Trees

  150. Faster Deterministic Worst-Case Fully Dynamic All-Pairs Shortest Paths via Decremental Hop-Restricted Shortest Paths

  151. Optimal Square Detection Over General Alphabets

  152. Differentially Private All-Pairs Shortest Path Distances: Improved Algorithms and Lower Bounds

  153. Faster Computation of 3-Edge-Connected Components in Digraphs

  154. Sampling Equilibria: Fast No-Regret Learning in Structured Games

  155. Higher degree sum-of-squares relaxations robust against oblivious outliers

  156. Fast Distributed Brooks Theorem

  157. Graph Classes With Few Minimal Separators. I. Finite Forbidden Subgraphs

  158. Optimal Deterministic Massively Parallel Connectivity on Forests

  159. Foundations of Transaction Fee Mechanism Design

  160. Graph Classes With Few Minimal Separators. II. A Dichotomy

  161. Distributed Maximal Matching and Maximal Independent Set on Hypergraphs

  162. Quantum tomography using state-preparation unitaries

  163. Almost-Linear Planted Cliques Elude the Metropolis Process

  164. Timeliness Through Telephones

  165. Efficient resilient functions

  166. Dynamic Matching with Better-than-2 Approximation in Polylogarithmic Update Time

  167. The Need for Seed (in the abstract Tile Assembly Model)

  168. Private Convex Optimization in General Norms

  169. Dynamic Algorithms for Maximum Matching Size

  170. Unique Games hardness of Quantum Max-Cut, and a conjectured vector-valued Borell’s inequality

  171. A Nearly-tight Analysis of Multipass Pairing Heaps

  172. A Tight Analysis of Slim Heaps and Smooth Heaps

  173. Lossless Online Rounding for Online Bipartite Matching (Despite its Impossibility)

  174. Approximation Algorithms for Steiner Tree Augmentation Problems

  175. Low Degree Testing over the Reals

  176. Streaming algorithms for the missing item finding problem

  177. Economical Convex Coverings and Applications

  178. Interactive Coding with Small Memory

  179. Non-Stochastic CDF Estimation Using Threshold Queries

  180. Elliptic Curve Fast Fourier Transform Part I : Low-degree extension in time O(n log n) over all finite fields

  181. Single-Pass Streaming Algorithms for Correlation Clustering

  182. A New Approach to Estimating Effective Resistances and Counting Spanning Trees in Expander Graphs

  183. Superpolynomial Lower Bounds for Decision Tree Learning and Testing

  184. The {\ell_p}-Subspace Sketch Problem in Small Dimensions with Applications to Support Vector Machines

  185. Beating {(1-1/e)}-Approximation for Weighted Stochastic Matching

  186. Towards Multi-Pass Streaming Lower Bounds for Optimal Approximation of Max-Cut

  187. Parameterized Algorithm for the Planar Disjoint Paths Problem: Exponential in {k^2}, and Linear in {n}

  188. Learning Hierarchical Cluster Structure of Graphs in Sublinear Time

  189. The Exact Bipartite Matching Polytope Has Exponential Extension Complexity

  190. 4D Range Reporting in the Pointer Machine Model in Almost-Optimal Time

The Trace Result

The trace problem begins by sending a binary string {x} of length {n} through a deletion channel with parameter {\delta \in [0,1]}. Each bit {x_i} entering the channel survives with probability {1 - \delta} to be part of the output string {y}. That is, {x_i} is deleted with probability {\delta}. The deletions are independent. For an unknown string {x}, the problem is:

Given {k} strings {y_1,\dots,y_k} produced by {k} runs of the channel on {x}, reconstruct {x} if possible. Else, calculate a binary string {x'} of length {n} that minimizes a distance metric {d(x,x')}. The metric of choice is to maximize the length {\ell(x,x')} of the longest common subsequence (not necessarily contiguous) of {x} and {x'}, which corresponds to minimizing their edit distance.

As indicated by its title “Approximate Trace Reconstruction from a Single Trace,” the paper tackles the extreme case {k=1}. Of course one cannot reconstruct {x} (unless no deletions occur so {y = x}) so the game is to find {x'} that are most likely to have produced the lone observed {y}. The scoring function takes the expectation of {\ell(x',x)} over both the generation of {y} from the true {x} and the run of the algorithm guessing {x'} from {y}. There are two main questions:

  • How well does the algorithm perform—relative to theoretically optimal choices given {y}—when {x} itself is generated uniformly at random?

  • How well does the algorithm perform when {x} is generated adversarially? Note that {y} is still probabilistic, and the performance of both the theoretical optimal algorithm and their algorithm are evaluated based on the distribution of {y} for the fixed (unseen) {x}.

These questions are posed for small, medium, and large values of {\delta}. When the deletion probability is close to {1}, the strings {y} are most often tiny. One would think they offer no help in coming close to {x}. However, they do help efficient algorithms come close to the optimal policy for a worst-case chosen {x}. The paradoxical results of their paper, in their own words (but reversing their order), are:

  1. In the average-case setting, having access to a single trace is provably not very useful: no algorithm, computationally efficient or otherwise, can achieve significantly higher accuracy given one trace that is {o(n)} bits long than it could with no traces.

  2. Having access to a single trace is already quite useful for worst-case trace reconstruction: an efficient algorithm can perform much more accurate reconstruction, given one trace that is even only a few bits long, than it could given no traces at all.

The deep point is that when {x} as well as {y} is random, seeing {y} gives little advantage to both the optimal strategy (which does not know {x}) and their algorithm. Whereas, when {x} is fixed, the knowledge of {y} is more valuable to the optimal strategy and separates it from the case of not seeing {y} at all. However, the profit given by even a short {y} is one that is apprehendable by a complexity-limited deterministic algorithm that sees only {y}. That’s our attempt at an intuitive takeaway; as always we invite readers to consult the paper in detail.

Open Problems

Comparing my list of pointer to the papers from SODA, which was a bit of trouble to create by hand, to the STOC’23 output from the conference-publishing site, leads to a curious question:

Do we scan lists of papers more by looking for subject words in their titles or looking for authors we know?

Well, I have not found SODA’23 on that website, where authors too would be given; for me, copying the authors would more than double the manual work.

No comments yet

Leave a Reply

Discover more from Gödel's Lost Letter and P=NP

Subscribe now to keep reading and get access to the full archive.

Continue reading