# The MaD Seminar

The MaD seminar features leading specialists at the interface of Applied Mathematics, Statistics and Machine Learning.

**Room:** Auditorium Hall 150, Center for Data Science, NYU, 60 5th Ave.

**Time:** 2:00pm-3:00pm

**Subscribe to the Seminar Mailing list here**

### Schedule with Confirmed Speakers

### Abstracts

#### Jin-Peng Liu: Quantum for Science: Efficient Quantum Algorithms for Linear and Nonlinear Dynamics

Fault-tolerant quantum computers are expected to excel in simulating unitary dynamics, such as the dynamics of a quantum state under a Hamiltonian. However, most applications in scientific and engineering computations involve non-unitary and/or nonlinear dynamics. Efficient quantum algorithms for such dynamics are therefore key to unlocking the full potential of quantum computers and achieving comparable speedups in these general tasks.

First, we propose a simple method for simulating a general class of non-unitary dynamics as a linear combination of Hamiltonian simulation (LCHS) problems. The LCHS method can achieve optimal cost in terms of state preparation [1]. Second, we give the first efficient (polynomial time) quantum algorithm for nonlinear differential equations with sufficiently strong dissipation. This is an exponential improvement over the best previous quantum algorithms, whose complexity is exponential in the evolution time [2]. Our work shows that fault-tolerant quantum computing can potentially address complex non-unitary and nonlinear phenomena in natural and data sciences with provable efficiency [3].

References:

[1] Linear combination of Hamiltonian simulation for non-unitary dynamics with optimal state preparation cost. Physical Review Letters, 131(15):150603 (2023).

[2] Efficient quantum algorithm for dissipative nonlinear differential equations. Proceedings of the National Academy of Sciences 118, 35 (2021).

[3] Towards provably efficient quantum algorithms for large-scale machine learning models. Nature Communications 15, 434 (2024).
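The LCHS idea mentioned in the abstract can be illustrated numerically in the simplest possible setting. The sketch below assumes the time-independent form of the identity as we understand it from [1]: writing the generator as A = L + iH with L positive semidefinite, exp(-At) equals the integral over real k of exp(-i(kL + H)t) / (π(1 + k²)). Here L and H are taken to be scalars (`l`, `h`), and the integral is approximated by a truncated trapezoid rule. The function name `lchs_scalar`, the cutoff `k_max`, and the step count are illustrative choices and are not taken from the talk.

```python
import cmath
import math


def lchs_scalar(l, h, t, k_max=200.0, n_steps=200_000):
    """Approximate exp(-(l + 1j*h)*t) as a weighted sum of unit-modulus phases.

    Assumes l >= 0 (the dissipative part) and real h (the Hamiltonian part).
    The integral over k is truncated to [-k_max, k_max] and discretized
    with the trapezoid rule.
    """
    dk = 2.0 * k_max / n_steps
    total = 0.0 + 0.0j
    for j in range(n_steps + 1):
        k = -k_max + j * dk
        # Cauchy-type weight 1 / (pi * (1 + k^2)).
        weight = 1.0 / (math.pi * (1.0 + k * k))
        # "Hamiltonian simulation" term: a pure phase for real k*l + h.
        phase = cmath.exp(-1j * (k * l + h) * t)
        # Trapezoid rule: endpoints get half weight.
        endpoint = 0.5 if j in (0, n_steps) else 1.0
        total += endpoint * weight * phase
    return total * dk


l, h, t = 1.0, 2.0, 1.0
approx = lchs_scalar(l, h, t)
exact = cmath.exp(-(l + 1j * h) * t)
print("LCHS approximation:", approx)
print("direct evaluation: ", exact)
print("absolute error:    ", abs(approx - exact))
```

Every term in the sum has modulus one, yet the weighted combination reproduces the decaying factor exp(-lt). This is a classical numerical check of the scalar identity only; it says nothing about the quantum cost results claimed in the abstract.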

#### Yaqi Duan: Taming “data-hungry” reinforcement learning? Stability in continuous state-action spaces

We introduce a novel framework for analyzing reinforcement learning (RL) in continuous state-action spaces, and use it to prove fast rates of convergence in both off-line and on-line settings. Our analysis highlights two key stability properties, relating to how changes in value functions and/or policies affect the Bellman operator and occupation measures. We argue that these properties are satisfied in many continuous state-action Markov decision processes, and demonstrate how they arise naturally when using linear function approximation methods. Our analysis offers fresh perspectives on the roles of pessimism and optimism in off-line and on-line RL, and highlights the connection between off-line RL and transfer learning.

#### Cun-Hui Zhang: Adaptive Inference in Sequential Experiments

Sequential data collection has emerged as a widely adopted technique for enhancing the efficiency of data gathering processes. Despite their advantages, such data collection mechanisms often complicate statistical inference. For instance, the ordinary least squares estimator in an adaptive linear regression model can exhibit non-normal asymptotic behavior, posing challenges for accurate inference and interpretation. We propose a general method for constructing a debiased estimator that remedies this issue. The idea is to make use of adaptive linear estimating equations. We establish theoretical guarantees of asymptotic normality, supplemented by discussions on achieving near-optimal asymptotic variance. This talk is based on joint work with Mufang Ying and Koulik Khamaru.

#### Zach Izzo: Data-driven Subgroup Identification

Medical studies frequently require extracting the relationship between each covariate and the outcome, together with statistical confidence measures. To do this, simple parametric models (e.g. linear regression coefficients) are frequently used, but they are usually fitted on the whole dataset. However, the covariates often do not have a uniform effect over the whole population, so a single simple model can miss the heterogeneous signal. For example, a linear model may be able to explain a subset of the data but fail on the rest due to the nonlinearity and heterogeneity in the data. In this talk, I will discuss DDGroup (data-driven subgroup discovery), a data-driven method to effectively identify subgroups in the data with a uniform linear relationship between the features and the label. DDGroup outputs an interpretable region in which the linear model is expected to hold. It is simple to implement and computationally tractable. It also comes with statistical guarantees: given a large enough sample, DDGroup recovers a region where a single linear model with low variance is well-specified (if one exists), and experiments on real-world medical datasets confirm that it can discover regions where a local linear model has improved performance. Our experiments also show that DDGroup can uncover subgroups with qualitatively different relationships which are missed by simply applying parametric approaches to the whole dataset. Time permitting, I will also discuss the challenges of extending DDGroup to other models.
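The motivating example in the abstract, where a linear model explains one subset of the data but fails on the rest, can be reproduced with a toy dataset. The sketch below is not DDGroup: the region `x < 1` is chosen by hand, which is exactly the step a subgroup-discovery method would have to perform from data. The synthetic data, the helper names `fit_line` and `mse`, and the noiseless setup are all illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(xs, ys, slope, intercept):
    """Mean squared error of the fitted line on the given points."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic data: exactly linear for x < 1, with a quadratic bend for x >= 1.
xs = [i / 100 for i in range(201)]
ys = [2 * x if x < 1 else 2 * x + 5 * (x - 1) ** 2 for x in xs]

# One model fitted on the whole dataset.
global_slope, global_intercept = fit_line(xs, ys)
mse_global = mse(xs, ys, global_slope, global_intercept)

# The same model class fitted only on the hand-picked region x < 1.
sub = [(x, y) for x, y in zip(xs, ys) if x < 1]
sub_xs = [x for x, _ in sub]
sub_ys = [y for _, y in sub]
sub_slope, sub_intercept = fit_line(sub_xs, sub_ys)
mse_subgroup = mse(sub_xs, sub_ys, sub_slope, sub_intercept)

print("global fit:   slope", global_slope, "MSE", mse_global)
print("subgroup fit: slope", sub_slope, "MSE", mse_subgroup)
```

The global fit is pulled away from the true slope on the linear region by the nonlinear part of the data, while the fit restricted to the region recovers it almost exactly.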