# 2022 Program for Women and Mathematics: The Mathematics of Machine Learning

## Young Researcher Seminar

Alane Lima, Federal University of Parana

4:30 pm - 4:50 pm

Title: Statistical Learning Theory and Sampling Algorithms for Complex Networks

Abstract: When dealing with problems on large-scale graphs, using an exact algorithm may be inefficient in practice. A key idea in our work is to use a sampling-based approximation algorithm that takes into consideration the combinatorial structure of the problem. Such algorithms may achieve tighter sample size bounds than strategies that rely on classical bounds from probability theory. Sample complexity analysis, which uses tools at the core of statistical learning theory such as VC-dimension, pseudo-dimension, and Rademacher complexity, relates the minimum sample size required for a particular problem and instance to the desired quality and confidence parameters of the solution. When applied to the design of approximation algorithms for well-known graph problems (e.g., shortest paths and a variety of centrality measures), this approach leads to efficient solutions for problems of practical interest.
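To make the comparison of sample size bounds concrete, the following is a minimal sketch, not the speaker's algorithm. It compares a classical bound (Hoeffding's inequality with a union bound over `n_targets` quantities, each an average of values in [0, 1]) with the standard VC-dimension bound for an ε-approximation, m ≥ (c/ε²)(d + ln(1/δ)). The constant `c = 0.5` and the VC-dimension `vc_dim = 5` are illustrative assumptions, as are the function names.

```python
import math


def hoeffding_union_sample_size(eps, delta, n_targets):
    """Classical bound: Hoeffding's inequality plus a union bound.

    Number of samples so that n_targets empirical means of [0, 1]-valued
    variables are all within eps of their expectations with probability
    at least 1 - delta. Grows with log(n_targets).
    """
    return math.ceil(math.log(2 * n_targets / delta) / (2 * eps ** 2))


def vc_sample_size(eps, delta, vc_dim, c=0.5):
    """VC-dimension bound for an eps-approximation.

    c is a universal constant; 0.5 is an assumed illustrative value.
    The bound depends on the VC-dimension, not on the number of targets.
    """
    return math.ceil((c / eps ** 2) * (vc_dim + math.log(1 / delta)))


eps, delta = 0.05, 0.1
n_vertices = 1_000_000   # hypothetical graph size
vc_dim = 5               # hypothetical VC-dimension of the range space

m_classical = hoeffding_union_sample_size(eps, delta, n_vertices)
m_vc = vc_sample_size(eps, delta, vc_dim)
print(f"Hoeffding + union bound: {m_classical} samples")
print(f"VC-dimension bound:      {m_vc} samples")
```

Under these assumptions the VC-based sample size does not grow with the number of vertices, which is the sense in which exploiting the combinatorial structure of the problem can give tighter bounds.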

Longxiu Huang, University of California, Los Angeles

4:50 pm - 5:10 pm

Title: Matrix Completion with Cross-Concentrated Sampling

Abstract: In matrix completion, uniform sampling has been widely studied, and CUR sampling can be applied to approximate a low-rank matrix via row and column samples. Unfortunately, both sampling models lack flexibility for various circumstances in real-world applications. We recently proposed a novel and easy-to-implement sampling strategy, coined Cross-Concentrated Sampling (CCS). By bridging uniform sampling and CUR sampling, CCS provides extra flexibility that can potentially save sampling costs in applications. Moreover, we propose a highly efficient non-convex algorithm, termed Iterative CUR Completion (ICURC), for the proposed CCS model. In this talk, I will show the efficiency of our methods on both synthetic and real-world datasets.
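As background for the CUR sampling mentioned above, here is a minimal NumPy sketch of the plain CUR approximation X ≈ C U⁺ R, built from fully observed rows and columns of a low-rank matrix. It is neither CCS nor ICURC; the matrix sizes, the rank, the number of sampled rows and columns, and the pseudoinverse cutoff `rcond=1e-10` are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic rank-r matrix (sizes are illustrative assumptions).
m, n, r = 50, 40, 3
X = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

# Sample a few row and column indices uniformly without replacement.
rows = rng.choice(m, size=10, replace=False)
cols = rng.choice(n, size=10, replace=False)

C = X[:, cols]              # selected columns
R = X[rows, :]              # selected rows
U = X[np.ix_(rows, cols)]   # intersection of selected rows and columns

# CUR approximation. The explicit cutoff discards numerically zero
# singular values of U, which has rank r < 10.
X_cur = C @ np.linalg.pinv(U, rcond=1e-10) @ R

rel_err = np.linalg.norm(X - X_cur) / np.linalg.norm(X)
print(f"relative Frobenius error: {rel_err:.2e}")
```

When the intersection block `U` has the same rank as `X`, the reconstruction is exact up to floating-point error, which is why a modest number of rows and columns suffices for this synthetic low-rank example.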

Sui Tang, University of California, Santa Barbara

5:10 pm - 5:30 pm

Title: Data-driven discovery of interaction laws in multi-agent systems

Abstract: Multi-agent systems are ubiquitous in science, from the modeling of particles in physics, to predator-prey systems in biology, to opinion dynamics in economics and the social sciences, where the interaction law between agents yields a rich variety of collective dynamics. We consider the inference problem for a system of interacting particles or agents: given only observed trajectories of the agents in the system, can we learn what the laws of interaction are? We would like to do this without assuming any particular form for the interaction laws, i.e., they might be “any” function of pairwise distances.
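One common way to formalize this inference problem is sketched below. It is an assumption made for illustration and is not stated in the abstract: a first-order system of $N$ agents with positions $x_i(t)$, in which an unknown interaction kernel $\phi$ depends only on pairwise distances. Given $M$ observed trajectories sampled at times $t_1, \dots, t_L$, with velocities either observed or approximated by finite differences, a least-squares estimator over a chosen hypothesis space $\mathcal{H}$ (for example, piecewise polynomials) could take the following form.

```latex
% Assumed first-order model with an unknown kernel \phi of pairwise distance
\dot{x}_i(t) = \frac{1}{N} \sum_{j=1}^{N}
    \phi\bigl(\lVert x_j(t) - x_i(t) \rVert\bigr)\,\bigl(x_j(t) - x_i(t)\bigr),
\qquad i = 1, \dots, N.

% Least-squares estimator from M trajectories observed at times t_1, ..., t_L
\hat{\phi} = \operatorname*{arg\,min}_{\varphi \in \mathcal{H}}
    \frac{1}{M L N} \sum_{m=1}^{M} \sum_{l=1}^{L} \sum_{i=1}^{N}
    \Bigl\lVert \dot{x}_i^{(m)}(t_l)
      - \frac{1}{N} \sum_{j=1}^{N}
        \varphi\bigl(\lVert x_j^{(m)}(t_l) - x_i^{(m)}(t_l) \rVert\bigr)\,
        \bigl(x_j^{(m)}(t_l) - x_i^{(m)}(t_l)\bigr)
    \Bigr\rVert^{2}.
```

The symbols $M$, $L$, $\mathcal{H}$, and $\hat{\phi}$ are introduced here for illustration only.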