CS & Applied Math Papers
Overview
Latest CS & math papers, with an emphasis on ML/AI, plus some supplemental resources. Curated with a focus on:
- Building Foundational Knowledge: Resources to understand the mathematical and conceptual foundations for the latest research topics
- Understanding Frontier Research: Latest papers pushing the boundaries of each field
- Identifying Limitations & Directions: Critical analysis of current challenges and future research opportunities
Created by Nicole Hao, 2025
Table of Contents
- AI Foundations, Must Reads
- Graph Neural Networks (GNNs)
- AI4Health
- Agentic AI
- AI4S
- Physics - Physics-informed Neural Networks (PINNs)
- Embodied AI
- Artificial General Intelligence (AGI)
- Not Yet Categorized
- Liquid Neural Networks (LNNs)
- Neural Tangent Kernels (NTKs)
- O(3)-Equivariant Deep Networks
- Real-time Adaptation Laws for Neural Networks
- Model Compression Techniques Through Numerical Linear Algebra
AI Foundations, Must Reads
- ResNet (297,000+ citations, 2016): A cornerstone of deep learning; ‘solved’ the difficulty of training very deep networks
- Adam (200,000+ citations, 2014): The most widely used optimizer; nearly all large models are trained with it
- AlexNet (187,000+ citations, 2012): The starting point of the deep learning boom; kicked off GPU-based training
- Attention (Transformer) (173,000+ citations, 2017): The “bible” of large models; the ancestor of all LLMs like ChatGPT
- LSTM (140,000+ citations, 1997): A pioneer of sequence modeling; dominated NLP for 20 years
- BERT (120,000+ citations, 2018): Introduced the “pretraining + fine-tuning” paradigm for large models
- Deep Learning (Review) (106,000+ citations, 2015): A review by the three giants (LeCun, Hinton, Bengio); textbook-level status
- GAN (105,000+ citations, 2014): Generative adversarial networks; an early landmark for AIGC image generation
- VGG (99,000+ citations, 2014): Classic CNN architecture; defined the design of deep vision networks
- Faster R-CNN (99,000+ citations, 2015): A milestone in object detection; extremely widely used in industry
- LeNet-5 (82,000+ citations, 1998): The original CNN; foundational work for convolutional neural networks
- Batch Normalization (70,000+ citations, 2015): A normalization technique that makes large-model training ~10× faster
- U-Net (70,000+ citations, 2015): Architectural foundation for image segmentation and diffusion models
- t-SNE (63,000+ citations, 2008): Data visualization technique; must-read for dimensionality reduction
- Dropout (60,000+ citations, 2014): The simplest effective way to prevent neural network overfitting
Graph Neural Networks (GNNs)
GNNs are a class of neural networks designed to process data represented as graphs. Unlike traditional neural networks that work on Euclidean data like images or sequences, GNNs can directly operate on irregular data structures by modeling relationships between nodes and edges. They leverage a message-passing framework where nodes update their feature representations by aggregating information from their neighbors. This process allows them to capture complex relational information and learn a continuous representation, or embedding, of the graph’s structure.
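To make the message-passing idea concrete, here is a minimal sketch of a single GNN layer in plain NumPy. The specific choices (mean aggregation over neighbors, a ReLU update, the weight shapes) are simplifying assumptions of mine for illustration, not the design from any particular paper below.

```python
import numpy as np

def gnn_layer(H, A, W_self, W_neigh):
    """One message-passing step.

    H       : (num_nodes, d_in)        node features
    A       : (num_nodes, num_nodes)   adjacency matrix (1 = edge, 0 = no edge)
    W_self  : (d_in, d_out)            weights for the node's own features
    W_neigh : (d_in, d_out)            weights for the aggregated neighbor messages
    """
    deg = A.sum(axis=1, keepdims=True)              # number of neighbors per node
    neigh_mean = (A @ H) / np.maximum(deg, 1)       # aggregate: mean over neighbors
    return np.maximum(H @ W_self + neigh_mean @ W_neigh, 0.0)  # update + ReLU

# Toy graph: 3 nodes in a path 0-1-2, with 4-dimensional features
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = np.random.randn(3, 4)
W_self, W_neigh = np.random.randn(4, 8), np.random.randn(4, 8)
H_next = gnn_layer(H, A, W_self, W_neigh)   # (3, 8) updated node embeddings
```

Stacking several such layers lets information propagate over multi-hop neighborhoods, which is the basic mechanism the resources below build on.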
Foundational Knowledge
- Graph Neural Network - Wikipedia: Provides a great overview of the basic building blocks of GNNs, including permutation equivariant layers, local and global pooling, and a discussion on the expressive power of GNNs.
- Graph Neural Networks in TensorFlow: A resource from Google Research that explains how GNNs can be trained on large datasets using a stream of subgraphs. It also touches on both supervised and unsupervised training methods.
- Graph Neural Networks: A New Frontier: This blog post explains the core components of a GNN, such as node representation, message passing, and aggregation, and provides a useful comparison with traditional neural networks.
Frontier Research
- Graph Neural Networks: Foundation, Frontiers and Applications: This is a tutorial that covers a broad range of topics in GNNs, including fundamental concepts, new research frontiers, and emerging applications in fields like recommender systems and computer vision.
- Recent Research Progress of Graph Neural Networks in Computer Vision: A comprehensive review of GNN applications in computer vision, focusing on image processing, video analysis, and multimodal data fusion. It highlights how GNNs capture inter-region dependencies and spatiotemporal dynamics.
AI4Health
AI4Health (AI for Health) is an interdisciplinary field that applies machine learning and artificial intelligence to various aspects of healthcare. This includes tasks like medical image analysis, personalized treatment planning, drug discovery, and disease prediction. The field aims to improve the accuracy of diagnoses, enhance patient outcomes, and streamline clinical workflows.
Foundational Knowledge
- Artificial Intelligence In Health And Health Care: Priorities For Action: This paper provides a historical context for AI in healthcare and identifies key policy-related domains. It discusses the evolution from early symbolic representations to modern deep neural networks for tasks like digital imaging and diagnostic reasoning.
- “Impact of Artificial Intelligence on Healthcare: A Review of Current Applications and Future Possibilities”: A review that clarifies the role of machine learning and natural language processing in healthcare. It covers applications in image analysis, diagnosis, and treatment planning, and discusses future possibilities like personalized medicine.
Frontier Research
- Evolution of Artificial Intelligence in Healthcare: a 30-year Bibliometric Study: A longitudinal study of AI publications in healthcare over three decades. It highlights the sustained growth in the field and the rise of new topics like COVID-19 analysis and new drug discovery.
- Frontiers on Healthcare Research: This is a journal that focuses on bridging the gap between healthcare research and clinical applications. It’s a good source for the latest peer-reviewed findings and strategies for improving patient care.
Limitations & Directions
- Data Dependency: AI models are often trained on specific datasets from a single hospital or clinic, which can lead to poor performance when applied to data from different institutions. A key challenge is developing models that are more robust and can generalize to diverse populations.
- Bias and Ethical Concerns: AI models can be biased due to the data they’re trained on, which can exacerbate healthcare disparities. The “black box” nature of many deep learning models makes it difficult to understand their decision-making process, leading to a lack of trust from both patients and clinicians.
- Clinical Integration: Many AI technologies show promise in research but have not been evaluated in clinical settings. Successful implementation requires seamless integration into existing workflows and proper training for healthcare professionals.
- Human-in-the-loop Systems: Currently, AI in healthcare should assist, not replace, doctors. Future research focuses on creating explainable AI (XAI) models that are transparent and trustworthy, and on developing systems where the final decision remains with a human clinician.
Target Audience
- Graduate students in ML/AI
- Researchers
Agentic AI
Review/History:
WIP
AI4S
AI for Science. I used this GitHub repo for reference. Definitely check it out. AI4S papers have varying levels of specificity depending on the field of science they’re tied to, so I will try to categorize them into different subjects, e.g. AI4chemistry, AI4physics, and more.
Here are some overviews on AI4S as a field to get you started:
- SAIBench: A Structural Interpretation of AI for Science Through Benchmarks
- Bridging AI and Science: Implications from a Large-Scale Literature Analysis of AI4Science
Physics
Physics-informed Neural Networks (PINNs)
Physics-informed neural networks (PINNs) are a type of neural network that incorporates the laws of physics into their training process. They do this by embedding governing equations, like partial differential equations (PDEs), into the neural network’s loss function. This allows PINNs to find solutions to physical problems while requiring less training data than traditional data-driven models. They are particularly useful for solving forward and inverse problems in computational science.
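To see how the physics enters the loss, here is a minimal PyTorch sketch for the toy problem u'(x) = -u(x) with u(0) = 1 (exact solution e^{-x}). The network size, collocation points, and equal weighting of the residual and boundary terms are arbitrary choices of mine, just to illustrate the idea.

```python
import torch

# Small fully connected network approximating u(x)
net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

x = torch.linspace(0, 2, 100).reshape(-1, 1).requires_grad_(True)  # collocation points
x0 = torch.zeros(1, 1)                                             # boundary point

for step in range(2000):
    u = net(x)
    # du/dx via automatic differentiation
    du_dx = torch.autograd.grad(u, x, grad_outputs=torch.ones_like(u),
                                create_graph=True)[0]
    residual = du_dx + u                      # physics residual for u' = -u
    loss_pde = (residual ** 2).mean()         # enforce the governing equation
    loss_bc = (net(x0) - 1.0).pow(2).mean()   # enforce u(0) = 1
    loss = loss_pde + loss_bc
    opt.zero_grad()
    loss.backward()
    opt.step()

# After training, net(x) should approximate exp(-x) on [0, 2].
```

The same pattern (autodiff for derivatives, PDE residual plus boundary/initial terms in the loss) carries over to genuine PDEs and inverse problems, which is what the resources below cover.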
Foundational Knowledge, Start with These
- Physics-informed Neural Networks - Wikipedia: A great starting point that defines PINNs, explains their function approximation capabilities, and covers different types of problems they can solve, such as data-driven solution and discovery of PDEs.
- Physics-Informed Neural Networks for Inverse PDE Problems: This article provides an intuitive explanation of PINNs as a “cheat sheet” for a regular neural network, outlining how they use automatic differentiation and a physics-based loss function to achieve better results with less data.
- Adaptive Physics-informed Neural Networks: A review paper that discusses how advanced ML techniques, like transfer learning and meta-learning, can be integrated into PINNs to improve model adaptivity and address convergence challenges.
Frontier Research
- Physics-Informed Neural Networks: A Review of Methodological Evolution, Theoretical Foundations, and Interdisciplinary Frontiers Toward Next-Generation Scientific Computing: A comprehensive review paper that establishes a framework for understanding PINNs, covering methodological innovations, theoretical breakthroughs, and cross-disciplinary applications. It also proposes a roadmap for “PINN 2.0” including neuro-symbolic integration and quantum-accelerated optimization.
- Physics-Informed Neural Networks with Hard Constraints for Inverse Design: This paper introduces a method for solving topology optimization problems using PINNs with hard constraints, which is a significant advancement for engineering design.
Notes:
- Training PINNs can be computationally expensive, especially for complex or multi-physics problems.
- Optimizing PINNs can be difficult, and their convergence properties are not as well understood as traditional methods.
- PINNs can struggle to solve high-frequency or multiscale problems and to incorporate noisy, sparse real-world data effectively.
- A key area of future research is improving PINN performance by integrating them with other methods, such as domain decomposition techniques (e.g., XPINNs), and exploring novel optimization strategies. The “PINN 2.0” vision focuses on neuro-symbolic integration, federated physics learning, and quantum-accelerated optimization.
Embodied AI
Embodied AI is the integration of artificial intelligence into physical systems—such as robots, autonomous vehicles, and smart industrial machines—enabling them to perceive, reason, and act in the real world. Unlike traditional AI that operates purely in digital environments, embodied AI bridges the gap between computation and physical interaction through machine learning, computer vision, and sensor fusion.
Introduction/Overview of Embodied AI:
Theory of Mind Inference:
WIP
Artificial General Intelligence (AGI)
WIP
Not Yet Categorized
Liquid Neural Networks (LNNs)
Liquid Neural Networks (LNNs) are a type of artificial intelligence model that utilize a dynamic architecture inspired by biological neurons. They are designed to continuously learn and adapt in real-time, even after deployment, similar to how the human brain learns from experience.
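One influential formulation is the liquid time-constant (LTC) cell, where each neuron's effective time constant depends on the input and current state. Below is a rough single-layer sketch using a forward-Euler step; the sigmoid gate, sizes, and step size are simplifying assumptions of mine rather than the exact equations from the papers listed.

```python
import numpy as np

def ltc_step(h, x, W_in, W_rec, b, tau, A, dt=0.05):
    """One forward-Euler step of a liquid time-constant (LTC) style cell.

    h   : (d_hidden,) current hidden state
    x   : (d_in,)     input at this time step
    tau : (d_hidden,) base time constants
    A   : (d_hidden,) bias term the gate pulls the state toward
    """
    f = 1.0 / (1.0 + np.exp(-(W_in @ x + W_rec @ h + b)))  # input-dependent gate
    dh_dt = -h / tau + f * (A - h)   # state- and input-dependent ("liquid") dynamics
    return h + dt * dh_dt

# Toy rollout over a random input sequence
rng = np.random.default_rng(0)
d_in, d_hidden = 3, 8
W_in = rng.normal(size=(d_hidden, d_in))
W_rec = rng.normal(size=(d_hidden, d_hidden))
b, tau, A = rng.normal(size=d_hidden), np.ones(d_hidden), rng.normal(size=d_hidden)

h = np.zeros(d_hidden)
for t in range(50):
    h = ltc_step(h, rng.normal(size=d_in), W_in, W_rec, b, tau, A)
```

Because the dynamics themselves change with the input, the cell keeps adapting its effective time constants at inference time, which is the property the papers below exploit.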
Latest Papers:
- Comparative Analysis Between Liquid Neural Networks (LNNs) and Recurrent Neural Networks (RNNs)
- Liquid Neural Networks: Next-Generation AI for Telecom from First Principles
Neural Tangent Kernels (NTKs)
Start with the following:
- Lilian Weng’s Blog on Math Behind NTKs
- Understanding the Neural Tangent Kernel
- Neural Tangent Kernel, Applied Probability Notes
- Priors for Infinite Networks
If you’re interested in the functional analysis foundations of NTKs:
Now you should be ready for the NTK foundational papers:
- Neural Tangent Kernel: Convergence and Generalization in Neural Networks, NeurIPS 2018
- Deep Neural Networks as Gaussian Processes, ICLR 2018
- On Lazy Training in Differentiable Programming
O(3)-Equivariant Deep Networks
O(3) is the group of all rotations and reflections in 3D space. It’s a mathematical concept that describes how objects can be moved in 3D without changing their fundamental properties (like distances between points).
An equivariant neural network is one whose output transforms in a specific, predictable way when the input is transformed by a group operation (like a rotation or reflection). In simpler terms, if you rotate the input, the output is rotated in the corresponding manner.
Why are O(3)-equivariant networks important? They can learn generalizable features from 3D data even when the training data only contains examples in a limited number of orientations, which reduces the amount of training data needed. For tasks involving 3D objects, like predicting molecular properties or simulating physical systems, O(3)-equivariance also ensures that the model’s predictions are consistent with the laws of physics and geometry, and such models generalize better to unseen orientations of the input.
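Here is a tiny numerical sanity check of what equivariance means. The toy function (a weighted sum of relative position vectors of a point cloud) is my own illustrative example, not an architecture from the papers below; it is equivariant simply because it is built only from relative vectors.

```python
import numpy as np

def toy_equivariant_readout(points, weights):
    """Vector-valued function of a 3D point cloud.

    Built only from relative position vectors, so applying any rotation or
    reflection Q to the input points rotates/reflects the output the same way.
    """
    center = points.mean(axis=0)
    return (weights[:, None] * (points - center)).sum(axis=0)  # shape (3,)

rng = np.random.default_rng(0)
points = rng.normal(size=(5, 3))
weights = rng.normal(size=5)

# Random element of O(3): orthogonal matrix, possibly with det = -1 (a reflection)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))

out_then_transform = toy_equivariant_readout(points, weights) @ Q.T
transform_then_out = toy_equivariant_readout(points @ Q.T, weights)
print(np.allclose(out_then_transform, transform_then_out))  # True
```

Real O(3)-equivariant architectures generalize this idea to learned features of higher tensor order, which is where the group theory and tensor-network machinery in the resources below comes in.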
Mathematical Foundation on 3D Rotation Groups (Group Theory, Linear Algebra, Rotation Matrices):
- 3D Rotation Group - the 3D rotation group SO(3) is a subgroup of the orthogonal group O(3)
- Orthogonal Group
- 3D Rotations - Gabriel Taubin
Latest Papers:
- Unifying O(3) Equivariant Neural Networks Design with Tensor-Network Formalism
- An Efficient Sparse Kernel Generator for O(3)-Equivariant Deep Networks
Real-time Adaptation Laws for Neural Networks
Mathematical Foundations:
- A college course in differential equations
- Intermediate ODEs textbook: Arnold’s Ordinary Differential Equations
- Dynamical systems
- Nonlinear dynamics is helpful too. I recommend Prof. Strogatz’s Nonlinear Dynamics and Chaos
Concept Definitions More Pertinent to Understanding Latest Research:
Latest Papers:
- A Tutorial on a Lyapunov-Based Approach to the Analysis of Iterative Optimization Algorithms
- Lyapunov-Based Real-Time and Iterative Adjustment of Deep Neural Networks, IEEE Control Systems Letters
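For intuition, here is a minimal sketch of the kind of real-time adaptation law these papers analyze: the output-layer weights of a linearly parameterized approximator are adjusted continuously from the instantaneous error (a gradient-type law, the standard object of Lyapunov-style analyses). The RBF features, gains, and signals are illustrative choices of mine, not taken from the papers above.

```python
import numpy as np

def rbf_features(x, centers, width=0.5):
    """Fixed radial-basis features; only the output weights adapt online."""
    return np.exp(-((x - centers) ** 2) / (2 * width ** 2))

rng = np.random.default_rng(0)
centers = np.linspace(-2, 2, 20)
w = np.zeros(20)            # adaptable output-layer weights
gamma, dt = 5.0, 0.01       # adaptation gain and integration step

target = lambda x: np.sin(2 * x)   # unknown function generating the measurements

for t in np.arange(0.0, 20.0, dt):
    x = 2 * np.sin(0.3 * t)              # current (time-varying) input signal
    phi = rbf_features(x, centers)
    y_hat = w @ phi                      # model prediction
    e = y_hat - target(x)                # instantaneous estimation error
    w = w - gamma * e * phi * dt         # gradient-type adaptation law: w_dot = -gamma * e * phi

# w now approximates the target function on the inputs visited during the run.
```

The Lyapunov-based papers above study when such continuously updated weights provably keep the error bounded or drive it to zero, rather than just updating them heuristically.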
Model Compression Techniques Through Numerical Linear Algebra
Mathematical Foundation:
- Linear algebra
- Numerical analysis: Numerical Linear Algebra, Trefethen and Bau, Parts I - V
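As a concrete illustration of what compression via numerical linear algebra typically looks like, here is a minimal sketch of replacing a dense weight matrix with a truncated-SVD (low-rank) factorization. The layer size and target rank are arbitrary choices of mine; real weight matrices often have faster-decaying singular values than the random matrix used here, so they compress better.

```python
import numpy as np

def low_rank_factors(W, rank):
    """Approximate W (d_out x d_in) by two thin factors via truncated SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]     # (d_out, rank)
    B = Vt[:rank, :]               # (rank, d_in)
    return A, B                    # W ~= A @ B, storing rank*(d_out + d_in) numbers

rng = np.random.default_rng(0)
W = rng.normal(size=(512, 256))
A, B = low_rank_factors(W, rank=32)

x = rng.normal(size=256)
y_full = W @ x
y_compressed = A @ (B @ x)         # two thin matvecs instead of one large one
rel_err = np.linalg.norm(y_full - y_compressed) / np.linalg.norm(y_full)
print(f"relative error at rank 32: {rel_err:.3f}")
```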
Latest Papers:
