CS & Applied Math Papers


Overview

Latest CS & math papers, with an emphasis on ML/AI, plus some supplemental resources. Curated with a focus on:

  1. Building Foundational Knowledge: Resources to understand the mathematical and conceptual foundations for the latest research topics
  2. Understanding Frontier Research: Latest papers pushing the boundaries of each field
  3. Identifying Limitations & Directions: Critical analysis of current challenges and future research opportunities

Created by Nicole Hao, 2025


Table of Contents

  • AI Foundations, Must Reads
  • Graph Neural Networks (GNNs)
  • AI4Health
  • Agentic AI
  • AI4S
    • Physics - Physics-informed Neural Networks (PINNs)
  • Embodied AI
  • Artificial General Intelligence (AGI)
  • Not Yet Categorized
    • Liquid Neural Networks (LNNs)
    • Neural Tangent Kernels (NTKs)
    • O(3)-Equivariant Deep Networks
    • Real-time Adaptation Laws for Neural Networks
    • Model Compression Techniques Through Numerical Linear Algebra

AI Foundations, Must Reads

  • ResNet (297,000+ citations, 2016): A cornerstone of deep learning; ‘solved’ the difficulty of training very deep networks
  • Adam (200,000+ citations, 2014): The most widely used optimizer; nearly all large models are trained with it
  • AlexNet (187,000+ citations, 2012): The starting point of the deep learning boom; kicked off GPU-based training
  • Attention (Transformer) (173,000+ citations, 2017): The “bible” of large models; the ancestor of all LLMs like ChatGPT
  • LSTM (140,000+ citations, 1997): A pioneer of sequence modeling; dominated NLP for 20 years
  • BERT (120,000+ citations, 2018): Introduced the “pretraining + fine-tuning” paradigm for large models
  • Deep Learning (Review) (106,000+ citations, 2015): A review by the three giants (LeCun, Hinton, Bengio); textbook-level status
  • GAN (105,000+ citations, 2014): Generative adversarial networks; an early landmark for AIGC image generation
  • VGG (99,000+ citations, 2014): Classic CNN architecture; defined the design of deep vision networks
  • Faster R-CNN (99,000+ citations, 2015): A milestone in object detection; extremely widely used in industry
  • LeNet-5 (82,000+ citations, 1998): The original CNN; foundational work for convolutional neural networks
  • Batch Normalization (70,000+ citations, 2015): A normalization technique that makes large-model training ~10× faster
  • U-Net (70,000+ citations, 2015): Architectural foundation for image segmentation and diffusion models
  • t-SNE (63,000+ citations, 2008): Data visualization technique; must-read for dimensionality reduction
  • Dropout (60,000+ citations, 2014): The simplest effective way to prevent neural network overfitting

Graph Neural Networks (GNNs)

GNNs are a class of neural networks designed to process data represented as graphs. Unlike traditional neural networks that work on Euclidean data like images or sequences, GNNs can directly operate on irregular data structures by modeling relationships between nodes and edges. They leverage a message-passing framework where nodes update their feature representations by aggregating information from their neighbors. This process allows them to capture complex relational information and learn a continuous representation, or embedding, of the graph’s structure.
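
For intuition, here is a minimal sketch of one round of mean-aggregation message passing on a toy graph (plain NumPy, not tied to any particular GNN library). The adjacency list, feature sizes, and weight matrices are made up purely for illustration.

```python
import numpy as np

# Toy graph: 4 nodes, undirected adjacency list (made-up example).
neighbors = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}

rng = np.random.default_rng(0)
d_in, d_out = 8, 16
H = rng.normal(size=(4, d_in))            # initial node features, one row per node
W_self = rng.normal(size=(d_in, d_out))   # transform for a node's own features
W_neigh = rng.normal(size=(d_in, d_out))  # transform for aggregated neighbor messages

def message_passing_step(H):
    """One GNN layer: aggregate neighbor features (mean), then update each node."""
    H_new = np.empty((H.shape[0], d_out))
    for v, nbrs in neighbors.items():
        msg = H[nbrs].mean(axis=0)                               # aggregate neighbor messages
        H_new[v] = np.maximum(0, H[v] @ W_self + msg @ W_neigh)  # ReLU update
    return H_new

H1 = message_passing_step(H)   # node embeddings after one round of message passing
print(H1.shape)                # (4, 16)
```

Stacking several such layers lets each node's embedding depend on its multi-hop neighborhood, which is what the frontier papers below build on.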

Foundational Knowledge

  • Graph Neural Network - Wikipedia: Provides a great overview of the basic building blocks of GNNs, including permutation equivariant layers, local and global pooling, and a discussion on the expressive power of GNNs.
  • Graph Neural Networks in TensorFlow: A resource from Google Research that explains how GNNs can be trained on large datasets using a stream of subgraphs. It also touches on both supervised and unsupervised training methods.
  • Graph Neural Networks: A New Frontier: This blog post explains the core components of a GNN, such as node representation, message passing, and aggregation, and provides a useful comparison with traditional neural networks.

Frontier Research


AI4Health

AI4Health (AI for Health) is an interdisciplinary field that applies machine learning and artificial intelligence to various aspects of healthcare. This includes tasks like medical image analysis, personalized treatment planning, drug discovery, and disease prediction. The field aims to improve the accuracy of diagnoses, enhance patient outcomes, and streamline clinical workflows.

Foundational Knowledge

Frontier Research

Limitations & Directions

  • Data Dependency: AI models are often trained on specific datasets from a single hospital or clinic, which can lead to poor performance when applied to data from different institutions. A key challenge is developing models that are more robust and can generalize to diverse populations.
  • Bias and Ethical Concerns: AI models can be biased due to the data they’re trained on, which can exacerbate healthcare disparities. The “black box” nature of many deep learning models makes it difficult to understand their decision-making process, leading to a lack of trust from both patients and clinicians.
  • Clinical Integration: Many AI technologies show promise in research but have not been evaluated in clinical settings. Successful implementation requires seamless integration into existing workflows and proper training for healthcare professionals.
  • Human-in-the-loop Systems: Currently, AI in healthcare should assist, not replace, doctors. Future research focuses on creating explainable AI (XAI) models that are transparent and trustworthy, and on developing systems where the final decision remains with a human clinician.

Target Audience

  • Graduate students in ML/AI
  • Researchers


Agentic AI

Review/History:

WIP


AI4S

AI for Science. I used this GitHub repo for reference. Definitely check it out. AI4S papers vary in specificity depending on the field of science they are tied to, so I will try to categorize them by subject, e.g. AI4Chemistry, AI4Physics, and more.

Here are some overviews on AI4S as a field to get you started:

Physics

Physics-informed Neural Networks (PINNs)

Physics-informed neural networks (PINNs) are a type of neural network that incorporates the laws of physics into their training process. They do this by embedding governing equations, like partial differential equations (PDEs), into the neural network’s loss function. This allows PINNs to find solutions to physical problems while requiring less training data than traditional data-driven models. They are particularly useful for solving forward and inverse problems in computational science.
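
To make this concrete, below is a minimal PyTorch-style sketch of the PINN idea for a toy 1D problem, du/dx = -u with u(0) = 1 (chosen only for illustration): automatic differentiation supplies the derivative, and the equation residual is added to the loss alongside the boundary condition. The network size, collocation points, and optimizer settings are arbitrary, not taken from any paper above.

```python
import torch

# Small fully connected network approximating u(x); sizes are arbitrary.
net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

x = torch.linspace(0, 1, 100).reshape(-1, 1).requires_grad_(True)  # collocation points
x0 = torch.zeros(1, 1)                                             # boundary point x = 0

for step in range(2000):
    u = net(x)
    # du/dx via automatic differentiation
    du_dx = torch.autograd.grad(u, x, grad_outputs=torch.ones_like(u), create_graph=True)[0]
    residual = du_dx + u                      # physics residual for du/dx = -u
    loss_pde = (residual ** 2).mean()         # enforce the governing equation
    loss_bc = ((net(x0) - 1.0) ** 2).mean()   # enforce u(0) = 1
    loss = loss_pde + loss_bc
    opt.zero_grad()
    loss.backward()
    opt.step()

# After training, net(x) should approximate the exact solution exp(-x).
```

The same pattern carries over to PDEs in higher dimensions: the residual term simply contains more partial derivatives, all obtained by automatic differentiation.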

Foundational Knowledge, Start with These

  • Physics-informed Neural Networks - Wikipedia: A great starting point that defines PINNs, explains their function approximation capabilities, and covers different types of problems they can solve, such as data-driven solution and discovery of PDEs.
  • Physics-Informed Neural Networks for Inverse PDE Problems: This article provides an intuitive explanation of PINNs as a “cheat sheet” for a regular neural network, outlining how they use automatic differentiation and a physics-based loss function to achieve better results with less data.
  • Adaptive Physics-informed Neural Networks: A review paper that discusses how advanced ML techniques, like transfer learning and meta-learning, can be integrated into PINNs to improve model adaptivity and address convergence challenges.

Frontier Research

Notes:

  • Training PINNs can be computationally expensive, especially for complex or multi-physics problems.
  • Optimizing PINNs can be difficult, and their convergence properties are not as well understood as traditional methods.
  • PINNs can struggle to solve high-frequency or multiscale problems and to incorporate noisy, sparse real-world data effectively.
  • A key area of future research is improving PINN performance by integrating them with other methods, such as domain decomposition techniques (e.g., XPINNs), and exploring novel optimization strategies. The “PINN 2.0” vision focuses on neuro-symbolic integration, federated physics learning, and quantum-accelerated optimization.

Embodied AI

Embodied AI is the integration of artificial intelligence into physical systems—such as robots, autonomous vehicles, and smart industrial machines—enabling them to perceive, reason, and act in the real world. Unlike traditional AI that operates purely in digital environments, embodied AI bridges the gap between computation and physical interaction through machine learning, computer vision, and sensor fusion.

Introduction/Overview of Embodied AI:

Theory of Mind Inference:

WIP


Artificial General Intelligence (AGI)

WIP


Not Yet Categorized

Liquid Neural Networks (LNNs)

Liquid Neural Networks (LNNs) are a type of artificial intelligence model that uses a dynamic, continuous-time architecture inspired by biological neurons. They are designed to continue learning and adapting in real time, even after deployment, similar to how the human brain learns from experience.
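
For a rough feel of the dynamics, here is a small NumPy sketch of a single neuron in the spirit of the liquid time-constant (LTC) formulation, integrated with an explicit Euler step. The constants, weights, and input signal are invented for illustration; this is not the implementation from any specific paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative constants (made up): base time constant tau, target value A,
# and weights of the input-dependent gating nonlinearity f(x, I).
tau, A = 1.0, 0.5
w_x, w_i, b = 0.8, 1.5, -0.2

def ltc_step(x, I, dt=0.01):
    """One explicit-Euler step of a liquid time-constant neuron:
    dx/dt = -(1/tau + f(x, I)) * x + f(x, I) * A,
    so the effective time constant varies with the input -- the 'liquid' part."""
    f = sigmoid(w_x * x + w_i * I + b)
    dx = -(1.0 / tau + f) * x + f * A
    return x + dt * dx

# Drive the neuron with a toy sinusoidal input and record its state.
x, trace = 0.0, []
for t in np.arange(0.0, 10.0, 0.01):
    x = ltc_step(x, np.sin(t))
    trace.append(x)
print(f"final state: {trace[-1]:.4f}")
```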

Latest Papers:

Neural Tangent Kernels (NTKs)

Start with the following:

If you’re interested in the functional analysis foundations of NTKs:

Now you should be ready for the NTK foundational papers:

O(3)-Equivariant Deep Networks

O(3) is the group of all rotations and reflections in 3D space. It’s a mathematical concept that describes how objects can be moved in 3D without changing their fundamental properties (like distances between points).

An equivariant neural network is one where the output changes in a specific, predictable way when the input is transformed by a group operation (like a rotation or reflection). In simpler terms, if you rotate the input, the output is rotated in a corresponding manner.
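
To make the definition concrete, the short NumPy sketch below draws a random element of O(3) (a rotation or reflection), applies it to a toy point cloud, and checks that a simple vector-valued function (the centroid, used purely as a toy example) is equivariant, i.e. f(Rx) = R f(x).

```python
import numpy as np

rng = np.random.default_rng(0)

# Random element of O(3): orthogonalize a random matrix via QR.
# Q satisfies Q^T Q = I and det(Q) is +1 (rotation) or -1 (reflection).
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
assert np.allclose(Q.T @ Q, np.eye(3))

def f(points):
    """Toy O(3)-equivariant function: the centroid of a point cloud."""
    return points.mean(axis=0)

X = rng.normal(size=(10, 3))   # toy point cloud: 10 points in 3D
lhs = f(X @ Q.T)               # transform the input, then apply f
rhs = f(X) @ Q.T               # apply f, then transform the output
assert np.allclose(lhs, rhs)   # equivariance: f(Rx) = R f(x)
print("centroid is O(3)-equivariant under this transform")
```

Equivariant network layers are built so that this property holds for every layer, and hence for the whole model.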

Why are O(3)-equivariant networks important? Because they can learn generalizable features from 3D data even when the training data contains examples in only a limited number of orientations, which reduces the amount of training data needed. For tasks involving 3D objects, such as predicting molecular properties or simulating physical systems, O(3)-equivariance also ensures that the model's predictions are consistent with the laws of physics and geometry. Equivariant models likewise generalize better to unseen orientations of the input data.

Mathematical Foundation on 3D Rotation Groups (Group Theory, Linear Algebra, Rotation Matrices):

Latest Papers:

Real-time Adaptation Laws for Neural Networks

Mathematical Foundations:

Concept Definitions More Pertinent to Understanding Latest Research:

Latest Papers:

Model Compression Techniques Through Numerical Linear Algebra

Mathematical Foundation:

Latest Papers: