Associate Professor at the Khoury College of Computer Sciences, Northeastern University
Machine Learning Researcher specializing in Kernel Methods, Neural Networks, and their applications in interdisciplinary domains
My research focuses on advancing machine learning theory and applications, with a specific emphasis on kernel methods and deep learning. I'm passionate about discovering a unifying mathematical theory for neural networks and exploring interdisciplinary applications of machine learning.
Prof. Wu is one of the original researchers on interpretable kernel methods. Focusing on the Hilbert-Schmidt Independence Criterion (HSIC), Prof. Wu developed the Iterative Spectral Method (ISM), an optimization technique for HSIC-based objectives.
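For readers unfamiliar with HSIC, the sketch below shows the standard biased empirical estimator with Gaussian kernels. It is background on the statistic itself, not an implementation of ISM, and the bandwidth values are placeholder choices.

```python
import numpy as np

def gaussian_kernel(X, sigma=1.0):
    """Pairwise Gaussian (RBF) kernel matrix over the rows of X."""
    sq_dists = np.sum(X**2, axis=1, keepdims=True) + np.sum(X**2, axis=1) - 2 * X @ X.T
    return np.exp(-sq_dists / (2 * sigma**2))

def hsic(X, Y, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC estimate of dependence between samples X and Y."""
    n = X.shape[0]
    K = gaussian_kernel(X, sigma_x)
    L = gaussian_kernel(Y, sigma_y)
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

# Dependent samples should score higher than independent ones.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
Y_dep = X + 0.1 * rng.normal(size=(200, 2))
Y_ind = rng.normal(size=(200, 2))
print(hsic(X, Y_dep), hsic(X, Y_ind))
```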
In 2022, Prof. Wu showed that neural networks can admit a closed-form solution when the activation function is changed from ReLU to cosine. Prof. Wu is currently working toward a more general closed-form solution.
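As a rough illustration of why cosine activations pair naturally with closed-form solutions, the sketch below treats a cosine hidden layer with fixed random weights as a feature map and solves for the output layer by ridge regression. This is a generic random-Fourier-feature-style construction, not necessarily the formulation in Prof. Wu's work, and the sizes and regularizer are arbitrary.

```python
import numpy as np

def cosine_features(X, W, b):
    """One hidden layer with cosine activation: phi(x) = cos(Wx + b)."""
    return np.cos(X @ W + b)

rng = np.random.default_rng(0)
n, d, m, lam = 500, 5, 256, 1e-2
X = rng.normal(size=(n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)

W = rng.normal(size=(d, m))            # fixed (random) hidden-layer weights
b = rng.uniform(0, 2 * np.pi, size=m)  # random phases
Phi = cosine_features(X, W, b)

# Closed-form ridge solution for the output weights: no gradient descent needed.
alpha = np.linalg.solve(Phi.T @ Phi + lam * np.eye(m), Phi.T @ y)
pred = Phi @ alpha
print("train MSE:", np.mean((pred - y) ** 2))
```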
Applying machine learning to identify personalized risk factors and predict gestational age, supported by a grant from the Opportunity and Infrastructure Fund.
Developing machine learning approaches for analyzing environmental data, including single-cell Raman spectroscopy for bacterial taxonomy identification.
Researching instance-wise feature grouping methodologies to enhance interpretability and performance of complex machine learning models.
Prof. Wu is currently collaborating with the Biology team to better understand the structure of the Lyme disease bacterium.
As a machine learning scholar, I've published papers in top conferences and journals, with a focus on kernel methods, deep learning theory, and interdisciplinary applications.
Abstract: There is currently a debate within the neuroscience community over the likelihood of the brain performing backpropagation (BP). To better mimic the brain, training a network one layer at a time with only a "single forward pass" has been proposed as an alternative to bypass BP; we refer to these networks as "layer-wise" networks. We continue the work on layer-wise networks by answering two outstanding questions. First, do they have a closed-form solution? Second, how do we know when to stop adding more layers? This work proves that the Kernel Mean Embedding is the closed-form weight that achieves the network global optimum while driving these networks to converge towards a highly desirable kernel for classification; we call it the Neural Indicator Kernel.
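As background for the kernel mean embedding mentioned in this abstract, here is a minimal sketch of a classifier that scores a test point by its inner product with each class's empirical mean embedding under a Gaussian kernel. It is illustrative only and does not reproduce the paper's layer-wise construction or the Neural Indicator Kernel.

```python
import numpy as np

def rbf(X, Z, sigma=1.0):
    """Gaussian kernel evaluations between rows of X and rows of Z."""
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Z**2, 1)[None, :] - 2 * X @ Z.T
    return np.exp(-d2 / (2 * sigma**2))

def kme_predict(X_train, y_train, X_test, sigma=1.0):
    """Assign each test point to the class whose empirical kernel mean embedding
    has the largest inner product with phi(x), i.e. the largest mean kernel value."""
    classes = np.unique(y_train)
    scores = np.stack(
        [rbf(X_test, X_train[y_train == c], sigma).mean(axis=1) for c in classes],
        axis=1,
    )
    return classes[np.argmax(scores, axis=1)]

# Toy usage with two Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(+2, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
print("train accuracy:", (kme_predict(X, y, X) == y).mean())
```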
Abstract: In many learning problems, the domain scientist is often interested in discovering the groups of features that are redundant and are important for classification. Moreover, the features that belong to each group, and which groups are important, may vary per sample. But what do we mean by feature redundancy? In this paper, we formally define two types of redundancies using information theory: Representation and Relevant redundancies. We leverage these redundancies to design a formulation for instance-wise feature group discovery and reveal a theoretical guideline to help discover the appropriate number of groups. We approximate mutual information via a variational lower bound and learn the feature group and selector indicators with Gumbel-Softmax while optimizing our formulation. Experiments on synthetic data validate our theoretical claims. Experiments on MNIST, Fashion MNIST, and gene expression datasets show that our method discovers feature groups with high classification accuracies.
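For readers unfamiliar with the Gumbel-Softmax trick referenced in this abstract, the sketch below shows the standard relaxation for sampling approximately one-hot indicators. It omits the paper's variational mutual-information bound and group-discovery objective, and the logits and temperature here are placeholder values.

```python
import numpy as np

def gumbel_softmax(logits, tau=0.5, rng=None):
    """Draw a differentiable, approximately one-hot sample from a categorical
    distribution parameterized by `logits` (the Gumbel-Softmax / Concrete trick)."""
    rng = rng if rng is not None else np.random.default_rng()
    u = rng.uniform(1e-10, 1.0, size=logits.shape)
    g = -np.log(-np.log(u))                      # standard Gumbel noise
    z = (logits + g) / tau                       # lower tau -> closer to one-hot
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Example: sample a soft "group indicator" over 4 candidate feature groups.
logits = np.array([2.0, 0.5, 0.1, -1.0])
print(gumbel_softmax(logits, tau=0.3, rng=np.random.default_rng(0)))
```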
Abstract: This paper presents ensemble learning methods to identify key factors contributing to preterm births, with potential applications in preventive healthcare. Our work leverages a rich dataset collected by an NIEHS P42 Center that is working to identify the dominant factors responsible for the high rate of premature births in northern Puerto Rico. We investigate analytical models addressing major challenges in this domain, using undersampling techniques combined with ensemble learning to improve identification of relevant risk factors.
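The sketch below illustrates one common undersampling-plus-ensemble pattern (balanced subsamples of the majority class, majority-vote prediction). It is not the paper's exact pipeline; the choice of logistic regression, the label convention (1 = preterm), and the hypothetical undersample_ensemble helper are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def undersample_ensemble(X, y, n_models=10, seed=0):
    """Train several classifiers, each on a class-balanced subsample obtained by
    randomly undersampling the majority class; combine them by majority vote."""
    rng = np.random.default_rng(seed)
    minority = np.flatnonzero(y == 1)   # assumed: 1 = preterm (rare class)
    majority = np.flatnonzero(y == 0)
    models = []
    for _ in range(n_models):
        picked = rng.choice(majority, size=len(minority), replace=False)
        idx = np.concatenate([minority, picked])
        models.append(LogisticRegression(max_iter=1000).fit(X[idx], y[idx]))
    return models

def vote(models, X):
    """Majority vote across the ensemble."""
    return (np.mean([m.predict(X) for m in models], axis=0) >= 0.5).astype(int)

# Usage on synthetic imbalanced data.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))
y = (rng.uniform(size=1000) < 0.1).astype(int)
print(vote(undersample_ensemble(X, y), X[:5]))
```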
Abstract: This research focuses on interpretable kernel dimensionality reduction methods, exploring how to make complex dimensionality reduction techniques more understandable and explainable. The paper presents novel approaches to kernel-based dimensionality reduction that preserve interpretability, which is crucial for applications in domains where understanding the transformation process is as important as the reduction itself.
Abstract: Single-cell Raman Spectroscopy (SCRS) emerges as a promising tool for single-cell phenotyping in environmental ecological studies, offering non-intrusive, high-resolution, and high-throughput capabilities. In this study, we obtained a large and comprehensive SCRS dataset that captured phenotypic variations with cell growth status for various bacterial species. We apply machine learning approaches to analyze this data for bacterial taxonomy identification, demonstrating high accuracy in classifying bacterial species based on their Raman spectroscopy signatures.
Abstract: Rapid progress in various advanced analytical methods such as single-cell technologies enables unprecedented and deeper understanding of microbial ecology beyond the resolution of conventional approaches. A major practical challenge is determining an adequate sample size when little prior knowledge of the community's complexity is available. This case study focuses on using single-cell Raman spectroscopy data in Enhanced Biological Phosphorus Removal (EBPR) systems, developing machine learning approaches to determine optimal sampling strategies for complex environmental systems.
I create educational videos to help students better understand complex machine learning concepts. These supplementary materials reinforce classroom learning and provide additional support for challenging topics.
I joined the Khoury College faculty in 2023 out of a love for teaching. My approach emphasizes deep understanding of machine learning concepts, balancing mathematical foundations with practical applications.
A comprehensive course covering fundamental machine learning algorithms, their mathematical principles, and practical implementations in Python. Students learn to develop a deep understanding of how these algorithms work "under the hood."
This graduate-level course explores cutting-edge research in machine learning, with a focus on kernel methods, deep learning theory, and emerging applications in interdisciplinary domains.
A rigorous exploration of the mathematical concepts underpinning modern machine learning algorithms, designed to provide students with a solid theoretical foundation.
I'm always open to research collaborations, speaking engagements, or questions from prospective students.
Email: ch.wu@northeastern.edu
Office: Khoury College of Computer Sciences, Northeastern University, Boston
When I'm not researching or teaching, I enjoy spending my time on a variety of activities.
Enjoying team sports and staying active on the court.
A recent passion that combines strategy and physical activity.
Finding balance and creativity through music.