My primary research interest lies in understanding the underlying mechanisms and implicit inductive biases of machine learning models and algorithms. I aspire to contribute to the development of robust and interpretable machine learning systems, designed to excel in challenging conditions.
National Yang Ming Chiao Tung University, Doctor of Medicine
Aug. 15 - Jun. 22
News
[2023/11] One paper on LLMs for biology submitted to the LLMs4Bio workshop at AAAI'24.
[2023/08] Began PhD journey at Cornell.
[2023/05] One paper on Federated Learning submitted to arXiv.
[2022/06] Passed the Taiwan Medical Licensing Examination.
[2021/10] One paper accepted at ICLR'22 as a poster.
[2021/10] One paper accepted at the NewInML Workshop, NeurIPS'21 as an oral presentation.
[2021/06] One paper accepted at MICCAI'21 as an oral presentation.
Publications
FedBug: A Bottom-Up Gradual Unfreezing Framework for Federated Learning
Chia-Hsiang Kao, Yu-Chiang Frank Wang
| abstract | arxiv | github |
Submitted to arXiv.
Federated Learning (FL) offers a collaborative training framework, allowing multiple clients to contribute to a shared model without compromising data privacy. Due to the heterogeneous nature of local datasets, updated client models may overfit and diverge from one another, commonly known as the problem of client drift. In this paper, we propose FedBug (Federated Learning with Bottom-Up Gradual Unfreezing), a novel FL framework designed to effectively mitigate client drift. FedBug adaptively leverages the client model parameters, distributed by the server at each global round, as the reference points for cross-client alignment. Specifically, on the client side, FedBug begins by freezing the entire model, then gradually unfreezes the layers, from the input layer to the output layer. This bottom-up approach allows models to train the newly thawed layers to project data into a latent space, wherein the separating hyperplanes remain consistent across all clients. We theoretically analyze FedBug in a novel over-parameterization FL setup, revealing its superior convergence rate compared to FedAvg. Through comprehensive experiments, spanning various datasets, training conditions, and network architectures, we validate the efficacy of FedBug. Our contributions encompass a novel FL framework, theoretical analysis, and empirical validation, demonstrating the wide potential and applicability of FedBug.
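The bottom-up unfreezing described above can be illustrated with a small scheduling sketch. It is not the FedBug implementation: the function name `unfreeze_schedule`, the `gu_fraction` parameter, and the choice to thaw layers at evenly spaced steps are all assumptions made for illustration. It only shows which layers would be trainable at each local step if a client thaws layers from the input side to the output side.

```python
def unfreeze_schedule(num_layers, num_steps, gu_fraction=0.5):
    """Return, for each local step, the indices of the trainable layers.

    Hypothetical schedule (an assumption, not taken from the paper):
    during the first `gu_fraction` of the local steps, layers are thawed
    one group at a time from the input layer (index 0) towards the output
    layer; afterwards every layer is trainable.
    """
    if num_layers <= 0 or num_steps <= 0:
        raise ValueError("num_layers and num_steps must be positive")
    if not 0 < gu_fraction <= 1:
        raise ValueError("gu_fraction must be in (0, 1]")

    gu_steps = max(1, int(num_steps * gu_fraction))
    schedule = []
    for step in range(num_steps):
        if step >= gu_steps:
            thawed = num_layers  # gradual phase is over: train everything
        else:
            # The input layer is thawed first; more layers follow over time.
            thawed = min(num_layers, 1 + (step * num_layers) // gu_steps)
        schedule.append(list(range(thawed)))
    return schedule


if __name__ == "__main__":
    for step, layers in enumerate(unfreeze_schedule(4, 8, 0.5)):
        print(step, layers)
```

In a real training loop, the layers missing from each step's list would have their gradients disabled; how that is done depends on the framework in use.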
MAML Is a Noisy Contrastive Learner in Classification
Chia-Hsiang Kao, Wei-Chen Chiu, Pin-Yu Chen
| abstract | arxiv | poster | github | paper explained |
Accepted by ICLR'22 as a poster.
Accepted by the NewInML Workshop, NeurIPS'21 as an oral presentation.
Model-agnostic meta-learning (MAML) is one of the most popular and widely adopted meta-learning algorithms and has achieved remarkable success in various learning problems. Yet, because of its nested design, in which inner-loop updates govern task-specific learning and outer-loop updates govern meta-model-centric learning, the underlying learning objective of MAML remains implicit, which impedes a more straightforward understanding of it. In this paper, we provide a new perspective on the working mechanism of MAML and discover that MAML is analogous to a meta-learner using a supervised contrastive objective function, where the query features are pulled towards the support features of the same class and pushed away from those of different classes. This contrastiveness is experimentally verified via an analysis based on cosine similarity. Moreover, our analysis reveals that the vanilla MAML algorithm has an undesirable interference term originating from the random initialization and the cross-task interaction. We therefore propose a simple but effective technique, the zeroing trick, to alleviate such interference. Extensive experiments on both the miniImagenet and Omniglot datasets demonstrate the consistent improvement brought by our proposed technique, validating its effectiveness.
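As a rough illustration of the cosine-similarity analysis mentioned above, the sketch below compares query features with support features of the same class and of different classes. The function names and the toy feature vectors are hypothetical; this is not the paper's evaluation code, only one way such a measurement could be set up.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def contrastiveness(query_feats, query_labels, support_feats, support_labels):
    """Mean query-support cosine similarity for same-class and
    different-class pairs. A contrastive learner is expected to give a
    higher same-class value than different-class value."""
    same, diff = [], []
    for q, q_label in zip(query_feats, query_labels):
        for s, s_label in zip(support_feats, support_labels):
            (same if q_label == s_label else diff).append(cosine(q, s))

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(same), mean(diff)


if __name__ == "__main__":
    # Toy features (made up for illustration only).
    support = [[1.0, 0.0], [0.0, 1.0]]
    support_y = [0, 1]
    query = [[2.0, 0.0]]
    query_y = [0]
    print(contrastiveness(query, query_y, support, support_y))
```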
Demystifying T1-MRI to FDG18-PET Image Translation via Representational Similarity
Chia-Hsiang Kao, Yong-Sheng Chen, Li-Fen Chen, Wei-Chen Chiu
| abstract | paper |
Accepted by MICCAI'21 as an oral presentation.
Earned the Student Travel Award at MICCAI'21.
Recent developments in image-to-image translation techniques have enabled the generation of rare medical images (e.g., PET) from common ones (e.g., MRI). Despite the potential benefits of reduced scanning time, acquisition cost, and radiation exposure risk, the translation models themselves are inscrutable black boxes. In this work, we propose two approaches to demystify the image translation process, focusing on T1-MRI to PET translation. First, we adopt representational similarity analysis and discover that the process of T1-MRI to PET image translation includes the stages of brain tissue segmentation and brain region recognition, which unravels the relationship between structural and functional neuroimaging data. Second, based on our findings, an Explainable and Simplified Image Translation (ESIT) model is proposed to demonstrate the capability of deep learning models for extracting gray matter volume information and identifying brain regions related to normal aging and Alzheimer's disease, which untangles the biological plausibility hidden in deep learning models.
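For readers unfamiliar with representational similarity analysis, the following is a generic sketch, not the analysis pipeline of the paper. It assumes each representation is given as one feature vector per stimulus, builds a dissimilarity matrix as 1 minus the Pearson correlation between stimuli, and compares two representations by correlating the upper triangles of their matrices. Rank correlation is often used for that last step; Pearson is used here only to keep the example short. All function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def rdm_upper(features):
    """Upper triangle of the representational dissimilarity matrix,
    using 1 - Pearson correlation between each pair of stimuli."""
    n = len(features)
    return [
        1.0 - pearson(features[i], features[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]


def rsa_score(features_a, features_b):
    """Similarity of two representations of the same stimuli."""
    return pearson(rdm_upper(features_a), rdm_upper(features_b))


if __name__ == "__main__":
    # Toy activations: 4 stimuli x 3 units (made up for illustration).
    layer_a = [[1, 2, 3], [3, 1, 2], [1, 3, 5], [4, 4, 1]]
    layer_b = [[2 * v + 1 for v in row] for row in layer_a]
    print(rsa_score(layer_a, layer_b))
```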
Unravelling the Spatio-Temporal Neurodynamics of Rhythm Encoding-Reproduction Networks by a Novel fMRI Autoencoder
Chia-Hsiang Kao, Ching-Ju Yang, Li-Kai Cheng, Hsin-Yen Yu, Yong-Sheng Chen, Jen-Chuen Hsieh, and Li-Fen Chen
| abstract | paper |
Accepted by NER'19 as a poster.
Visualizing how external stimuli are processed dynamically in the brain would help in understanding the neural mechanisms of functional segregation and integration. The present study proposed a novel temporal autoencoder to estimate the neurodynamics of functional networks involved in rhythm encoding and reproduction. A fully-connected two-layer autoencoder was proposed to estimate the temporal dynamics in functional magnetic resonance imaging recordings. By minimizing the error between the predicted next time sample and the corresponding ground-truth next time sample, the system was trained to extract spatial patterns of functional network dynamics without any supervision. The results showed that the proposed model was able to extract the spatial patterns of task-related functional dynamics as well as the interactions between them. Our findings suggest that artificial neural networks can provide a useful tool for resolving the temporal dynamics of neural processing in the human brain.
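The next-time-sample objective described above can be sketched as follows. This is only an illustration under stated assumptions: the tanh hidden activation, the mean-squared-error form of the loss, and every name in the code are choices made here, and no training loop is shown.

```python
import math


def forward(x, w1, b1, w2, b2):
    """Two fully-connected layers: hidden = tanh(W1 x + b1), out = W2 hidden + b2.
    The tanh activation is an assumption made for this sketch."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w1, b1)
    ]
    return [sum(w * h for w, h in zip(row, hidden)) + b for row, b in zip(w2, b2)]


def next_step_loss(series, w1, b1, w2, b2):
    """Mean squared error between the prediction made from sample t
    and the actual sample at t + 1, averaged over time and features."""
    if len(series) < 2:
        raise ValueError("need at least two time samples")
    total, count = 0.0, 0
    for current, target in zip(series, series[1:]):
        prediction = forward(current, w1, b1, w2, b2)
        total += sum((p - t) ** 2 for p, t in zip(prediction, target))
        count += len(target)
    return total / count


if __name__ == "__main__":
    # Toy recording: 3 time samples x 2 features (made up for illustration).
    series = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    # One hidden unit, all-zero weights: the model always predicts zeros.
    w1, b1 = [[0.0, 0.0]], [0.0]
    w2, b2 = [[0.0], [0.0]], [0.0, 0.0]
    print(next_step_loss(series, w1, b1, w2, b2))
```

Training would adjust the weights to reduce this loss; under the description above, the learned hidden-to-output weights are what carry the spatial patterns.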
Awards and Scholarships
[2021/06] Student Travel Award, MICCAI'21. (Awarded to the best, e.g. highest-scoring, first-author students.)
[2020/08] Undergraduate Research Fellowship, National Science and Technology Council, Taiwan.
[2018/08] Undergraduate Research Fellowship, National Science and Technology Council, Taiwan.
[2018/06] Summer Research Fellowship, National Health Research Institutes and the Foundation of Health Sciences, Taiwan.
Services
[2022/08] Reviewer, Computer Vision and Image Understanding.
[2022/04] Reviewer, AutoML'22 Conference.
[2021/09] Junior Reviewer, Workshop on Meta-Learning, NeurIPS'21.
Writings
[2022/03] Paper Explained: MAML Is a Noisy Contrastive Learner in Classification. (EN)
[2021/10] When a Man in the White Coat Codes. (II) (EN)
[2021/09] On Two Perspectives of Contrastive Divergence Algorithm. (EN)