Research Interests
My research focuses on developing machine learning methods for genomic sequence analysis and biological discovery. Specific areas include:
- Machine Learning in Genomics: interpretable variant detection, genotype-phenotype associations, feature selection, and statistical inference in high-dimensional genomic data
- Language Models for Biology: genomic language models for learning evolutionary and functional information from DNA/RNA sequences
- Multimodal Learning: integrating genomic data with phenotypic, clinical, environmental, and other omics information
- Taxonomic Classification: machine learning methods for microbial identification with probabilistic predictions and uncertainty quantification
- Computational Biology Tools: open-source software for genomic analysis