Research Interests
I am interested in bioinformatics, molecular evolution, and molecular population genetics. Owing to the many genome projects, a vast and rapidly growing amount of molecular data is becoming available. These data are filled with evolutionary footprints. My interest revolves around mining such information from sequence data, reconstructing the evolutionary process of sequences, genes, and genomes, and applying the knowledge gained from these analyses to protein function prediction and gene mining. Current projects include: 1) development of alignment-free classification methods that can be applied efficiently and accurately to G-protein-coupled receptors (GPCRs) and other extremely divergent protein superfamilies, 2) mining these proteins from diverse genomes, 3) molecular evolutionary analysis of GPCRs and other extremely divergent protein superfamilies to understand the evolutionary mechanisms underlying their functions, 4) development of sequence simulation methods that incorporate dynamic protein evolution, including insertions/deletions, domain structure, and domain shuffling, and 5) improvement of phylogenetic and multiple alignment methods for divergent protein evolution. I am also interested in synonymous codon usage. Although synonymous codon substitutions do not change amino acids, they are not selectively neutral. This makes synonymous codon usage a unique and informative quantity for studying molecular evolution at different levels. One of my goals is to incorporate such information into bioinformatics tools and achieve a thorough, multi-dimensional understanding of genomic data.