I am interested in developing and applying statistical methods that address important public health and scientific issues. So far, my work has focused on methods and applications studying the health effects of air pollution in vulnerable populations such as children and older adults. I predominantly work with data from large observational cohort studies.
Statistical interests:
Longitudinal models, multilevel models, effect modification, integrated models, latent variables, scaling regression coefficients, reproducible research, modern prediction techniques (machine learning)
Substantive interests:
Environmental epidemiology, biomarkers, aging, exhaled nitric oxide, asthma, traffic-related pollution, cancer, mHealth, wearable sensors
My PhD thesis (advisor: Tom Louis) was composed of three parts. In the first project, "Identifying effect modifiers in air pollution time-series studies using a two-stage analysis", we developed and implemented a method to systematically identify potential effect modifiers of the relationship between daily mortality counts and daily levels of ambient particulate matter in a subset of cities in the National Morbidity, Mortality, and Air Pollution Study. In the second project, "Surrogate models for the low physical activity criterion of frailty in older adults", we developed and implemented methods to: (a) streamline assessment and (b) fill in missing values for physical activity, one of five criteria used in a common clinical phenotype of gerontologic frailty, in the Cardiovascular Health Study. In the third project, "Modification by frailty status of ambient air pollution effects on lung function in older adults in the Cardiovascular Health Study", we related longitudinal assessments of lung function to either: (a) recent summaries of pollution and current gerontologic frailty status or (b) cumulative summaries of pollution and frailty status history.
As statistical analyses become increasingly complex, we have a growing need to accurately reproduce results. While at JHSPH, I worked with Dr. Roger Peng to create an R package called 'stashR' (A Set of Tools for Administering SHared Repositories), which is part of a toolkit for conducting and distributing reproducible research, particularly in the context of the Sweave system for interweaving code (in R) and text (in LaTeX).