Data Quality Management of Black Box Monitored Driving Behaviors in a Sleep Apnea Study

Saturday, October 29, 2011
Hall 1-2 (San Jose Convention Center)
Amy Ruiz Goyco, Department of Computer Science, University of Puerto Rico, San Juan, PR
Jeffrey D. Dawson, Sc.D., Department of Biostatistics, University of Iowa, Iowa City, IA
Kayse Lee, Department of Mathematics, Bethel University, St. Paul, MN
Nicole Lovell, Department of Mathematics, Hollins University, Roanoke, VA
In a study of drivers with obstructive sleep apnea, researchers in the Neuroergonomics Division at the University of Iowa are outfitting participants' personal vehicles with "black box" devices that measure and record high-frequency data over 3-month periods of driving (NIH R01 Award HL091917). Examples of such naturalistic data include latitude, longitude, road speed, and 3-dimensional acceleration. Because the driving variables are collected at approximately 10 rows per second, the full dataset for a single subject often contains millions of observations that must be reduced and summarized in meaningful ways and merged with other sources of data. The goals of this project were (a) to describe the driving habits of the first several participants in the study, (b) to find and resolve the data quality control issues that have arisen so far, (c) to ascertain how driving habits are affected by weather conditions and other environmental factors, and (d) to set up programming templates that can be reused in future analyses as other drivers are enrolled. To meet these goals, we used the R statistical software to read in the driving data, create plots, merge environmental variables into the data set, perform exploratory data analysis, reduce the data to the day level, and test specific hypotheses.
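The day-level reduction and environmental merge described above can be illustrated with a minimal sketch. It is written in Python rather than R purely for illustration and is not the study's actual template. The sample rows, the speed units, the weather labels, and the function names `summarize_by_day` and `merge_weather` are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical high-frequency rows: (ISO timestamp, road speed).
# Units and values are invented for illustration only.
samples = [
    ("2011-06-01T08:00:00.000", 42.0),
    ("2011-06-01T08:00:00.100", 44.0),
    ("2011-06-02T17:30:00.000", 60.0),
    ("2011-06-02T17:30:00.100", None),  # missing reading, skipped below
    ("2011-06-02T17:30:00.200", 64.0),
]

# Hypothetical environmental data, keyed by calendar day.
weather = {"2011-06-01": "rain", "2011-06-02": "clear"}


def summarize_by_day(rows):
    """Collapse per-sample rows into one summary record per calendar day."""
    by_day = defaultdict(list)
    for timestamp, speed in rows:
        if speed is None:  # simple quality-control step: drop missing values
            continue
        day = datetime.fromisoformat(timestamp).date().isoformat()
        by_day[day].append(speed)
    return {
        day: {"n": len(v), "mean_speed": mean(v), "max_speed": max(v)}
        for day, v in by_day.items()
    }


def merge_weather(daily, weather_by_day):
    """Attach the day's weather label (None if unavailable) to each summary."""
    return {
        day: {**stats, "weather": weather_by_day.get(day)}
        for day, stats in daily.items()
    }


daily = merge_weather(summarize_by_day(samples), weather)
```

Here `daily` holds one record per day, so millions of raw rows per subject would shrink to roughly one row per driving day before any hypothesis testing.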