SAT-507 Using Machine Learning to Predict Level of Alcohol Consumption from Facebook

Saturday, October 13, 2012: 12:20 PM
Hall 4E/F (WSCC)
Alexandra Larsen, Biostatistics/Medical Informatics, University of Wisconsin, Madison, Madison, WI
Eric Lantz, Computer Science, University of Wisconsin, Madison, Madison, WI
David Page, PhD, Biostatistics/Medical Informatics and Computer Science, University of Wisconsin, Madison, Madison, WI
Megan Moreno, MD, MSEd, MPH, Pediatric and Adolescent Medicine at UW Health, University of Wisconsin, Madison, Madison, WI
Social Networking Sites (SNSs) serve not only as easy modes of communication between large groups of people, but also as massive databases of information about each user.  Machine learning methods developed to mine public data on SNSs can be utilized in research on human behavior and health.  This project in particular aims to use a machine learning algorithm and SNS data to predict levels of alcohol use and abuse.  Preliminary results have confirmed a direct relationship between the frequency of alcohol references on a given individual’s Facebook profile and that individual’s score on the Alcohol Use Disorders Identification Test (AUDIT), indicating a connection between references to alcohol on Facebook and problem drinking.  While this finding is helpful, the current method of combing through Facebook data by hand is highly time-consuming and requires training.  This project aims to automate the Facebook data mining process via a machine learning algorithm and to predict an individual’s AUDIT score from the mined data.

The project design is as follows: download the Facebook information of subjects participating in the study; store the text from each page while minimizing the ability to re-identify any of the subjects or other Facebook users; finally, fit a linear regression relating the stored text to the collected AUDIT scores to obtain the predictive model.  The results will be evaluated with statistical metrics and via cross-validation.
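
The regression and cross-validation steps can be illustrated with a minimal Python sketch. It is not the study's implementation: the keyword list, the single count-based feature, the helper names (count_alcohol_references, fit_line, leave_one_out_mae), and the four profile texts with their scores are all hypothetical and chosen only to show the shape of the approach.

```python
import re
from statistics import mean

# Hypothetical keyword list; the abstract does not say which terms
# count as alcohol references.
ALCOHOL_TERMS = {"beer", "wine", "vodka", "drunk", "hangover", "shots"}


def count_alcohol_references(text):
    """Count tokens in the text that appear in the assumed keyword list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(1 for token in tokens if token in ALCOHOL_TERMS)


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        # No variation in x: fall back to predicting the mean score.
        return y_bar, 0.0
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def leave_one_out_mae(xs, ys):
    """Mean absolute error when each subject is held out in turn."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        errors.append(abs(intercept + slope * xs[i] - ys[i]))
    return mean(errors)


# Synthetic, made-up profile texts and AUDIT scores (not study data).
profiles = [
    "Studying all weekend for finals",
    "Beer with friends after the game",
    "Wine night, then more wine, then a hangover",
    "Shots shots shots, so drunk last night, beer for breakfast",
]
audit_scores = [1, 4, 8, 15]

counts = [count_alcohol_references(p) for p in profiles]
intercept, slope = fit_line(counts, audit_scores)
loo_error = leave_one_out_mae(counts, audit_scores)
print(f"AUDIT ~ {intercept:.2f} + {slope:.2f} * references")
print(f"leave-one-out mean absolute error: {loo_error:.2f}")
```

A real pipeline would likely use many text features rather than one count, and would operate on de-identified text as described above; the single feature here simply mirrors the reference-frequency relationship noted in the preliminary results.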

Potentially, this machine learning technique could be used to screen Facebook users for health-related issues beyond alcoholism, or for other behavioral trends more generally.