Saturday, October 13, 2012: 12:20 PM
Hall 4E/F (WSCC)
Social networking sites (SNSs) serve not only as an easy mode of communication among large groups of people, but also as massive databases of information about each user. Machine learning methods developed to mine public SNS data can be applied to research on human behavior and health. This project aims to use a machine learning algorithm and SNS data to predict levels of alcohol use and abuse. Preliminary results have confirmed a direct relationship between the frequency of alcohol references on an individual’s Facebook profile and that individual’s score on the Alcohol Use Disorders Identification Test (AUDIT), indicating a connection between references to alcohol on Facebook and problem drinking. Although this finding is helpful, the current method of combing through Facebook data by hand is highly time-consuming and requires training. The project therefore aims to automate the Facebook data mining process with a machine learning algorithm that predicts an individual’s AUDIT score.
The project design is as follows: download the Facebook information of subjects participating in the study; store the text from each page while minimizing the ability to re-identify any of the subjects or other Facebook users; and finally, derive the predictive model by fitting a linear regression of the collected AUDIT scores on the stored text. The results will be evaluated with statistical metrics and cross-validation.
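As a rough illustration of the regression and cross-validation steps, the sketch below reduces each profile's text to a single feature, the count of alcohol-related words, fits an ordinary least-squares line to the AUDIT scores, and estimates error by leave-one-out cross-validation. It is only a minimal sketch under stated assumptions: the keyword list, the function names, the choice of a single count feature, and the four profiles with their scores are all hypothetical and invented for illustration. None of it is study data or the project's actual feature set.

```python
import re
from statistics import mean

# Hypothetical keyword list (assumption); the study's coding scheme is not given here.
ALCOHOL_TERMS = {"beer", "wine", "vodka", "drunk", "hangover", "shots"}


def count_alcohol_refs(text):
    """Count the tokens in `text` that appear in ALCOHOL_TERMS."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(1 for token in tokens if token in ALCOHOL_TERMS)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0  # fall back to a flat line if x never varies
    return slope, my - slope * mx


def leave_one_out_mae(xs, ys):
    """Mean absolute error when each point is predicted by a model fit on the rest."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        errors.append(abs(slope * xs[i] + intercept - ys[i]))
    return mean(errors)


# Made-up toy data for illustration only; these are NOT study data.
profiles = [
    "Studying for finals all week",
    "Great beer at the game",
    "Wine night, then a hangover",
    "Shots, beer, more shots, so drunk",
]
audit_scores = [2, 4, 7, 13]

counts = [count_alcohol_refs(p) for p in profiles]
slope, intercept = fit_line(counts, audit_scores)
loo_mae = leave_one_out_mae(counts, audit_scores)
print(f"slope={slope:.2f} intercept={intercept:.2f} LOO MAE={loo_mae:.2f}")
```

Leave-one-out is used here only because the toy sample is tiny; with a realistic number of subjects, k-fold cross-validation and a richer text representation would be the more natural choices.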
Potentially, this machine learning technique could be used to screen Facebook users for health-related issues beyond alcoholism, or for other behavioral trends in general.