Friday, October 12, 2012: 8:00 PM
6C/6E (WSCC)
Advances in devices such as magnetic resonance imaging machines have increased the amount of data available, enabling new scientific discoveries. However, more data requires more compute power to process it in a timely manner. This creates an entry barrier for scientists and professionals who have modest budgets for, or limited knowledge of, deploying and maintaining the necessary computing resources. Cloud computing, in which computing resources are consumed as utilities, seeks to address this problem by eliminating the entry costs of computationally intensive science and making resources available to a large user base. The challenge for cloud computing service providers is to maximize the value their services provide, which means optimizing their return on investment. They must also address the time-sensitive nature of workloads submitted by users who have deadlines. In this work, we look at this issue from a provider's perspective by developing a deadline-driven job scheduling methodology that addresses the needs of users while optimizing the provider's return on investment. As part of this effort, we analyze the performance of computationally intensive applications on a compute cluster with virtualized nodes and integrate a performance prediction methodology to determine job schedules that satisfy deadlines while optimizing system utilization. We compare our scheduling algorithm to others by processing a workload of 66 medical image processing jobs and observe that it provides better resource utilization while not missing any deadlines.
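To make the idea of deadline-driven scheduling with predicted runtimes concrete, the following is a minimal sketch and not the algorithm evaluated in this work. It assumes a plain earliest-deadline-first ordering, identical nodes, and a runtime estimate already supplied for each job. The names `Job`, `schedule_edf`, and `utilization` are hypothetical and chosen only for illustration.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    predicted_runtime: float  # hours; assumed output of some performance predictor
    deadline: float           # hours from submission time


def schedule_edf(jobs, num_nodes):
    """Place jobs on identical nodes in earliest-deadline-first order.

    Returns (assignments, missed): assignments is a list of
    (job name, node id, start, finish); missed lists the jobs whose
    predicted finish time exceeds their deadline.
    """
    # Min-heap of (time the node becomes free, node id).
    free_at = [(0.0, node) for node in range(num_nodes)]
    heapq.heapify(free_at)

    assignments, missed = [], []
    for job in sorted(jobs, key=lambda j: j.deadline):
        start, node = heapq.heappop(free_at)  # earliest available node
        finish = start + job.predicted_runtime
        assignments.append((job.name, node, start, finish))
        if finish > job.deadline:
            missed.append(job.name)
        heapq.heappush(free_at, (finish, node))
    return assignments, missed


def utilization(assignments, num_nodes):
    """Fraction of node-hours kept busy up to the last predicted finish."""
    if not assignments:
        return 0.0
    makespan = max(finish for _, _, _, finish in assignments)
    if makespan == 0:
        return 0.0
    busy = sum(finish - start for _, _, start, finish in assignments)
    return busy / (makespan * num_nodes)


if __name__ == "__main__":
    # Illustrative values only; not data from the study.
    demo = [Job("a", 2.0, 4.0), Job("b", 1.0, 2.0), Job("c", 3.0, 5.0)]
    plan, late = schedule_edf(demo, num_nodes=2)
    print(plan, late, utilization(plan, 2))
```

A provider could run such a check for several candidate node counts and pick the smallest one that leaves the missed list empty. That is one simple way to trade deadlines against utilization, under the assumption that the runtime predictions are accurate.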