Data Mining for Laser-Induced Breakdown Spectroscopy Data

Friday, October 12, 2012: 8:00 PM
6C/6E (WSCC)
Tia Vance, PhD, Department of Mathematical Sciences, Delaware State University, Dover, DE
Dragoljub Pokrajac, PhD, Department of Computer and Information Sciences, Delaware State University, Dover, DE
Aleskandar Lazarevic, PhD, United Technologies Research Center, Hartford, CT
Natasa Reljin, Department of Mathematical Sciences, Delaware State University, Dover, DE
Noureddine Melikechi, PhD, College of Mathematics, Natural Sciences, and Technology, Delaware State University, Dover, DE
Aristides Marcano, PhD, Department of Physics and Pre-engineering, Delaware State University, Dover, DE
Yuri Markushin, PhD, Department of Physics and Pre-engineering, Delaware State University, Dover, DE
In recent years, the spectroscopy community has increasingly used automatic, computer-assisted techniques for the quantitative and qualitative evaluation of specimens based on spectroscopy data. Among the wide variety of modern spectroscopy techniques, laser-induced breakdown spectroscopy (LIBS) has emerged as a fast, versatile and powerful analytical technique with the ability to make remote measurements in field environments.

We perform multi-class classification of LIBS data from four proteins: Bovine Serum Albumin (the most abundant protein in blood plasma), Osteopontin, Leptin and Insulin-like Growth Factor II (potential biomarkers for ovarian cancer). Principal Component Analysis (PCA) is applied to the data as a feature extraction technique, yielding features that are easy to compute and that preserve discriminatory information useful to the classification algorithms. The proteins are classified using five techniques: k-nearest neighbors, classification and regression trees (CART), neural networks, support vector machines (SVMs), and adaptive local hyperplane (ALH). The aim is to show that this methodology can produce separable classes of complex proteins in a higher-dimensional feature space, which in turn allows automatic classification with high accuracy. Automatic classification of these complex proteins can lead to the identification of elemental fingerprints of biological and chemical components that are vital in the detection of certain diseases (e.g., ovarian cancer). Our approach demonstrates that highly accurate automatic classification of complex protein samples is possible on LIBS data, using PCA with a sufficiently large number of extracted features and appropriate machine learning classification techniques.
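
As a rough illustration of the PCA-then-classify workflow described above, the following minimal sketch projects spectra onto a few principal components and classifies them with k-nearest neighbors, one of the five techniques named. It is not the authors' pipeline: the spectra are synthetic (one Gaussian peak per class plus noise), and the function names, the number of components, the value of k, and the train/test split are all assumptions chosen for the example.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the training mean and the top principal axes (as rows)."""
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of Vt are the principal axes.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, mean, components):
    """Project spectra onto the retained principal axes."""
    return (X - mean) @ components.T

def knn_predict(train_X, train_y, test_X, k=3):
    """Label each test point by majority vote among its k nearest neighbors."""
    # Pairwise squared Euclidean distances, shape (n_test, n_train).
    d2 = ((test_X[:, None, :] - train_X[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = train_y[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

def make_synthetic_spectra(n_per_class=30, n_channels=200, n_classes=4, seed=0):
    """Hypothetical stand-in for LIBS spectra: NOT real protein data."""
    rng = np.random.default_rng(seed)
    channels = np.arange(n_channels)
    X, y = [], []
    for label in range(n_classes):
        # Each class gets one emission-like peak at a distinct channel.
        center = 30 + 40 * label
        template = np.exp(-0.5 * ((channels - center) / 5.0) ** 2)
        noise = 0.1 * rng.standard_normal((n_per_class, n_channels))
        X.append(template + noise)
        y.append(np.full(n_per_class, label))
    return np.vstack(X), np.concatenate(y)

X, y = make_synthetic_spectra()

# Shuffle, then hold out 40 of the 120 spectra for testing.
order = np.random.default_rng(1).permutation(len(y))
train, test = order[:80], order[80:]

# Fit PCA on the training spectra only, then project both sets.
mean, comps = pca_fit(X[train], n_components=5)
Z_train = pca_transform(X[train], mean, comps)
Z_test = pca_transform(X[test], mean, comps)

pred = knn_predict(Z_train, y[train], Z_test, k=3)
accuracy = float((pred == y[test]).mean())
print(f"held-out accuracy: {accuracy:.2f}")
```

Fitting PCA on the training spectra alone keeps the held-out estimate from being influenced by the test set. On real LIBS data the number of retained components would need to be tuned, in line with the abstract's remark that a sufficiently large number of extracted features is required.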