Seminar: Labeling Large Scale Social Media Data using Budget-driven One-class SVM classification

Hao Yuan
M.Sc. Candidate
Co-supervisors: Dr. Jian Tang & Dr. Minglun Gong

Department of Computer Science
Tuesday, June 30, 2015, 1:00 pm, Room EN 2022


Abstract

Social media classification problems have drawn increasing attention over the past few years. With the rapid development of the Internet and the growing popularity of computers, social media platforms now hold an astronomical amount of information. These data are generally large in scale and often corrupted by noise. Noise in the training set strongly affects the performance of supervised learning (classification) techniques. A budget-driven One-class SVM approach suitable for large-scale social media data classification is presented in my thesis.

Our approach is based on an existing online One-class SVM learning algorithm, referred to as the STOCS (Self-Tuning One-Class SVM) algorithm. To justify this choice, we first analyze the noise resilience of STOCS using synthetic data. Next, to handle the big data classification problem for social media data, we introduce several budget-driven features, which allow the algorithm to be trained within limited time and limited memory. Compared with two state-of-the-art approaches, Lib-Linear and kNN, our approach is shown to be competitive while requiring less memory and time.
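
For readers unfamiliar with the idea of a memory budget in online one-class learning, the following is a minimal sketch. It is not the STOCS algorithm and not the method of the thesis. It is a generic stochastic-gradient kernel one-class learner, and every name and parameter in it (BudgetedOneClass, budget, gamma, eta, nu) is a hypothetical choice made for illustration. The only budget feature shown is a hard cap on the number of stored points, enforced by dropping the oldest one.

```python
import math
import random
from collections import deque


class BudgetedOneClass:
    """Toy online one-class learner with a hard cap on stored points.

    Hypothetical illustration only: this is NOT the STOCS algorithm.
    """

    def __init__(self, budget=20, gamma=0.5, eta=0.1, nu=0.1):
        self.budget = budget    # maximum number of stored points (memory budget)
        self.gamma = gamma      # RBF kernel width parameter
        self.eta = eta          # learning rate
        self.nu = nu            # target fraction of points treated as outliers
        self.rho = 0.0          # decision threshold
        self.support = deque()  # (point, coefficient) pairs, oldest first

    def _kernel(self, a, b):
        sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
        return math.exp(-self.gamma * sq_dist)

    def score(self, x):
        # Weighted kernel similarity between x and the stored points.
        return sum(alpha * self._kernel(p, x) for p, alpha in self.support)

    def partial_fit(self, x):
        # Check the margin condition before changing any state.
        violates = self.score(x) < self.rho
        # Decay old coefficients so recent points count more.
        self.support = deque((p, (1 - self.eta) * a) for p, a in self.support)
        if violates:
            # Store the point and relax the threshold.
            self.support.append((tuple(x), self.eta))
            self.rho -= self.eta * (1 - self.nu)
            # Enforce the budget by discarding the oldest stored point.
            if len(self.support) > self.budget:
                self.support.popleft()
        else:
            # Point is already covered: tighten the threshold slightly.
            self.rho += self.eta * self.nu


# Synthetic stream: a Gaussian cluster at the origin with about 10% uniform noise.
rng = random.Random(0)
model = BudgetedOneClass(budget=20)
for _ in range(500):
    if rng.random() < 0.1:
        point = (rng.uniform(-6, 6), rng.uniform(-6, 6))
    else:
        point = (rng.gauss(0, 1), rng.gauss(0, 1))
    model.partial_fit(point)

# Compare the cluster center with a location far from anything in the stream.
print(len(model.support), model.score((0.0, 0.0)), model.score((20.0, 20.0)))
```

Memory use is bounded by the budget no matter how long the stream runs, and each update costs time proportional to the budget. Dropping the oldest point is only one possible removal rule; the abstract does not say which budget features the thesis uses.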

 
