Seminar: Extracting topics from books: distinguishing books between/within genres
Supervisor: Lourdes Peña-Castillo
Department of Computer Science
Thursday, November 29, 2018, 9:30 a.m., Room EN-2022
Text analysis is the computational analysis of unstructured documents to extract relevant information. Topic modeling is a text analysis technique for extracting latent themes, or topics, from such documents. Analysis of large texts, such as books, can benefit significantly from the extraction of broad themes. We use topic modeling to analyze classic books from different genres, including classic literature and philosophy, with the aim of distinguishing books both between and within genres. We implement topic modeling with Latent Dirichlet Allocation (LDA), an unsupervised probabilistic model. We assess how the model's performance depends on the number of topics it is asked to find.
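To make the idea of LDA with a chosen number of topics concrete, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling in plain Python. It is not the implementation used in this work: the function names `fit_lda` and `top_words`, the toy documents, and the hyperparameter values `alpha` and `beta` are all assumptions made for illustration. The `num_topics` argument plays the role of the number of topics to be found.

```python
import random

def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling (illustrative sketch)."""
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]               # tokens in doc d assigned to topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # times word v is assigned to topic k
    topic_total = [0] * num_topics                              # total tokens assigned to topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[word]] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                v = word_id[word]
                k = assignments[d][i]
                # Remove the current token from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic for this token (unnormalized).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                # Resample the topic and restore the counts.
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    return vocab, doc_topic, topic_word

def top_words(vocab, topic_word, n=3):
    """Return the n most frequently assigned words for each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in topic_word
    ]

# Hypothetical toy corpus: two "literature" and two "philosophy" snippets.
docs = [
    "whale ship sea captain whale sea".split(),
    "ship sea storm captain sailor".split(),
    "reason virtue truth knowledge reason".split(),
    "truth virtue knowledge soul reason".split(),
]
vocab, doc_topic, topic_word = fit_lda(docs, num_topics=2)
for k, words in enumerate(top_words(vocab, topic_word)):
    print(f"topic {k}: {words}")
```

Rerunning `fit_lda` with different values of `num_topics` and comparing the resulting topics is one simple way to explore how the choice of topic count affects what the model finds. For real books, an established library implementation would be a more practical choice than this sketch.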