Open Access. Powered by Scholars. Published by Universities.®
Databases and Information Systems Commons™
Open Access. Powered by Scholars. Published by Universities.®
- Publication
- Publication Type
Articles 1 - 5 of 5
Full-Text Articles in Databases and Information Systems
Word Embedding Driven Concept Detection In Philosophical Corpora, Dylan Hayton-Ruffner
Word Embedding Driven Concept Detection In Philosophical Corpora, Dylan Hayton-Ruffner
Honors Projects
During the course of research, scholars often explore large textual databases for segments of text relevant to their conceptual analyses. This study proposes, develops and evaluates two algorithms for automated concept detection in theoretical corpora: ACS and WMD retrieval. Both novel algorithms are compared to key word retrieval, using a test set from the Digital Ricoeur corpus tagged by scholarly experts. WMD retrieval outperforms key word search on the concept detection task. Thus, WMD retrieval is a promising tool for concept detection and information retrieval systems focused on theoretical corpora.
User Interface Design, Moritz Stefaner, Sebastien Ferre, Saverio Perugini, Jonathan Koren, Yi Zhang
User Interface Design, Moritz Stefaner, Sebastien Ferre, Saverio Perugini, Jonathan Koren, Yi Zhang
Saverio Perugini
As detailed in Chap. 1, system implementations for dynamic taxonomies and faceted search allow a wide range of query possibilities on the data. Only when these are made accessible by appropriate user interfaces, the resulting applications can support a variety of search, browsing and analysis tasks. User interface design in this area is confronted with specific challenges. This chapter presents an overview of both established and novel principles and solutions.
Automatically Discovering The Number Of Clusters In Web Page Datasets, Zhongmei Yao
Automatically Discovering The Number Of Clusters In Web Page Datasets, Zhongmei Yao
Zhongmei Yao
Clustering is well-suited for Web mining by automatically organizing Web pages into categories, each of which contains Web pages having similar contents. However, one problem in clustering is the lack of general methods to automatically determine the number of categories or clusters. For the Web domain in particular, currently there is no such method suitable for Web page clustering. In an attempt to address this problem, we discover a constant factor that characterizes the Web domain, based on which we propose a new method for automatically determining the number of clusters in Web page data sets. We discover that the …
User Interface Design, Moritz Stefaner, Sebastien Ferre, Saverio Perugini, Jonathan Koren, Yi Zhang
User Interface Design, Moritz Stefaner, Sebastien Ferre, Saverio Perugini, Jonathan Koren, Yi Zhang
Computer Science Faculty Publications
As detailed in Chap. 1, system implementations for dynamic taxonomies and faceted search allow a wide range of query possibilities on the data. Only when these are made accessible by appropriate user interfaces, the resulting applications can support a variety of search, browsing and analysis tasks. User interface design in this area is confronted with specific challenges. This chapter presents an overview of both established and novel principles and solutions.
Automatically Discovering The Number Of Clusters In Web Page Datasets, Zhongmei Yao
Automatically Discovering The Number Of Clusters In Web Page Datasets, Zhongmei Yao
Computer Science Faculty Publications
Clustering is well-suited for Web mining by automatically organizing Web pages into categories, each of which contains Web pages having similar contents. However, one problem in clustering is the lack of general methods to automatically determine the number of categories or clusters. For the Web domain in particular, currently there is no such method suitable for Web page clustering. In an attempt to address this problem, we discover a constant factor that characterizes the Web domain, based on which we propose a new method for automatically determining the number of clusters in Web page data sets. We discover that the …