Open Access. Powered by Scholars. Published by Universities.®

Social and Behavioral Sciences Commons

2014

Data mining

Articles 1 - 12 of 12

Full-Text Articles in Social and Behavioral Sciences

Twitter Location (Sometimes) Matters: Exploring The Relationship Between Georeferenced Tweet Content And Nearby Feature Classes, Stefan Hahmann, Ross S. Purves, Dirk Burghardt Dec 2014

Journal of Spatial Information Science

In this paper, we investigate whether microblogging texts (tweets) produced on mobile devices are related to the geographical locations where they were posted. For this purpose, we correlate tweet topics to areas. In doing so, classified points of interest from OpenStreetMap serve as validation points. We adopted the classification and geolocation of these points to correlate with tweet content by means of manual, supervised, and unsupervised machine learning approaches. Evaluation showed the manual classification approach to be of the highest quality, followed by the supervised method, and that the unsupervised classification was of low quality. We found that the degree to which …


Leveraging Bibliographic RDF Data For Keyword Prediction With Association Rule Mining (ARM), Nidhi Kushwaha, O P. Vyas Nov 2014

Copyright, Fair Use, Scholarly Communication, etc.

The Semantic Web (Web 3.0) has been proposed as an efficient way to access the increasingly large amounts of data on the internet. The Linked Open Data Cloud project is at present the major effort to implement the concepts of the Semantic Web, addressing the problems of inhomogeneity and large data volumes. RKBExplorer is one of many repositories implementing Open Data and contains considerable bibliographic information. This paper discusses bibliographic data, an important part of cloud data. Effective searching of bibliographic datasets can be a challenge as many of the papers residing in these databases do …


Usage Of E-Resources: Virtual Value Of Demographics, Sue Samson Oct 2014

Mansfield Library Faculty Publications

The focus of this study was to identify: 1) usage of library e-resources by faculty and staff affiliation and status to identify research and teaching needs; 2) usage of library e-resources by student major, status, gender, registered disability and registered veteran to establish best outreach practices and areas that need service improvement and collection development in support of student learning; and 3) the correlation between use of library e-resources and student attainment as defined by grade point average (GPA). Demographic data was collected for these users based on their university NetID logins. The findings in this study conclusively document that …


Using "Big Data" For Transportation Analysis: A Case Study Of The LA Metro Expo Line, Mohja L. Rhoads Oct 2014

PSU Transportation Seminars

Access to a comprehensive historical archive of real-time, multi-modal, multi-agency transportation system data has provided a unique opportunity to demonstrate how “big data” can be used for policy analysis, and to offer new insights for planning scholarship and practice. We illustrate with a case study of a new rail transit line. We use transit, freeway, and arterial data of high spatial and temporal resolution to examine the transportation system performance impacts of the Exposition (Expo) light rail line (Phase 1) in Los Angeles. Using a quasi-experimental research design, we explore whether the Expo Line has had a significant impact on transit …


Time-Series Data Mining In Transportation: A Case Study On Singapore Public Train Commuter Travel Patterns, Roy Ka Wei Lee, Tin Seong Kam Oct 2014

Research Collection School Of Computing and Information Systems

The adoption of smart card technologies and automated data collection systems (ADCS) in the transportation domain has given public transport planners opportunities to amass a huge and continuously increasing amount of time-series data about the behaviors and travel patterns of commuters. However, the explosive growth of temporal databases has far outpaced transport planners’ ability to interpret these data using conventional statistical techniques, creating an urgent need for new techniques to support the analyst in transforming the data into actionable information and knowledge. This research study thus explores and discusses the potential use of time-series data mining, a relatively new …


A Comparative Study: Utilizing Data Mining Techniques To Classify Traffic Congestion Status, Abbas Mirakhorli Aug 2014

UNLV Theses, Dissertations, Professional Papers, and Capstones

Performance measurement is the process of evaluating and quantifying a system. It provides information about how well a system is working and how well its predefined goals are met. To analyze the performance of a transportation system, traffic data such as speed, volume, occupancy, and travel time need to be collected. These data generate a valuable historical database that can be used to develop models to improve the quality of service of the transportation system. Performance measures in transportation studies can be categorized into the following main groups: Congestion, Mobility, Accessibility, Reliability, …


Automated Library Recommendation, Ferdian Thung, David Lo, Julia Lawall Jun 2014

David LO

Many third-party libraries are available to be downloaded and used. Using such libraries can reduce development time and make the developed software more reliable. However, developers are often unaware of suitable libraries for their projects and thus miss out on these benefits. To help developers better take advantage of the available libraries, we propose a new technique that automatically recommends libraries to developers. Our technique takes as input the set of libraries that an application currently uses, and recommends other libraries that are likely to be relevant. We follow a hybrid approach that combines association …


On Predicting User Affiliations Using Social Features In Online Social Networks, Minh Thap Nguyen Mar 2014

Dissertations and Theses Collection (Open Access)

User profiling, such as user affiliation prediction in online social networks, is a challenging task with many important applications in targeted marketing and personalized recommendation. The research task here is to predict user affiliation attributes that suggest user participation in different social groups.


Brief Of Digital Humanities And Law Scholars As Amici Curiae In Support Of Defendant-Appellees And Affirmance, (The Authors Guild, Inc., Et Al., V. Google, Inc., Et Al.), Matthew L. Jockers, Matthew Sag, Jason Schultz Jan 2014

Copyright, Fair Use, Scholarly Communication, etc.

Amici are over 150 professors and scholars who teach, write, and research in computer science, the digital humanities, linguistics or law, and two associations that represent Digital Humanities scholars generally. Amici have an interest in this case because of its potential impact on their ability to discover and understand, through automated means, the data in and relationships among textual works. Legal Scholar Amici also have an interest in the sound development of intellectual property law. Resolution of the legal issue of copying for non-expressive uses has far-reaching implications for the scope of copyright protection, a subject germane to Amici’s professional …


Hot Zone Identification: Analyzing Effects Of Data Sampling On Spam Clustering, Rasib Khan, Mainul Mizan, Ragib Hasan, Alan Sprague Jan 2014

Journal of Digital Forensics, Security and Law

Email is the most common and comparatively the most efficient means of exchanging information in today's world. However, given the widespread use of emails in all sectors, they have been the target of spammers since the beginning. Filtering spam emails has now led to critical actions such as forensic activities based on mining spam email. The data mine for spam emails at the University of Alabama at Birmingham is considered to be one of the most prominent resources for mining and identifying spam sources. It is a widely researched repository used by researchers from different global organizations. The usual process …


Guiding Data-Driven Transportation Decisions, Kristin A. Tufte, Basem Elazzabi, Nathan Hall, Morgan Harvey, Kath Knobe, David Maier, Veronika Margaret Megler Jan 2014

Computer Science Faculty Publications and Presentations

Urban transportation professionals are under increasing pressure to perform data-driven decision making and to provide data-driven performance metrics. This pressure comes from sources including the federal government and is driven, in part, by the increased volume and variety of transportation data available. This sudden increase of data is partially a result of improved technology for sensors and mobile devices as well as reduced device and storage costs. However, using this proliferation of data for decisions and performance metrics is proving to be difficult. In this paper, we describe a proposed structure for a system to support data-driven decision making. A …


Educational Data Mining: An Advance For Intelligent Systems In Education, Ryan Baker Dec 2013

Ryan S.J.d. Baker

Computer-based technologies have transformed the way we live, work, socialize, play, and learn. Today, the use of data collected through these technologies is supporting a second round of transformation in all of these areas. Over the last decades, the methods of data mining and analytics have transformed field after field. Scientific fields such as physics, biology, and climate science have leveraged these methods to manage and make discoveries in previously unimaginably large datasets. The first journal devoted to data mining and analytics methods in biology, Computers in Biology and Medicine, began publication as long ago as the 1970s. In the mid-1990s …