Open Access. Powered by Scholars. Published by Universities.®

Science and Technology Studies Commons

Full-Text Articles in Science and Technology Studies

Gender-Based Violence In 140 Characters Or Fewer: A #Bigdata Case Study Of Twitter, Hemant Purohit, Tanvi Banerjee, Andrew Hampton, Valerie L. Shalin, Nayanesh Bhandutia, Amit P. Sheth Jul 2015

Amit P. Sheth

Public institutions are increasingly reliant on data from social media sites to measure public attitude and provide timely public engagement. Such reliance includes the exploration of public views on important social issues such as gender-based violence (GBV). In this study, we examine big (social) data consisting of nearly fourteen million tweets collected from Twitter over a period of ten months to analyze public opinion regarding GBV, highlighting the nature of tweeting practices by geographical location and gender. We demonstrate the utility of Computational Social Science to mine insight from the corpus while accounting for the influence of both transient events ...


Entity Recommendations Using Hierarchical Knowledge Bases, Siva Kumar Cheekula, Pavan Kapanipathi, Derek Doran, Prateek Jain, Amit P. Sheth Jul 2015

Amit P. Sheth

Recent developments in recommendation algorithms have focused on integrating Linked Open Data to augment traditional algorithms with background knowledge. These developments recognize that the integration of Linked Open Data may offer better performance, particularly in cold start cases. In this paper, we explore if and how a specific type of Linked Open Data, namely hierarchical knowledge, may be utilized for recommendation systems. We propose a content-based recommendation approach that adapts a spreading activation algorithm over the DBpedia category structure to identify entities of interest to the user. Evaluation of the algorithm over the Movielens dataset demonstrates that our method yields ...


"Time For Dabs": Analyzing Twitter Data On Butane Hash Oil Use, Raminta Daniulaityte, Robert G. Carlson, Farahnaz Golroo, Sanjaya Wijeratne, Edward W. Boyer, Silvia S. Martins, Ramzi W. Nahhas, Amit P. Sheth Jul 2015

Amit P. Sheth

No abstract provided.


Analyzing The Social Media Footprint Of Street Gangs, Sanjaya Wijeratne, Derek Doran, Amit P. Sheth, Jack Dustin Jul 2015

Amit P. Sheth

Gangs utilize social media as a way to maintain threatening virtual presences, to communicate about their activities, and to intimidate others. Such usage has gained the attention of many justice service agencies that wish to create better crime prevention and judicial services. However, these agencies use analysis methods that are labor intensive and only lead to basic, qualitative data interpretations. This paper presents the architecture of a modern platform to discover the structure, function, and operation of gangs through the lens of social media. Preliminary analysis of social media posts shared in the greater Chicago, IL region demonstrates the platform ...


Smart Data - How You And I Will Exploit Big Data For Personalized Digital Health And Many Other Activities, Amit P. Sheth Jul 2015

Amit P. Sheth

No abstract provided.


Big Data And Smart Cities, Amit P. Sheth Jul 2015

Amit P. Sheth

No abstract provided.


Knowledge Enabled Approach To Predict The Location Of Twitter Users, Revathy Krishnamurthy, Pavan Kapanipathi, Amit P. Sheth, Krishnaprasad Thirunarayan Jul 2015

Amit P. Sheth

Knowledge bases have been used to improve performance in applications ranging from web search and event detection to entity recognition and disambiguation. More recently, knowledge bases have been used to analyze social data. A key challenge in social data analysis has been the identification of the geographic location of online users in a social network such as Twitter. Existing approaches to predict the location of users, based on their tweets, rely solely on social media features or probabilistic language models. These approaches are supervised and require large training datasets of geo-tagged tweets to build their models. As most Twitter users ...


Knowledge-Driven Personalized Contextual Mhealth Service For Asthma Management In Children, Pramod Anantharam, Tanvi Banerjee, Amit P. Sheth, Krishnaprasad Thirunarayan, Surendra Marupudi, Vaikunth Sridharan Jul 2015

Amit P. Sheth

Wide adoption of smartphones and availability of low-cost sensors have resulted in seamless and continuous monitoring of physiology, environment, and public health notifications. However, personalized digital health and patient empowerment can become a reality only if the complex multisensory and multimodal data is processed within the patient context. Contextual processing of patient data along with personalized medical knowledge can lead to actionable information for better and timely decisions. We present a system called kHealth capable of aggregating multisensory and multimodal data from sensors (passive sensing) and answers to questionnaires (active sensing) from patients with asthma. We present our preliminary data ...


Semantic Gateway As A Service Architecture For Iot Interoperability, Pratikkumar Desai, Amit P. Sheth, Pramod Anantharam Jul 2015

Amit P. Sheth

The Internet of Things (IoT) is set to occupy a substantial component of the future Internet. The IoT connects sensors and devices that record physical observations to applications and services of the Internet. As a successor to technologies such as RFID and Wireless Sensor Networks (WSN), the IoT has stumbled into vertical silos of proprietary systems, providing little or no interoperability with similar systems. As the IoT represents the future state of the Internet, an intelligent and scalable architecture is required to provide connectivity between these silos, enabling discovery of physical sensors and interpretation of messages between things. This paper proposes a ...


Triad-Based Role Discovery For Large Social Systems, Derek Doran Feb 2015

Derek Doran

The social role of a participant in a social system conceptualizes the circumstances under which she chooses to interact with others, making the discovery and analysis of such roles important for theoretical and practical purposes. In this paper, we propose a methodology to detect such roles by utilizing the conditional triad censuses of ego-networks. These censuses are a promising tool for social role extraction because they capture the degree to which basic social forces push upon a user to interact with others in a system. Clusters of triad censuses, inferred from network samples that preserve local structural properties, define the social roles. The ...


Data Analytics For Power Utility Storm Planning, Lan Lin, Aldo Dagnino, Derek Doran, Swapna S. Gokhale Feb 2015

Derek Doran

As the world population grows, recent climatic changes seem to bring powerful storms to populated areas. The impact of these storms on utility services is devastating. Hurricane Sandy is a recent example of the enormous damages that storms can inflict on infrastructure, society, and the economy. Quick response to these emergencies represents a big challenge to electric power utilities. Traditionally utilities develop preparedness plans for storm emergency situations based on the experience of utility experts and with limited use of historical data. With the advent of the Smart Grid, utilities are incorporating automation and sensing technologies in their grids and ...


Protecting Web Servers From Web Robot Traffic, Derek Doran Feb 2015

Derek Doran

No abstract provided.


Some Trust Issues In Social Networks And Sensor Networks, Krishnaprasad Thirunarayan, Pramod Anantharam, Cory Andrew Henson, Amit P. Sheth Dec 2014

Amit P. Sheth

Trust and reputation are becoming increasingly important in diverse areas such as search, e-commerce, social media, semantic sensor networks, etc. We review past work and explore future research issues relevant to trust in social/sensor networks and interactions. We advocate a balanced, iterative approach to trust that marries both theory and practice. On the theoretical side, we investigate models of trust to analyze and specify the nature of trust and trust computation. On the practical side, we propose to uncover aspects that provide a basis for trust formation and techniques to extract trust information from concrete social/sensor networks and ...


Iexplore: Interactive Browsing And Exploring Biomedical Knowledge, Vinh Nguyen, Olivier Bodenreider, Jagannathan Srinivasan, Todd Minning, Thomas Rindflesch, Bastien Rance, Ramakanth Kavuluru, Himi Yalamanchili, Krishnaprasad Thirunarayan, Satya S. Sahoo, Amit P. Sheth Dec 2014

Amit P. Sheth

We present iExplore, a Semantic Web based application that helps biomedical researchers study and explore biomedical knowledge interactively. iExplore uses the Biomedical Knowledge Repository (BKR), which integrates knowledge from various sources ranging from information extracted from biomedical literature (from PubMed) to many structured vocabularies in the Unified Medical Language System (UMLS). The current version of BKR provides a unified provenance representation for 12 million semantic predications (triples with a predicate connecting a subject and an object) derived from 87 vocabulary families in the UMLS and 14 million predications extracted from 21 million PubMed abstracts. To engage the domain experts in ...


Semantics And Services Enabled Problem Solving Environment For Trypanosoma Cruzi, Amit P. Sheth, Rick L. Tarleton, Mark Musen, Satya S. Sahoo, Prashant Doshi, Natasha Noy Dec 2014

Amit P. Sheth

No abstract provided.


Faces: Diversity-Aware Entity Summarization Using Incremental Hierarchical Conceptual Clustering, Kalpa Gunaratna, Krishnaprasad Thirunarayan, Amit P. Sheth Dec 2014

Amit P. Sheth

Semantic Web documents that encode facts about entities on the Web have been growing rapidly in size and evolving over time. Creating summaries on lengthy Semantic Web documents for quick identification of the corresponding entity has been of great contemporary interest. In this paper, we explore automatic summarization techniques that characterize and enable identification of an entity and create summaries that are human friendly. Specifically, we highlight the importance of diversified (faceted) summaries by combining three dimensions: diversity, uniqueness, and popularity. Our novel diversity-aware entity summarization approach mimics human conceptual clustering techniques to group facts, and picks representative facts from ...


Extracting City Traffic Events From Social Streams, Pramod Anantharam, Payam Barnaghi, Krishnaprasad Thirunarayan, Amit P. Sheth Dec 2014

Amit P. Sheth

Cities are composed of complex systems with physical, cyber, and social components. Current works on extracting and understanding city events mainly rely on technology enabled infrastructure to observe and record events. In this work, we propose an approach to leverage citizen observations of various city systems and services such as traffic, public transport, water supply, weather, sewage, and public safety as a source of city events. We investigate the feasibility of using such textual streams for extracting city events from annotated text. We formalize the problem of annotating social streams such as microblogs as a sequence labeling problem. We present ...


Semantic (Web) Technology In Action: Ontology Driven Information Systems For Search, Integration, And Analysis, Amit P. Sheth, Cartic Ramakrishnan Dec 2014

Amit P. Sheth

Semantics is seen as the key ingredient in the next phase of the Web infrastructure as well as the next generation of information systems applications. In this context, we review some of the reservations expressed about the viability of the Semantic Web. We respond to these by identifying a Semantic Technology that supports the key capabilities also needed to realize the Semantic Web vision, namely representing, acquiring and utilizing knowledge. Given that scalability is a key challenge, we briefly review our observations from developing three classes of real world applications and corresponding technology components: search/browsing, integration, and analytics. We ...


Automatic Domain Model Creation Using Pattern-Based Fact Extraction, Christopher Thomas, Pankaj Mehra, Wenbo Wang, Amit P. Sheth, Gerhard Weikum, Victor Chan Dec 2014

Amit P. Sheth

This paper describes a minimally guided approach to automatic domain model creation. The first step is to carve an area of interest out of the Wikipedia hierarchy based on a simple query or other starting point. The second step is to connect the concepts in this domain hierarchy with named relationships. A starting point is provided by Linked Open Data, such as DBpedia. Based on these community-generated facts we train a pattern-based fact-extraction algorithm to augment a domain hierarchy with previously unknown relationship occurrences. Pattern vectors are learned that represent occurrences of relationships between concepts. The process described can be ...


Location Prediction Of Twitter Users Using Wikipedia, Revathy Krishnamurthy, Pavan Kapanipathi, Amit P. Sheth, Krishnaprasad Thirunarayan Dec 2014

Amit P. Sheth

The mining of user generated content in social media has proven very effective in domains ranging from personalization and recommendation systems to crisis management. The knowledge of online users' locations makes their tweets more informative and adds another dimension to their analysis. Existing approaches to predict the location of Twitter users are purely data-driven and require large training data sets of geo-tagged tweets. The collection and modelling process of tweets can be time intensive. To overcome this drawback, we propose a novel knowledge based approach that does not require any training data. Our approach uses information in Wikipedia, about cities ...


Ontology Supported Knowledge Discovery In The Field Of Human Performance And Cognition, Christopher Thomas, Pablo N. Mendes, Delroy H. Cameron, Amit P. Sheth, Krishnaprasad Thirunarayan, Cartic Ramakrishnan Dec 2014

Amit P. Sheth

No abstract provided.


Semantics-Empowered Big Data Processing With Applications, Krishnaprasad Thirunarayan, Amit P. Sheth Dec 2014

Amit P. Sheth

We discuss the nature of Big Data and address the role of semantics in analyzing and processing Big Data that arises in the context of Physical-Cyber-Social Systems. We organize our research around the Five Vs of Big Data, where four of the Vs are harnessed to produce the fifth V - value. To handle the challenge of Volume, we advocate semantic perception that can convert low-level observational data to higher-level abstractions more suitable for decision-making. To handle the challenge of Variety, we resort to the use of semantic models and annotations of data so that much of the intelligent processing can ...


Automated Isolation Of Translational Efficiency Bias That Resists The Confounding Effect Of Gc(At)-Content, Douglas W. Raiford, Dan E. Krane, Travis E. Doom, Michael L. Raymer Oct 2014

Michael L. Raymer

Genomic sequencing projects are an abundant source of information for biological studies ranging from the molecular to the ecological in scale; however, much of the information present may yet be hidden from casual analysis. One such information domain, trends in codon usage, can provide a wealth of information about an organism's genes and their expression. Degeneracy in the genetic code allows more than one triplet codon to code for the same amino acid, and usage of these codons is often biased such that one or more of these synonymous codons is preferred. Detection of this bias is an important ...


Towards Attack-Resilient Geometric Data Perturbation, Keke Chen, Ling Liu Oct 2014

Keke Chen

Data perturbation is a popular technique for privacy-preserving data mining. The major challenge of data perturbation is balancing privacy protection and data quality, which are normally considered a pair of contradictory factors. We propose that selectively preserving only the task/model specific information in perturbation would improve the balance. Geometric data perturbation, consisting of random rotation perturbation, random translation perturbation, and noise addition, aims at preserving the important geometric properties of a multidimensional dataset, while providing a better privacy guarantee for data classification modeling. The preliminary study has shown that random geometric perturbation can well preserve model accuracy for several ...


Cloudvista: Visual Cluster Exploration For Extreme Scale Data In The Cloud, Keke Chen, Huiqi Xi, Fengguang Tian, Shumin Guo Oct 2014

Keke Chen

The problem of efficient and high-quality clustering of extreme scale datasets with complex clustering structures continues to be one of the most challenging data analysis problems. An innovative use of the data cloud would provide a unique opportunity to address this challenge. In this paper, we propose the CloudVista framework to address (1) the problems caused by using sampling in the existing approaches and (2) the problems with the latency caused by cloud-side processing on interactive cluster visualization. The CloudVista framework aims to explore the entire large data stored in the cloud with the help of the data structure visual frame and ...


Detecting The Change Of Clustering Structure In Categorical Data Streams, Keke Chen, Ling Liu Oct 2014

Keke Chen

Analyzing clustering structures in data streams can provide critical information for making decisions in real time. In this paper, we present a framework for detecting the change of critical clustering structure in categorical data streams. The framework consists of the Hierarchical Entropy Tree structure (HE-Tree) and the extended ACE clustering algorithm. HE-Tree can efficiently capture the entropy property of the categorical data streams and allow us to draw precise clustering information from the data stream for high-quality BkPlots with the extended ACE algorithm.


Privacy-Preserving Data Classification With Rotation Perturbation, Keke Chen, Ling Liu Oct 2014

Keke Chen

This paper presents a random rotation perturbation approach for privacy preserving data classification. Concretely, we identify the importance of classification-specific information with respect to the loss of information factor, and present a random rotation perturbation framework for privacy preserving data classification. Our approach has two unique characteristics. First, we identify that many classification models utilize the geometric properties of datasets, which can be preserved by geometric rotation. We prove that the three types of classifiers will deliver the same performance over the rotation perturbed dataset as over the original dataset. Second, we propose a multi-column privacy model to address the ...


A General Boosting Method And Its Application To Learning Ranking Functions For Web Search, Zhaohui Zheng, Hongyuan Zha, Tong Zhang, Olivier Chapelle, Keke Chen, Gordon Sun Oct 2014

Keke Chen

We present a general boosting method extending functional gradient boosting to optimize complex loss functions that are encountered in many machine learning problems. Our approach is based on optimization of quadratic upper bounds of the loss functions which allows us to present a rigorous convergence analysis of the algorithm. More importantly, this general framework enables us to use a standard regression base learner such as decision trees for fitting any loss function. We illustrate an application of the proposed method in learning ranking functions for Web search by combining both preference data and labeled data for training. We present experimental ...


Adapting Ranking Functions To User Preference, Keke Chen, Ya Zhang, Zhaohui Zheng, Hongyuan Zha, Gordon Sun Oct 2014

Keke Chen

Learning to rank has become a popular method for web search ranking. Traditionally, expert-judged examples are the major training resource for machine learned web ranking, and they are expensive to obtain in the quantity needed to train a satisfactory ranking function. The demands for generating specific web search ranking functions tailored for different domains, such as ranking functions for different regions, have aggravated this problem. Recently, a few methods have been proposed to extract training examples from user clickthrough logs. Due to the low cost of getting user preference data, it is attractive to combine these examples in training ranking functions. However, because of the ...


Scale: A Scalable Framework For Efficiently Clustering Transactional Data, Hua Yan, Keke Chen, Ling Liu, Zhang Yi Oct 2014

Keke Chen

This paper presents SCALE, a fully automated transactional clustering framework. The SCALE design highlights three unique features. First, we introduce the concept of Weighted Coverage Density as a categorical similarity measure for efficient clustering of transactional datasets. The concept of weighted coverage density is intuitive and it allows the weight of each item in a cluster to be changed dynamically according to the occurrences of items. Second, we develop the weighted coverage density measure based clustering algorithm, a fast, memory-efficient, and scalable clustering algorithm for analyzing transactional data. Third, we introduce two clustering validation metrics and show that these domain ...