Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Computer Sciences

2015

Articles 1 - 30 of 165

Full-Text Articles in Entire DC Network

Invariant Spatial Information In Sketch Maps — A Study Of Survey Sketch Maps Of Urban Areas, Jia Wang, Angela Schwering Dec 2015

Journal of Spatial Information Science

It is commonly recognized that free-hand sketch maps are influenced by cognitive impacts and therefore sketch maps are incomplete, distorted, and schematized. This makes it difficult to achieve a one-to-one alignment between a sketch map and its corresponding geo-referenced metric map. Nevertheless, sketch maps are still useful to communicate spatial knowledge, indicating that sketch maps contain certain spatial information that is robust to cognitive impacts. In existing studies, sketch maps are used frequently to measure cognitive maps. However, little work has been done on invariant spatial information in sketch maps, which is the information of spatial configurations representing correctly the …


Development And Evaluation Of A Geographic Information Retrieval System Using Fine Grained Toponyms, Damien Palacio, Curdin Derungs, Ross S. Purves Dec 2015

Journal of Spatial Information Science

Geographic information retrieval (GIR) is concerned with returning information in response to an information need, typically expressed in terms of a thematic and spatial component linked by a spatial relationship. However, evaluation initiatives have often failed to show significant differences between simple text baselines and more complex spatially enabled GIR approaches. We explore the effectiveness of three systems (a text baseline, spatial query expansion, and a full GIR system utilizing both text and spatial indexes) at retrieving documents from a corpus describing mountaineering expeditions, centred around fine grained toponyms. To allow evaluation, we use user generated content (UGC) in the …


Describing Images Using A Multilayer Framework Based On Qualitative Spatial Models, Tao Wang, Hui Shi Dec 2015

Baltic International Yearbook of Cognition, Logic and Communication

To date, most research in image processing has been based on quantitative representations of image features using pixel values; however, humans often use abstract and semantic knowledge to describe and analyze images. To enhance cognitive adequacy and tractability, we here present a multilayer framework based on qualitative spatial models. The layout features of segmented images are defined by qualitative spatial models which we introduce, and represented as a set of qualitative spatial constraints. Assigned different semantic and context knowledge, the image segments and the qualitative spatial constraints are interpreted from different perspectives. Finally, the knowledge layer of the framework enables …


Social Learning Systems: The Design Of Evolutionary, Highly Scalable, Socially Curated Knowledge Systems, Nolan Hemmatazad Dec 2015

Student Work

In recent times, great strides have been made towards the advancement of automated reasoning and knowledge management applications, along with their associated methodologies. The introduction of the World Wide Web piqued academicians’ interest in harnessing the power of linked, online documents for the purpose of developing machine learning corpora, providing dynamical knowledge bases for question answering systems, fueling automated entity extraction applications, and performing graph analytic evaluations, such as uncovering the inherent structural semantics of linked pages. Even more recently, substantial attention in the wider computer science and information systems disciplines has been focused on the evolving study of social …


Intent Classification Of Short-Text On Social Media, Hemant Purohit, Guozhu Dong, Valerie L. Shalin, Krishnaprasad Thirunarayan, Amit P. Sheth Dec 2015

Kno.e.sis Publications

Social media platforms facilitate the emergence of citizen communities that discuss real-world events. Their content reflects a variety of intent ranging from social good (e.g., volunteering to help) to commercial interest (e.g., criticizing product features). Hence, mining intent from social data can aid in filtering social media to support organizations, such as an emergency management unit for resource planning. However, effective intent mining is inherently challenging due to ambiguity in interpretation, and sparsity of relevant behaviors in social data. In this paper, we address the problem of multiclass classification of intent with a use-case of social data generated during crisis …


A Machine-Aided Approach To Generating Grammar Rules From Japanese Source Text For Use In Hybrid And Rule-Based Machine Translation Systems, Sean Michael Jones Dec 2015

Theses and Dissertations

Many automatic machine translation systems available today use a hybrid of pure statistical translation and rule-based grammatical translations. This is largely due to the shortcomings of each individual approach, requiring a large amount of time for linguistics experts to hand-code grammar rules for a rule-based system and requiring large amounts of source text to generate accurate statistical models. By automating a portion of the rule generation process, the creation of grammar rules could be made to be faster, more efficient and less costly. By doing statistical analysis on a bilingual corpus, common grammar rules can be inferred and exported to …


Group Decision Making Using Comparative Linguistic Expression Based On Hesitant Intuitionistic Fuzzy Sets, Ismat Beg, Tabasam Rashid Dec 2015

Applications and Applied Mathematics: An International Journal (AAM)

We introduce a method for aggregation of experts’ opinions given in the form of comparative linguistic expression. An algorithmic form of technique for order preference is proposed for group decision making. A simple example is given by using this method for the selection of the best alternative as well as ranking the alternatives from the best to the worst.


Capstone Projects Mining System For Insights And Recommendations, Melvrivk Aik Chun Goh, Swapna Gottipati, Venky Shankararaman Dec 2015

Research Collection School Of Computing and Information Systems

In this paper, we present a classification-based system to discover knowledge and trends in higher education students’ projects. Essentially, the educational capstone projects provide an opportunity for students to apply what they have learned and prepare themselves for industry needs. Therefore, mining such projects gives insights into students’ experiences as well as industry project requirements and trends. In particular, we mine capstone projects executed by Information Systems students to discover patterns and insights related to people, organization, domain, industry needs and time. We build a capstone projects mining system (CPMS) based on classification models that leverage text mining, natural …


Multiple Instance Fuzzy Inference., Amine Ben Khalifa Dec 2015

Electronic Theses and Dissertations

A novel fuzzy learning framework that employs fuzzy inference to solve the problem of multiple instance learning (MIL) is presented. The framework introduces a new class of fuzzy inference systems called Multiple Instance Fuzzy Inference Systems (MI-FIS). Fuzzy inference is a powerful modeling framework that can handle computing with knowledge uncertainty and measurement imprecision effectively. Fuzzy Inference performs a non-linear mapping from an input space to an output space by deriving conclusions from a set of fuzzy if-then rules and known facts. Rules can be identified from expert knowledge, or learned from data. In multiple instance problems, the training data …


Modeling Social Media Content With Word Vectors For Recommendation, Ying Ding, Jing Jiang Dec 2015

Research Collection School Of Computing and Information Systems

In social media, recommender systems are becoming more and more important. Different techniques have been designed for recommendations under various scenarios, but many of them do not use user-generated content, which potentially reflects users’ opinions and interests. Although a few studies have tried to combine user-generated content with rating or adoption data, they mostly rely on lexical similarity to calculate textual similarity. However, in social media, a diverse range of words is used. This renders the traditional ways of calculating textual similarity ineffective. In this work, we apply vector representation of words to measure the semantic similarity between texts. We …


Facilitating Corpus Annotation By Improving Annotation Aggregation, Paul L. Felt Dec 2015

Theses and Dissertations

Annotated text corpora facilitate the linguistic investigation of language as well as the automation of natural language processing (NLP) tasks. NLP tasks include problems such as spam email detection, grammatical analysis, and identifying mentions of people, places, and events in text. However, constructing high quality annotated corpora can be expensive. Cost can be reduced by employing low-cost internet workers in a practice known as crowdsourcing, but the resulting annotations are often inaccurate, decreasing the usefulness of a corpus. This inaccuracy is typically mitigated by collecting multiple redundant judgments and aggregating them (e.g., via majority vote) to produce high quality consensus …
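
The majority-vote aggregation mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's method, only the simple baseline the abstract names; the function names, the tie-breaking rule (first label seen wins), and the sample annotations are assumptions made for illustration.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    if not labels:
        raise ValueError("need at least one label")
    counts = Counter(labels)
    best = max(counts.values())
    # Scan in the original order so that ties are broken deterministically.
    for label in labels:
        if counts[label] == best:
            return label


def aggregate(annotations):
    """Map each item to the consensus of its redundant judgments."""
    return {item: majority_vote(labels) for item, labels in annotations.items()}


# Hypothetical crowdsourced judgments: three workers per item.
annotations = {
    "email-1": ["spam", "spam", "ham"],
    "email-2": ["ham", "ham", "ham"],
    "email-3": ["ham", "spam", "spam"],
}
consensus = aggregate(annotations)
```

Majority vote treats every annotator as equally reliable, which is one reason more elaborate aggregation models are studied.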


Disjunctive Answer Set Solvers Via Templates, Remi Brochenin, Yuliya Lierler, Marco Maratea Nov 2015

Yuliya Lierler

Answer set programming is a declarative programming paradigm oriented towards difficult combinatorial search problems. A fundamental task in answer set programming is to compute stable models, i.e., solutions of logic programs. Answer set solvers are the programs that perform this task. The problem of deciding whether a disjunctive program has a stable model is $\Sigma_2^P$-complete. The high complexity of reasoning within disjunctive logic programming is responsible for few solvers capable of dealing with such programs, namely dlv, gnt, cmodels, clasp and wasp. In this paper, we show that transition systems introduced by Nieuwenhuis, Oliveras, and Tinelli to model and analyze …


Data Verifications For Online Social Networks, Mahmudur Rahman Nov 2015

FIU Electronic Theses and Dissertations

Social networks are popular platforms that simplify user interaction and encourage collaboration. They collect large amounts of media from their users, often reported from mobile devices. However, the value and impact of social media make it an attractive attack target. In this thesis, we focus on the following social media vulnerabilities. First, review-centered social networks such as Yelp and Google Play have been shown to be the targets of significant search rank and malware proliferation attacks. Detecting fraudulent behaviors is thus paramount not only to prevent public opinion bias but also to curb the distribution of malware. Second, the …


Exploiting Social Media Sources For Search, Fusion And Evaluation, Chia-Jung Lee Nov 2015

Doctoral Dissertations

The web contains heterogeneous information that is generated with different characteristics and is presented via different media. Social media, as one of the largest content carriers, has generated information from millions of users worldwide, creating material rapidly in all types of forms such as comments, images, tags, videos and ratings, etc. In social applications, the formation of online communities contributes to conversations of substantially broader aspects, as well as unfiltered opinions about subjects that are rarely covered in public media. Information accrued on social platforms, therefore, presents a unique opportunity to augment web sources such as Wikipedia or news pages, …


Interactive Machine Assistance: A Case Study In Linking Corpora And Dictionaries, Kevin P. Black Nov 2015

Theses and Dissertations

Machine learning can provide assistance to humans in making decisions, including linguistic decisions such as determining the part of speech of a word. Supervised machine learning methods derive patterns indicative of possible labels (decisions) from annotated example data. For many problems, including most language analysis problems, acquiring annotated data requires human annotators who are trained to understand the problem and to disambiguate among multiple possible labels. Hence, the availability of experts can limit the scope and quantity of annotated data. Machine-learned pre-annotation assistance, which suggests probable labels for unannotated items, can enable expert annotators to work more quickly and thus …


Data Selection Using Topic Adaptation For Statistical Machine Translation, Hitokazu Matsushita Nov 2015

Theses and Dissertations

Statistical machine translation (SMT) requires large quantities of bitexts (i.e., bilingual parallel corpora) as training data to yield good quality translations. While obtaining a large amount of training data is critical, the similarity between training and test data also has a significant impact on SMT performance. Many SMT studies define data similarity in terms of domain-overlap, and domains are defined to be synonymous with data sources. Consequently, the SMT community has focused on domain adaptation techniques that augment small (in-domain) datasets with large datasets from other sources (hence, out-of-domain, per the definition). However, many training datasets consist of topically diverse …


Deep Multimodal Learning For Affective Analysis And Retrieval, Lei Pang, Shiai Zhu, Chong-Wah Ngo Nov 2015

Research Collection School Of Computing and Information Systems

Social media has been a convenient platform for voicing opinions through posting messages, ranging from tweeting a short text to uploading a media file, or any combination of messages. Understanding the perceived emotions inherently underlying this user-generated content (UGC) could shed light on emerging applications such as advertising and media analytics. Existing research efforts on affective computation are mostly dedicated to single media, either text captions or visual content. Few attempts have been made at combined analysis of multiple media, even though emotion can be viewed as an expression of multimodal experience. In this paper, we explore the learning of highly …


Infinite-Noise Criticality: Nonequilibrium Phase Transitions In Fluctuating Environments, Thomas Vojta, José A. Hoyos Nov 2015

Physics Faculty Research & Creative Works

We study the effects of time-varying environmental noise on nonequilibrium phase transitions in spreading and growth processes. Using the examples of the logistic evolution equation as well as the contact process, we show that such temporal disorder gives rise to a distinct type of critical points at which the effective noise amplitude diverges on long time scales. This leads to enormous density fluctuations characterized by an infinitely broad probability distribution at criticality. We develop a real-time renormalization-group theory that provides a general framework for the effects of temporal disorder on nonequilibrium processes. We also discuss how general this exotic critical …


VIREO-TNO @ TRECVID 2015: Multimedia Event Detection, Hao Zhang, Yi-Jie Lu, Maaike De Boer, Frank Ter Haar, Zhaofan Qiu, Klamer Schutte, Wessel Kraaij, Chong-Wah Ngo Nov 2015

Research Collection School Of Computing and Information Systems

This paper presents an overview and comparative analysis of our systems designed for the TRECVID 2015 [1] multimedia event detection (MED) task. We submitted 17 runs: 5 each for the zero-example, 10-example and 100-example subtasks of Pre-Specified (PS) event detection, and 2 for the 10-example subtask of Ad-Hoc (AH) event detection. We did not participate in the Interactive Run. This year we focus on three different parts of the MED task: 1) extending the size of our concept bank and combining it with improved dense trajectories; 2) exploring strategies for semantic query generation (SQG); and …


Can You Summarize This? Identifying Correlates Of Input Difficulty For Generic Multi-Document Summarization, Ani Nenkova, Annie Louis Oct 2015

Ani Nenkova

Different summarization requirements could make the writing of a good summary more difficult, or easier. Summary length and the characteristics of the input are such constraints influencing the quality of a potential summary. In this paper we report the results of a quantitative analysis on data from large-scale evaluations of multi-document summarization, empirically confirming this hypothesis. We further show that features measuring the cohesiveness of the input are highly correlated with eventual summary quality and that it is possible to use these as features to predict the difficulty of new, unseen, summarization inputs.


Automatic Sense Prediction For Implicit Discourse Relations In Text, Emily Pitler, Annie Louis, Ani Nenkova Oct 2015

Ani Nenkova

We present a series of experiments on automatically identifying the sense of implicit discourse relations, i.e. relations that are not marked with a discourse connective such as “but” or “because”. We work with a corpus of implicit relations present in newspaper text and report results on a test set that is representative of the naturally occurring distribution of senses. We use several linguistically informed features, including polarity tags, Levin verb classes, length of verb phrases, modality, context, and lexical features. In addition, we revisit past approaches using lexical pairs from unannotated text as features, explain some of their shortcomings and …


Automatic Evaluation Of Linguistic Quality In Multi-Document Summarization, Emily Pitler, Annie Louis, Ani Nenkova Oct 2015

Ani Nenkova

To date, few attempts have been made to develop and validate methods for automatic evaluation of linguistic quality in text summarization. We present the first systematic assessment of several diverse classes of metrics designed to capture various aspects of well-written text. We train and test linguistic quality models on consecutive years of NIST evaluation data in order to show the generality of results. For grammaticality, the best results come from a set of syntactic features. Focus, coherence and referential clarity are best evaluated by a class of features measuring local coherence on the basis of cosine similarity between sentences, coreference …
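
As a rough illustration of the cosine similarity between sentences that this abstract mentions, the sketch below scores each pair of adjacent sentences using plain bag-of-words counts. The whitespace tokenization, the function names, and the sample sentences are assumptions made for illustration; the paper's actual feature set is not reproduced here.

```python
import math
from collections import Counter


def bag_of_words(sentence):
    """Lowercase, whitespace-split token counts (a deliberately crude tokenizer)."""
    return Counter(sentence.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty sentence shares nothing with any other
    return dot / (norm_a * norm_b)


def adjacent_similarities(sentences):
    """Similarity of each sentence to the one that follows it."""
    bags = [bag_of_words(s) for s in sentences]
    return [cosine_similarity(x, y) for x, y in zip(bags, bags[1:])]


# Hypothetical three-sentence text: the first pair overlaps, the second does not.
sims = adjacent_similarities(["the cat sat", "the cat ran", "dogs bark loudly"])
```

Low similarity between neighbouring sentences is one possible signal of an abrupt topic shift, which is why such scores can serve as local coherence features.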


Structural Features For Predicting The Linguistic Quality Of Text: Applications To Machine Translation, Automatic Summarization And Human-Authored Text, Ani Nenkova, Jieun Chae, Annie Louis, Emily Pitler Oct 2015

Ani Nenkova

Sentence structure is considered to be an important component of the overall linguistic quality of text. Yet few empirical studies have sought to characterize how and to what extent structural features determine fluency and linguistic quality. We report the results of experiments on the predictive power of syntactic phrasing statistics and other structural features for these aspects of text. Manual assessments of sentence fluency for machine translation evaluation and text quality for summarization evaluation are used as the gold standard. We find that many structural features related to phrase length are weakly but significantly correlated with fluency and classifiers based on the …


Measuring Importance And Query Relevance In Topic-Focused Multi-Document Summarization, Surabhi Gupta, Ani Nenkova, Dan Jurafsky Oct 2015

Ani Nenkova

The increasing complexity of summarization systems makes it difficult to analyze exactly which modules make a difference in performance. We carried out a principled comparison between the two most commonly used schemes for assigning importance to words in the context of query focused multi-document summarization: raw frequency (word probability) and log-likelihood ratio. We demonstrate that the advantages of log-likelihood ratio come from its known distributional properties which allow for the identification of a set of words that in its entirety defines the aboutness of the input. We also find that LLR is more suitable for query-focused summarization since, unlike raw …


Using Entity Features To Classify Implicit Discourse Relations, Annie Louis, Aravind K. Joshi, Rashmi Prasad, Ani Nenkova Oct 2015

Ani Nenkova

We report results on predicting the sense of implicit discourse relations between adjacent sentences in text. Our investigation concentrates on the association between discourse relations and properties of the referring expressions that appear in the related sentences. The properties of interest include coreference information, grammatical role, information status and syntactic form of referring expressions. Predicting the sense of implicit discourse relations based on these features is considerably better than a random baseline and several of the most discriminative features conform with linguistic intuitions. However, these features do not perform as well as lexical features traditionally used for sense prediction.


Creating Local Coherence: An Empirical Assessment, Annie Louis, Ani Nenkova Oct 2015

Ani Nenkova

Two of the mechanisms for creating natural transitions between adjacent sentences in a text, resulting in local coherence, involve discourse relations and switches of focus of attention between discourse entities. These two aspects of local coherence have been traditionally discussed and studied separately. But some empirical studies have given strong evidence for the necessity of understanding how the two types of coherence-creating devices interact. Here we present a joint corpus study of discourse relations and entity coherence exhibited in news texts from the Wall Street Journal and test several hypotheses expressed in earlier work about their interaction.


Entity-Driven Rewrite For Multi-Document Summarization, Ani Nenkova Oct 2015

Ani Nenkova

In this paper we explore the benefits from and shortcomings of entity-driven noun phrase rewriting for multi-document summarization of news. The approach leads to 20% to 50% different content in the summary in comparison to an extractive summary produced using the same underlying approach, showing the promise the technique has to offer. In addition, summaries produced using entity-driven rewrite have higher linguistic quality than a comparison non-extractive system. Some improvement is also seen in content selection over extractive summarization as measured by pyramid method evaluation.


Formal Models Of The Extension Activity Of DNA Polymerase Enzymes, Srujan Kumar Enaganti Oct 2015

Electronic Thesis and Dissertation Repository

The study of formal language operations inspired by enzymatic actions on DNA is part of ongoing efforts to provide a formal framework and rigorous treatment of DNA-based information and DNA-based computation. Other studies along these lines include theoretical explorations of splicing systems, insertion-deletion systems, substitution, hairpin extension, hairpin reduction, superposition, overlapping concatenation, conditional concatenation, contextual intra- and intermolecular recombinations, as well as template-guided recombination.

First, a formal language operation is proposed and investigated, inspired by the naturally occurring phenomenon of DNA primer extension by a DNA-template-directed DNA polymerase enzyme. Given two DNA strings u and v, where the shorter …


A Computational Translation Of The Phaistos Disk, Peter Revesz Oct 2015

CSE Conference and Workshop Papers

For over a century the text of the Phaistos Disk remained an enigma without a convincing translation. This paper presents a novel semi-automatic translation method that uses for the first time a recently discovered connection between the Phaistos Disk symbols and other ancient scripts, including the Old Hungarian alphabet. The connection between the Phaistos Disk script and the Old Hungarian alphabet suggested the possibility that the Phaistos Disk language may be related to Proto-Finno-Ugric, Proto-Ugric, or Proto-Hungarian. Using words and suffixes from those languages, it is possible to translate the Phaistos Disk text as an ancient sun hymn, possibly connected …


Implicit Information Extraction From Clinical Notes, Sujan Perera Oct 2015

Kno.e.sis Publications

We address the problem of extracting implicit information from unstructured clinical notes. Here we introduce the problem of 'implicit entity recognition in clinical notes', propose a knowledge-driven approach to address this problem and demonstrate the results of our initial experiments.