Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 10 of 10

Full-Text Articles in Physical Sciences and Mathematics

Automatic Inference Of Causal Reasoning Chains From Student Essays, Simon Mark Hughes Oct 2019

College of Computing and Digital Media Dissertations

While there has been an increasing focus on higher-level thinking skills arising from the Common Core Standards, many high-school and middle-school students struggle to combine and integrate information from multiple sources when writing essays. Writing is an important learning skill, and there is increasing evidence that writing about a topic develops a deeper understanding in the student. However, grading essays is time consuming for teachers, resulting in an increasing focus on shallower forms of assessment that are easier to automate, such as multiple-choice tests. Existing essay grading software has attempted to ease this burden but relies on shallow lexico-syntactic features …


Use Of Text Data In Identifying And Prioritizing Potential Drug Repositioning Candidates, Majid Rastegar-Mojarad May 2019

Theses and Dissertations

New drug development costs between 500 million and 2 billion dollars and takes 10-15 years, with a success rate of less than 10%. Drug repurposing (defined as discovering new indications for existing drugs) could play a significant role in drug development, especially considering the declining success rates of developing novel drugs. In the period 2007-2009, drug repurposing led to the launching of 30-40% of new drugs. Typically, new indications for existing medications are identified by accident. However, new technologies and a large number of available resources enable the development of systematic approaches to identify and validate drug-repurposing candidates with significantly …


Commonsense Knowledge In Sentiment Analysis Of Ordinance Reactions For Smart Governance, Manish Puri May 2019

Theses, Dissertations and Culminating Projects

Smart Governance is an emerging research area that has attracted scientific as well as policy interest, and aims to improve collaboration between government and citizens, as well as other stakeholders. Our project aims to enable lawmakers to incorporate data-driven decision making in enacting ordinances. Our first objective is to create a mechanism for mapping ordinances (local laws) and tweets to Smart City Characteristics (SCC). The use of SCC has allowed us to create a mapping between a huge number of ordinances and tweets, and the use of Commonsense Knowledge (CSK) has allowed us to utilize human judgment in mapping. …


An Instruction Embedding Model For Binary Code Analysis, Kimberly Michelle Redmond Apr 2019

Theses and Dissertations

Binary code analysis is important for understanding programs without access to the original source code, which is common with proprietary software. Analyzing binaries can be challenging given their high variability: due to growth in tech manufacturers, source code is now frequently compiled for multiple instruction set architectures (ISAs); however, there is no formal dictionary that translates between their assembly languages. The difficulty of analysis is further compounded by different compiler optimizations and obfuscated malware signatures. Such minutiae mean that some vulnerabilities may only be detectable on a fine-grained level. Recent strides in machine learning, particularly in Natural Language …


Culture Clubs: Processing Speech By Deriving And Exploiting Linguistic Subcultures, David Guy Brizan Feb 2019

Dissertations, Theses, and Capstone Projects

Spoken language understanding systems are error-prone for several reasons, including individual speech variability. This is manifested in many ways, among which are differences in pronunciation, lexical inventory, grammar and disfluencies. There is, however, a lot of evidence pointing to stable language usage within subgroups of a language population. We call these subgroups linguistic subcultures.

The two broad problems are defined and a survey of the work in this space is performed. The two broad problems are: linguistic subculture detection, commonly performed via Language Identification, Accent Identification or Dialect Identification approaches; and speech and language processing tasks taken which may see …


Opioid Misuse Detection In Hospitalized Patients Using Convolutional Neural Networks, Brihat Sharma Jan 2019

Master's Theses

Opioid misuse is a major public health problem worldwide. In 2016, 11.3 million people were reported to misuse opioids in the US alone. Opioid-related inpatient and emergency department visits have increased by 64 percent and the rate of opioid-related visits has nearly doubled between 2009 and 2014. It is thus critical for healthcare systems to detect opioid misuse cases. Patients hospitalized for consequences of their opioid misuse present an opportunity for intervention, but better screening and surveillance methods are needed to guide providers. The current screening methods with self-report questionnaire data are time-consuming and difficult to perform in …


Building An Automated Q-A System Using Online Forums As Knowledge Bases, Kyle Moore Jan 2019

Electronic Theses and Dissertations

Question-Answer systems traditionally use structured knowledge bases that are expensive and difficult to produce. Recent systems have used unstructured natural language sources as their datasets, but most of those sources have been overly broad or difficult to extend. Online forums are a largely untapped source of information that can provide both depth and breadth when limited to a specific domain, as well as being adaptive to the introduction of new information. In this paper, I conjecture that online forums can be similarly and effectively used as an unstructured knowledge base for Question-Answer systems. I use a relatively simple summarization-based approach to analyze …


Assessing The Quality Of Software Development Tutorials Available On The Web, Manziba A. Nishi Jan 2019

Theses and Dissertations

Both expert and novice software developers frequently access software development resources available on the Web in order to look up or learn new APIs, tools and techniques. Software quality is affected negatively when developers fail to find high-quality information relevant to their problem. While there is a substantial amount of freely available resources that can be accessed online, some of the available resources contain information that suffers from error proneness, copyright infringement, security concerns, and incompatible versions. Use of such toxic information can have a strong negative effect on developers' efficacy. This dissertation focuses specifically on software tutorials, aiming to automatically …


Curtus: An NLP Tool To Map Job Skills To Academic Courses, Daniel Rockwell Jan 2019

Theses and Dissertations

Many businesses are burdened with the need to train students for the job instead of finding them prepared for it. Few business leaders feel that colleges prepare students for future jobs from day one. It can be a challenge for colleges to determine if their curricula meet the industry needs. Mapping industry needs to academic courses can be advantageous to both parties as it will allow colleges to be aligned with the industry needs and accordingly satisfy those needs and will allow the industry to hire better prepared graduates. In an attempt to address this, a system prototype that uses …


Indirect Relatedness, Evaluation, And Visualization For Literature Based Discovery, Sam Henry Jan 2019

Theses and Dissertations

The exponential growth of scientific literature is creating an increased need for systems to process and assimilate knowledge contained within text. Literature Based Discovery (LBD) is a well-established field that seeks to synthesize new knowledge from existing literature, but it has remained primarily in the theoretical realm rather than in real-world application. This lack of real-world adoption is due in part to the difficulty of LBD, but also to several solvable problems present in LBD today. Of these problems, the ones in most critical need of improvement are: (1) the over-generation of knowledge by LBD systems, (2) a …