Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 11 of 11

Full-Text Articles in Physical Sciences and Mathematics

Language Models For Rare Disease Information Extraction: Empirical Insights And Model Comparisons, Shashank Gupta Jan 2024

Theses and Dissertations--Computer Science

End-to-end relation extraction (E2ERE) is a crucial task in natural language processing (NLP) that involves identifying and classifying semantic relationships between entities in text. This thesis compares three paradigms for E2ERE in biomedicine, focusing on rare diseases with discontinuous and nested entities. We evaluate Named Entity Recognition (NER) to Relation Extraction (RE) pipelines, sequence-to-sequence models, and generative pre-trained transformer (GPT) models using the RareDis information extraction dataset. Our findings indicate that pipeline models are the most effective, followed closely by sequence-to-sequence models. GPT models, despite having eight times as many parameters, perform worse than sequence-to-sequence models and …


Improving Connectivity For Remote Cancer Patient Symptom Monitoring And Reporting In Rural Medically Underserved Regions, Esther Max-Onakpoya Jan 2023

Theses and Dissertations--Computer Science

Rural residents are often faced with many disparities when compared to their urban counterparts. Two key areas where these disparities are apparent are access to health and Internet services. Improved access to healthcare services has the potential to increase residents' quality of life and life expectancy. Additionally, improved access to Internet services can create significant social returns in increasing job and educational opportunities, and improving access to healthcare. Therefore, this dissertation focuses on the intersection between access to Internet and healthcare services in rural areas. More specifically, it attempts to analyze systems that can be used to improve Internet access …


Expanding Social Network Modeling Software And Agent Models For Diffusion Processes, Patrick Vaden Shepherd Jan 2021

Theses and Dissertations--Computer Science

In an increasingly digitally interconnected world, the study of social networks and their dynamics is burgeoning. Anthropologically, the ubiquity of online social networks has had striking implications for the condition of large portions of humanity. This technology has facilitated content creation of virtually all sorts, information sharing on an unprecedented scale, and connections and communities among people with similar interests and skills. The first part of my research is a social network evolution and visualization engine. Built on top of existing technologies, my software is designed to provide abstractions from the underlying libraries, drive real-time network evolution based on user-defined …


Confprofitt: A Configuration-Aware Performance Profiling, Testing, And Tuning Framework, Xue Han Jan 2019

Theses and Dissertations--Computer Science

Modern computer software systems are complicated. Developers can change the behavior of the software system through software configurations. The large number of configuration options and their interactions make the task of software tuning, testing, and debugging very challenging. Performance is one of the key aspects of non-functional qualities, where performance bugs can cause significant performance degradation and lead to poor user experience. However, performance bugs are difficult to expose, primarily because detecting them requires specific inputs, as well as specific configurations. While researchers have developed techniques to analyze, quantify, detect, and fix performance bugs, many of these techniques are not …


Learning To Map The Visual And Auditory World, Tawfiq Salem Jan 2019

Theses and Dissertations--Computer Science

The appearance of the world varies dramatically not only from place to place but also from hour to hour and month to month. Billions of images that capture this complex relationship are uploaded to social-media websites every day and often are associated with precise time and location metadata. This rich source of data can be beneficial to improve our understanding of the globe. In this work, we propose a general framework that uses these publicly available images for constructing dense maps of different ground-level attributes from overhead imagery. In particular, we use well-defined probabilistic models and a weakly-supervised, multi-task training …


Context-Aware Debugging For Concurrent Programs, Justin Chu Jan 2017

Theses and Dissertations--Computer Science

Concurrency faults are difficult to reproduce and localize because they usually occur under specific inputs and thread interleavings. Most existing fault localization techniques focus on sequential programs but fail to identify faulty memory access patterns across threads, which are usually the root causes of concurrency faults. Moreover, existing techniques for sequential programs cannot be adapted to identify faulty paths in concurrent programs. While concurrency fault localization techniques have been proposed to analyze passing and failing executions obtained from running a set of test cases to identify faulty access patterns, they primarily focus on using statistical analysis. We present a novel …


Data Persistence In Eiffel, Jimmy J. Johnson Jan 2016

Theses and Dissertations--Computer Science

This dissertation describes an extension to the Eiffel programming language that provides automatic object persistence (the ability of programs to store objects and later recreate those objects in a subsequent execution of a program). The mechanism is orthogonal to other aspects of the Eiffel language. The mechanism serves four main purposes: 1) it gives Eiffel programmers a needed service, filling a gap between serialization, which provides limited persistence functions, and database mapping, which is cumbersome to use; 2) it greatly reduces the coding burden incurred by the programmer when objects must persist, allowing the programmer to focus instead on the business …


An Optical Character Recognition Engine For Graphical Processing Units, Jeremy Reed Jan 2016

Theses and Dissertations--Computer Science

This dissertation investigates how to build an optical character recognition (OCR) engine for a graphical processing unit (GPU). I introduce basic concepts both for building an OCR engine and for programming on the GPU. I then describe the SegRec algorithm in detail and discuss my findings.


Consistency Checking Of Natural Language Temporal Requirements Using Answer-Set Programming, Wenbin Li Jan 2015

Theses and Dissertations--Computer Science

Successful software engineering practice requires high-quality requirements. Inconsistency is one of the main requirement issues that may prevent software projects from being successful. This is particularly onerous when the requirements concern temporal constraints. Manually checking whether temporal requirements are consistent is tedious and error-prone when the number of requirements is large. This dissertation addresses the problem of identifying inconsistencies in temporal requirements expressed as natural language text. The goal of this research is to create an efficient, partially automated approach for checking temporal consistency of natural language requirements and to minimize analysts' workload.

The key contributions of this …


A Fault-Based Model Of Fault Localization Techniques, Mark A. Hays Jan 2014

Theses and Dissertations--Computer Science

Every day, ordinary people depend on software working properly. We take it for granted; from banking software, to railroad switching software, to flight control software, to software that controls medical devices such as pacemakers or even gas pumps, our lives are touched by software that we expect to work. It is well known that the main technique/activity used to ensure the quality of software is testing. Often it is the only quality assurance activity undertaken, making it that much more important.

In a typical experiment studying these techniques, a researcher will intentionally seed a fault (intentionally breaking the functionality of …


Application Of Swarm And Reinforcement Learning Techniques To Requirements Tracing, Hakim Sultanov Jan 2013

Theses and Dissertations--Computer Science

Today, software has become deeply woven into the fabric of our lives. The quality of the software we depend on needs to be ensured at every phase of the Software Development Life Cycle (SDLC). An analyst uses the requirements engineering process to gather and analyze system requirements in the early stages of the SDLC. An undetected problem at the beginning of the project can carry all the way through to the deployed product.

The Requirements Traceability Matrix (RTM) serves as a tool to demonstrate how requirements are addressed by the design and implementation elements throughout the entire software development lifecycle. …