Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 19 of 19

Full-Text Articles in Physical Sciences and Mathematics

Applying Data Science And Machine Learning To Understand Health Care Transition For Adolescents And Emerging Adults With Special Health Care Needs, Lisamarie Turk Dec 2022

Nursing ETDs

A problem of classification places adolescents and emerging adults with special health care needs among the most at risk for poor or life-threatening health outcomes. This preliminary proof-of-concept study was conducted to determine if phenotypes of health care transition (HCT) for this vulnerable population could be established. Such phenotypes could support development of future studies that require data classifications as input. Mining of electronic health record data and cluster analysis were implemented to identify phenotypes. Subsequently, a machine learning concept model was developed for predicting acute care and medical condition severity. Three clusters were identified and described (Cluster 1, n …


Creating Data From Unstructured Text With Context Rule Assisted Machine Learning (Craml), Stephen Meisenbacher, Peter Norlander Dec 2022

School of Business: Faculty Publications and Other Works

Popular approaches to building data from unstructured text come with limitations, such as scalability, interpretability, replicability, and real-world applicability. These can be overcome with Context Rule Assisted Machine Learning (CRAML), a method and no-code suite of software tools that builds structured, labeled datasets which are accurate and reproducible. CRAML enables domain experts to access uncommon constructs within a document corpus in a low-resource, transparent, and flexible manner. CRAML produces document-level datasets for quantitative research and makes qualitative classification schemes scalable over large volumes of text. We demonstrate that the method is useful for bibliographic analysis, transparent analysis of proprietary data, …


Investigating Toxicity Changes Of Cross-Community Redditors From 2 Billion Posts And Comments, Hind Almerekhi, Haewoon Kwak, Bernard J. Jansen Aug 2022

Research Collection School Of Computing and Information Systems

This research investigates changes in online behavior of users who publish in multiple communities on Reddit by measuring their toxicity at two levels. With the aid of crowdsourcing, we built a labeled dataset of 10,083 Reddit comments, then used the dataset to train and fine-tune a Bidirectional Encoder Representations from Transformers (BERT) neural network model. The model predicted the toxicity levels of 87,376,912 posts from 577,835 users and 2,205,581,786 comments from 890,913 users on Reddit over 16 years, from 2005 to 2020. This study utilized the toxicity levels of user content to identify toxicity changes by the user within the …


Emotion Detection Using An Ensemble Model Trained With Physiological Signals And Inferred Arousal-Valence States, Matthew Nathanael Gray Aug 2022

Electrical & Computer Engineering Theses & Dissertations

Affective computing is an exciting and transformative field that is gaining in popularity among psychologists, statisticians, and computer scientists. The ability of a machine to infer human emotion and mood, i.e. affective states, has the potential to greatly improve human-machine interaction in our increasingly digital world. In this work, an ensemble model methodology for detecting human emotions across multiple subjects is outlined. The Continuously Annotated Signals of Emotion (CASE) dataset, which is a dataset of physiological signals labeled with discrete emotions from video stimuli as well as subject-reported continuous emotions, arousal and valence, from the circumplex model, is used for …


Imagining New Futures Beyond Predictive Systems In Child Welfare: A Qualitative Study With Impacted Stakeholders, Logan Stapleton, Min Hun Lee, Diana Qing, Marya Wright, Alexandra Chouldechova, Ken Holstein, Zhiwei Steven Wu, Haiyi Zhu Jun 2022

Research Collection School Of Computing and Information Systems

Child welfare agencies across the United States are turning to data-driven predictive technologies (commonly called predictive analytics) which use government administrative data to assist workers’ decision-making. While some prior work has explored impacted stakeholders’ concerns with current uses of data-driven predictive risk models (PRMs), less work has asked stakeholders whether such tools ought to be used in the first place. In this work, we conducted a set of seven design workshops with 35 stakeholders who have been impacted by the child welfare system or who work in it to understand their beliefs and concerns around PRMs, and to engage them …


Exploring The Effectiveness Of Multiple-Exemplar Training For Visual Analysis Of Ab-Design Graphs, Verena S. Bethke Jun 2022

Dissertations, Theses, and Capstone Projects

In behavior analysis, data are usually analyzed using visual analysis of the graphed data. There is a wide range of methods used to visually analyze data, from a basic ‘textbook’ style approach to the use of visual aids, decision-rubrics, and computer-based approaches. In the literature, there have been some comparisons of the efficacy of different approaches. Visual analysis as a behavior can be taught using a variety of methods, independent of how the skill itself is to be performed. Teaching methods include lecture, online instruction, and equivalence-based instruction. There is not much research on the teaching of visual analysis specifically, …


Data-Driven Framework For Understanding & Modeling Ride-Sourcing Transportation Systems, Bishoy Kelleny May 2022

Civil & Environmental Engineering Theses & Dissertations

Ride-sourcing transportation services offered by transportation network companies (TNCs) like Uber and Lyft are disrupting the transportation landscape. The growing demand for these services, along with their potential short- and long-term impacts on the environment, society, and infrastructure, emphasizes the need to further understand the ride-sourcing system. There were insufficient data to fully understand the system and integrate it within regional multimodal transportation frameworks. This can be attributed to commercial and competition reasons, given the technology-enabled and innovative nature of the system. In 2019, the City of Chicago released extensive and complete ride-sourcing trip-level data for …


Toward Suicidal Ideation Detection With Lexical Network Features And Machine Learning, Ulya Bayram, William Lee, Daniel Santel, Ali Minai, Peggy Clark, Tracy Glauser, John Pestian Apr 2022

Northeast Journal of Complex Systems (NEJCS)

In this study, we introduce a new network feature for detecting suicidal ideation from clinical texts and conduct various additional experiments to enrich the state of knowledge. We evaluate statistical features with and without stopwords, use lexical networks for feature extraction and classification, and compare the results with standard machine learning methods using a logistic classifier, a neural network, and a deep learning method. We utilize three text collections. The first two contain transcriptions of interviews conducted by experts with suicidal (n=161 patients that experienced severe ideation) and control subjects (n=153). The third collection consists of interviews conducted by experts …


A Remote Sensing And Machine Learning-Based Approach To Forecast The Onset Of Harmful Algal Bloom (Red Tides), Moein Izadi Apr 2022

Dissertations

In the last few decades, harmful algal blooms (HABs, also known as “red tides”) have become one of the most detrimental natural phenomena around the world, especially in Florida’s coastal areas, due to local environmental factors and global warming on a larger scale. Karenia brevis produces toxins that have harmful effects on humans, fisheries, and ecosystems. In this study, I developed and compared the efficiency of state-of-the-art machine learning models (e.g., XGBoost, Random Forest, and Support Vector Machine) in predicting the occurrence of HABs. In the proposed models, the K. brevis abundance is used as the target, and 10 …


Moving Toward Personalized Law, Cary Coglianese Mar 2022

All Faculty Scholarship

Rules operate as a tool of governance by making generalizations, thereby cutting down on government officials’ need to make individual determinations. But because they are generalizations, rules can result in inefficient or perverse outcomes due to their over- and under-inclusiveness. With the aid of advances in machine-learning algorithms, however, it is becoming increasingly possible to imagine governments shifting away from a predominant reliance on general rules and instead moving toward increased reliance on precise individual determinations—or on “personalized law,” to use the term Omri Ben-Shahar and Ariel Porat use in the title of their 2021 book. Among the various technological, …


Landslide Detection In The Himalayas Using Machine Learning Algorithms And U-Net, Sansar Raj Meena, Lucas Pedrosa Soares, Carlos H. Grohmann, Cees Van Westen, Kushanav Bhuyan, Ramesh P. Singh, Mario Floris, Filippo Catani Feb 2022

Biology, Chemistry, and Environmental Sciences Faculty Articles and Research

Event-based landslide inventories are essential sources to broaden our understanding of the causal relationship between triggering events and the resulting landslides. Moreover, detailed inventories are crucial for the succeeding phases of landslide risk studies like susceptibility and hazard assessment. The openly available inventories differ in quality and completeness. Event-based landslide inventories are created based on manual interpretation, and there can be significant differences in the mapping preferences among interpreters. To address this issue, we used two different datasets to analyze the potential of U-Net and machine learning approaches for automated landslide detection in the Himalayas. Dataset-1 is composed …


Algorithm Vs. Algorithm, Cary Coglianese, Alicia Lai Jan 2022

All Faculty Scholarship

Critics raise alarm bells about governmental use of digital algorithms, charging that they are too complex, inscrutable, and prone to bias. A realistic assessment of digital algorithms, though, must acknowledge that government is already driven by algorithms of arguably greater complexity and potential for abuse: the algorithms implicit in human decision-making. The human brain operates algorithmically through complex neural networks. And when humans make collective decisions, they operate via algorithms too—those reflected in legislative, judicial, and administrative processes. Yet these human algorithms undeniably fail and are far from transparent. On an individual level, human decision-making suffers from memory limitations, fatigue, …


A Synthetic Prediction Market For Estimating Confidence In Published Work, Sarah Rajtmajer, Christopher Griffin, Jian Wu, Robert Fraleigh, Laxmann Balaji, Anna Squicciarini, Anthony Kwasnica, David Pennock, Michael Mclaughlin, Timothy Fritton, Nishanth Nakshatri, Arjun Menon, Sai Ajay Modukuri, Rajal Nivargi, Xin Wei, Lee Giles Jan 2022

Computer Science Faculty Publications

[First paragraph] Concerns about the replicability, robustness and reproducibility of findings in scientific literature have gained widespread attention over the last decade in the social sciences and beyond. This attention has been catalyzed by and has likewise motivated a number of large-scale replication projects which have reported successful replication rates between 36% and 78%. Given the challenges and resources required to run high-powered replication studies, researchers have sought other approaches to assess confidence in published claims. Initial evidence has supported the promise of prediction markets in this context. However, they require the coordinated, sustained effort of collections of human experts …


Predicting League Of Legends Ranked Games Outcome, Ngoc Linh Chi Nguyen Jan 2022

Senior Projects Spring 2022

League of Legends (LoL) is one of the most popular multiplayer online battle arena (MOBA) games in the world. For LoL, the most competitive way to evaluate a player’s skill level, below the professional Esports level, is competitive ranked games. These ranked games utilize a matchmaking system based on the players’ ranks to form fair teams for each game. However, a ranked game's outcome cannot necessarily be predicted using just players’ ranks; a significant number of other variables affect a ranked game, depending on how well each team plays. In this paper, I propose a method to …


From Negative To Positive Algorithm Rights, Cary Coglianese, Kat Hefter Jan 2022

All Faculty Scholarship

Artificial intelligence, or “AI,” is raising alarm bells. Advocates and scholars propose policies to constrain or even prohibit certain AI uses by governmental entities. These efforts to establish a negative right to be free from AI stem from an understandable motivation to protect the public from arbitrary, biased, or unjust applications of algorithms. This movement to enshrine protective rights follows a familiar pattern of suspicion that has accompanied the introduction of other technologies into governmental processes. Sometimes this initial suspicion of a new technology later transforms into widespread acceptance and even a demand for its use. In this paper, we …


Exploring Cyberterrorism, Topic Models And Social Networks Of Jihadists Dark Web Forums: A Computational Social Science Approach, Vivian Fiona Guetler Jan 2022

Graduate Theses, Dissertations, and Problem Reports

This three-article dissertation focuses on cyber-related topics on terrorist groups, specifically Jihadists’ use of technology, the application of natural language processing, and social networks in analyzing text data derived from terrorists' Dark Web forums. The first article explores cybercrime and cyberterrorism. As technology progresses, it facilitates new forms of behavior, including tech-related crimes known as cybercrime and cyberterrorism. In this article, I provide an analysis of the problems of cybercrime and cyberterrorism within the field of criminology by reviewing existing literature focusing on (a) the issues in defining terrorism, cybercrime, and cyberterrorism, (b) ways that cybercriminals commit a crime in …


Machine Learning Land Cover And Land Use Classification Of 4-Band Satellite Imagery, Lorelei Turner [*], Torrey J. Wagner, Paul Auclair, Brent T. Langhals Jan 2022

Faculty Publications

Land-cover and land-use classification generates categories of terrestrial features, such as water or trees, which can be used to track how land is used. This work applies classical, ensemble and neural network machine learning algorithms to a multispectral remote sensing dataset containing 405,000 28x28 pixel image patches in 4 electromagnetic frequency bands. For each algorithm, model metrics and prediction execution time were evaluated, resulting in two families of models: fast and precise. The prediction time for an 81,000-patch group of predictions was … for the fast models, and >5s for the precise models, and there was not a significant change in prediction time when a …


Taming The Data In The Internet Of Vehicles, Shahab Tayeb Jan 2022

Mineta Transportation Institute

As an emerging field, the Internet of Vehicles (IoV) has a myriad of security vulnerabilities that must be addressed to protect system integrity. To stay ahead of novel attacks, cybersecurity professionals are developing new software and systems using machine learning techniques. Neural network architectures improve such systems, including Intrusion Detection Systems (IDSs), by implementing anomaly detection, which differentiates benign data packets from malicious ones. For an IDS to best predict anomalies, the model is trained on data that is typically pre-processed through normalization and feature selection/reduction. These pre-processing techniques play an important role in training a neural network to optimize …


Antitrust By Algorithm, Cary Coglianese, Alicia Lai Jan 2022

All Faculty Scholarship

Technological innovation is changing private markets around the world. New advances in digital technology have created new opportunities for subtle and evasive forms of anticompetitive behavior by private firms. But some of these same technological advances could also help antitrust regulators improve their performance in detecting and responding to unlawful private conduct. We foresee that the growing digital complexity of the marketplace will necessitate that antitrust authorities increasingly rely on machine-learning algorithms to oversee market behavior. In making this transition, authorities will need to meet several key institutional challenges—building organizational capacity, avoiding legal pitfalls, and establishing public trust—to ensure successful …