Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Chapman University

Articles 1 - 19 of 19

Full-Text Articles in Databases and Information Systems

Automated Identification Of Astronauts On Board The International Space Station: A Case Study In Space Archaeology, Rao Hamza Ali, Amir Kanan Kashefi, Alice C. Gorman, Justin St. P. Walsh, Erik J. Linstead Aug 2022

Art Faculty Articles and Research

We develop and apply a deep learning-based computer vision pipeline to automatically identify crew members in archival photographic imagery taken on board the International Space Station. Our approach is able to quickly tag thousands of images from public and private photo repositories without human supervision and with a high degree of accuracy, including photographs where crew faces are partially obscured. Using the results of our pipeline, we carry out a large-scale network analysis of the crew, using the imagery data to provide novel insights into the social interactions among crew during their missions.


A Large-Scale Sentiment Analysis Of Tweets Pertaining To The 2020 US Presidential Election, Rao Hamza Ali, Gabriela Pinto, Evelyn Lawrie, Erik J. Linstead Jun 2022

Engineering Faculty Articles and Research

We capture the public sentiment towards candidates in the 2020 US Presidential Elections, by analyzing 7.6 million tweets sent out between October 31st and November 9th, 2020. We apply a novel approach to first identify tweets and user accounts in our database that were later deleted or suspended from Twitter. This approach allows us to observe the sentiment held for each presidential candidate across various groups of users and tweets: accessible tweets and accounts, deleted tweets and accounts, and suspended or inaccessible tweets and accounts. We compare the sentiment scores calculated for these groups and provide key insights into the …


How APIs Create Growth By Inverting The Firm, Seth G. Benzell, Jonathan Hersh, Marshall Van Alstyne Mar 2022

Economics Faculty Articles and Research

Traditional asset management strategy has emphasized building barriers to entry or closely guarding unique assets to maintain a firm’s comparative advantage. A new “Inverted Firm” paradigm, however, has emerged. Under this strategy, firms share data, seeking to become platforms by opening digital services to third parties and capturing part of their external surplus. This contrasts with a “pipeline” strategy where the firm itself creates value. This paper quantitatively estimates the effect of adopting an inverted firm strategy through the lens of Application Programming Interfaces (APIs), a key enabling technology. Using both public data and that of a private API development firm, …


Applications Of Unsupervised Machine Learning In Autism Spectrum Disorder Research: A Review, Chelsea Parlett-Pelleriti, Elizabeth Stevens, Dennis R. Dixon, Erik J. Linstead Jan 2022

Engineering Faculty Articles and Research

Large amounts of autism spectrum disorder (ASD) data are created through hospitals, therapy centers, and mobile applications; however, much of this rich data does not have pre-existing classes or labels. Large amounts of data (both genetic and behavioral) that are collected as part of scientific studies or as part of treatment can provide a deeper, more nuanced insight into both diagnosis and treatment of ASD. This paper reviews 43 papers using unsupervised machine learning in ASD, including k-means clustering, hierarchical clustering, model-based clustering, and self-organizing maps. The aim of this review is to provide a survey of the current uses of …


Pre-Earthquake Ionospheric Perturbation Identification Using CSES Data Via Transfer Learning, Pan Xiong, Cheng Long, Huiyu Zhou, Roberto Battiston, Angelo De Santis, Dimitar Ouzounov, Xuemin Zhang, Xuhui Shen Nov 2021

Mathematics, Physics, and Computer Science Faculty Articles and Research

During the lithospheric buildup to an earthquake, complex physical changes occur within the earthquake hypocenter. Data pertaining to the changes in the ionosphere may be obtained by satellites, and the analysis of data anomalies can help identify earthquake precursors. In this paper, we present a deep-learning model, SeqNetQuake, that uses data from the first China Seismo-Electromagnetic Satellite (CSES) to identify ionospheric perturbations prior to earthquakes. SeqNetQuake achieves the best performance [F-measure (F1) = 0.6792 and Matthews correlation coefficient (MCC) = 0.427] when directly trained on the CSES dataset with a spatial window centered on the earthquake epicenter with the Dobrovolsky …
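The two scores quoted in this abstract, the F-measure (F1) and the Matthews correlation coefficient (MCC), can both be derived from a binary confusion matrix. The following is a minimal sketch of those standard definitions; the function name `f1_and_mcc` and the counts in the example are hypothetical and are not taken from the paper.

```python
import math


def f1_and_mcc(tp, fp, fn, tn):
    """Return (F1, MCC) for a binary confusion matrix.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives, and true negatives.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    # MCC uses all four cells; by convention it is 0 when the
    # denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return f1, mcc


if __name__ == "__main__":
    # Hypothetical counts, chosen only for illustration.
    f1, mcc = f1_and_mcc(tp=40, fp=10, fn=20, tn=30)
    print(f"F1 = {f1:.4f}, MCC = {mcc:.4f}")
```

Unlike F1, MCC also accounts for true negatives, which is one reason the two are often reported together on imbalanced data.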


Pitcher Effectiveness: A Step Forward For In Game Analytics And Pitcher Evaluation, Christopher Watkins, Vincent Berardi, Cyril Rakovski May 2021

Mathematics, Physics, and Computer Science Faculty Articles and Research

With the introduction of Statcast in 2015, baseball analytics have become more precise. Statcast allows every play to be accurately tracked, and the data it generates is easily accessible through Baseball Savant, which opens the opportunity for improved performance statistics to be developed. In this paper, we propose a new tool, Pitcher Effectiveness, that uses Statcast data to evaluate starting pitchers dynamically, based on the results of in-game outcomes after each pitch. Pitcher Effectiveness successfully predicts instances where starting pitchers give up several runs, which we believe makes it a new and important tool for the in-game and post-game evaluation …


An Introduction To Seshat: Global History Databank, Peter Turchin, Harvey Whitehouse, Pieter François, Daniel Hoyer, Abel Alves, John Baines, David Baker, Marta Bartkowiak, Jennifer Bates, James Bennett, Julye Bidmead, Peter Bol, Alessandro Ceccarelli, Kostis Christakis, David Christian, Alan Covey, Franco De Angelis, Timothy K. Earle, Neil R. Edwards, Gary Feinman, Stephanie Grohmann, Philip B. Holden, Árni Júlíusson, Andrey Korotayev, Axel Kristinsson, Jennifer Larson, Oren Litwin, Victor Mair, Joseph G. Manning, Patrick Manning, Arkadiusz Marciniak, Gregory McMahon, John Miksic, Juan Carlos Moreno Garcia, Ian Morris, Ruth Mostern, Daniel Mullins, Oluwole Oyebamiji, Peter Peregrine, Cameron Petrie, Johannes Preiser-Kapeller, Peter Rudiak-Gould, Paula Sabloff, Patrick Savage, Charles Spencer, Miriam Stark, Barend Ter Haar, Stefan Thurner, Vesna Wallace, Nina Witoszek, Liye Xie Nov 2020

Religious Studies Faculty Articles and Research

This article introduces the Seshat: Global History Databank, its potential, and its methodology. Seshat is a databank containing vast amounts of quantitative data buttressed by qualitative nuance for a large sample of historical and archaeological polities. The sample is global in scope and covers the period from the Neolithic Revolution to the Industrial Revolution. Seshat allows scholars to capture dynamic processes and to test theories about the co-evolution (or not) of social scale and complexity, agriculture, warfare, religion, and any number of such Big Questions. Seshat is rapidly becoming a massive resource for innovative cross-cultural and cross-disciplinary research. Seshat is …


Patterns Of Population Displacement During Mega-Fires In California Detected Using Facebook Disaster Maps, Shenyue Jia, Seung Hee Kim, Son V. Nghiem, Paul Doherty, Menas Kafatos Jul 2020

Mathematics, Physics, and Computer Science Faculty Articles and Research

The Facebook Disaster Maps (FBDM) work presented here is the first time this platform has been used to provide analysis-ready population change products derived from crowdsourced data targeting disaster relief practices. We evaluate the representativeness of FBDM data using the Mann-Kendall test and emerging hot and cold spots in an anomaly analysis to reveal the trend, magnitude, and agglomeration of population displacement during the Mendocino Complex and Woolsey fires in California, USA. Our results show that the distribution of FBDM pre-crisis users fits well with the total population from different sources. Due to usage habits, the elder population is underrepresented …
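The Mann-Kendall test named in this abstract is a rank-based test for a monotonic trend in a series. The sketch below implements the classic form with a normal approximation and, as a simplifying assumption, no correction for tied values. The function name `mann_kendall` and the sample series are hypothetical; this is not the authors' implementation.

```python
import math


def mann_kendall(series):
    """Classic Mann-Kendall trend test (no tie correction).

    Returns (S, z, p): the S statistic, the continuity-corrected
    z score, and the two-sided p-value from the normal approximation.
    """
    n = len(series)
    s = 0
    # S sums the signs of all pairwise later-minus-earlier differences.
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the no-trend hypothesis, assuming no ties.
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p


if __name__ == "__main__":
    # Hypothetical daily counts, for illustration only.
    counts = [120, 118, 121, 125, 130, 129, 134, 140, 138, 145]
    s, z, p = mann_kendall(counts)
    print(f"S = {s}, z = {z:.3f}, p = {p:.4f}")
```

A positive S indicates an upward trend and a negative S a downward one; the p-value says how surprising that S would be if the series had no trend.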


A 12-Lead ECG Database To Identify Origins Of Idiopathic Ventricular Arrhythmia Containing 334 Patients, Jianwei Zhang, Guohua Fu, Kyle Anderson, Huimin Chu, Cyril Rakovski Mar 2020

Mathematics, Physics, and Computer Science Faculty Articles and Research

Cardiac catheter ablation has proven effective in treating idiopathic premature ventricular complex and ventricular tachycardia. As the most important prerequisite for successful therapy, criteria based on analysis of 12-lead ECGs are employed to reliably predict the locations of idiopathic ventricular arrhythmia before a subsequent catheter ablation procedure. Among these possible locations, the right ventricular outflow tract and the left outflow tract are the major ones. We created a new 12-lead ECG database under the auspices of Chapman University and Ningbo First Hospital of Zhejiang University that aims to provide high quality data enabling detection of the distinctions between idiopathic ventricular …


A NWB-Based Dataset And Processing Pipeline Of Human Single-Neuron Activity During A Declarative Memory Task, N. Chandravadia, D. Liang, A. G. P. Schjetnan, A. Carlson, M. Faraut, J. M. Chung, C. M. Reed, B. Dichter, Uri Maoz, S. K. Kalia, T. A. Valiante, A. N. Mamelak, U. Rutishauser Mar 2020

Psychology Faculty Articles and Research

A challenge for data sharing in systems neuroscience is the multitude of different data formats used. Neurodata Without Borders: Neurophysiology 2.0 (NWB:N) has emerged as a standardized data format for the storage of cellular-level data together with meta-data, stimulus information, and behavior. A key next step to facilitate NWB:N adoption is to provide easy-to-use processing pipelines to import/export data from/to NWB:N. Here, we present a NWB-formatted dataset of 1863 single neurons recorded from the medial temporal lobes of 59 human subjects undergoing intracranial monitoring while they performed a recognition memory task. We provide code to analyze and export/import …


TCP Server And Client: Bookstore Enquiry, Fawaz Bukhowa Dec 2018

Student Scholar Symposium Abstracts and Posters

"Bookstore Enquiry" is an application implemented in Java as a TCP client-server program. It consists of two programs, one called "Server" and the other called "Client". The server maintains information about books, storing for each book the fields 'BookId', 'BookName', 'BookEdition', 'AvailableStock', 'UnitPrice', and 'Discount'. The server runs indefinitely and waits for client requests. The client accepts a BookId and BookName from the console and sends them to the server. If the server finds any book that matches the details sent, then it shows "BOOK …
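The request/response pattern described in this abstract can be sketched as follows. The original program is written in Java; Python is used here only for illustration. The catalogue contents, the comma-separated wire format, the reply strings, and every function name are assumptions of this sketch, not details of the original application.

```python
import socket
import threading

# Hypothetical catalogue: field names follow the abstract, values are made up.
BOOKS = {
    ("101", "Networks"): {"BookEdition": 3, "AvailableStock": 12,
                          "UnitPrice": 40.0, "Discount": 0.10},
}


def handle(conn):
    """Answer one 'BookId,BookName' request on an accepted connection."""
    with conn, conn.makefile("r", encoding="utf-8") as reader:
        request = reader.readline().strip()
        book_id, _, book_name = request.partition(",")
        book = BOOKS.get((book_id, book_name))
        # Placeholder replies, not the original program's messages.
        reply = f"MATCH stock={book['AvailableStock']}" if book else "NO MATCH"
        conn.sendall((reply + "\n").encode("utf-8"))


def serve(server_sock):
    """Wait for client requests indefinitely, one thread per client."""
    while True:
        try:
            conn, _addr = server_sock.accept()
        except OSError:
            break  # accept failed, e.g. the listening socket was closed
        threading.Thread(target=handle, args=(conn,), daemon=True).start()


def start_server():
    """Listen on an OS-chosen loopback port; return (socket, port)."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))
    server_sock.listen()
    threading.Thread(target=serve, args=(server_sock,), daemon=True).start()
    return server_sock, server_sock.getsockname()[1]


def enquire(port, book_id, book_name):
    """Client side: send one enquiry and return the server's reply line."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall(f"{book_id},{book_name}\n".encode("utf-8"))
        with sock.makefile("r", encoding="utf-8") as reader:
            return reader.readline().strip()


if __name__ == "__main__":
    server, port = start_server()
    print(enquire(port, "101", "Networks"))
    print(enquire(port, "999", "Missing"))
    server.close()
```

Keying the catalogue on the (BookId, BookName) pair mirrors the abstract's description of the client sending both values; a real deployment would more likely look up by BookId alone.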


Forcing Optimality And Brandt's Principle, Domenico Napoletani, Marco Panza, Daniele C. Struppa Jan 2017

Mathematics, Physics, and Computer Science Faculty Books and Book Chapters

We argue that many optimization methods can be viewed as representatives of “forcing”, a methodological approach that attempts to bridge the gap between data and mathematics on the basis of an a priori trust in the power of a mathematical technique, even when detailed, credible models of a phenomenon are lacking or do not justify the use of this technique. In particular, we show that forcing is implied in particle swarm optimization methods, and in modeling image processing problems through optimization. From these considerations, we extrapolate a principle for general data analysis methods, what we call ‘Brandt’s principle’, namely the …


An Application Of The Autism Management Platform To Tracking Student Progress In The Special Education Environment, Ryan Thomas Burns Jan 2015

Computational and Data Sciences Theses

In the age of online courses and digital textbooks, several areas of academia, such as special education, are far behind in the technological revolution. Some teachers use long unstructured digital documents, while others maintain large physical files for students containing every piece of information or coursework they have ever received. Could these extremely unstructured approaches to data collection and aggregation be streamlined with a software platform built specifically for this purpose? Could this platform also be built to accommodate multiple integrations and practical new features? Most importantly, in terms of usability, would this software be enjoyable to use? The Autism …


Design Of Randomized Experiments In Networks, Dylan Walker, Lev Muchnik Nov 2014

Business Faculty Articles and Research

Over the last decade, the emergence of pervasive online and digitally enabled environments has created a rich source of detailed data on human behavior. Yet, the promise of big data has recently come under fire for its inability to separate correlation from causation, to derive actionable insights and yield effective policies. Fortunately, the same online platforms on which we interact on a day-to-day basis permit experimentation at large scales, ushering in a new movement toward big experiments. Randomized controlled trials are the heart of the scientific method and, when designed correctly, provide clean causal inferences that are robust and reproducible. However, …


Improving The Efficacy Of Web-Based Educational Outreach In Ecology, Gregory R. Goldsmith, Andrew D. Fulton, Colin D. Witherill, Javier F. Espeleta Oct 2014

Biology, Chemistry, and Environmental Sciences Faculty Articles and Research

Scientists are increasingly engaging the web to provide formal and informal science education opportunities. Despite the prolific growth of web-based resources, systematic evaluation and assessment of their efficacy remains limited. We used clickstream analytics, a widely available method for tracking website visitors and their behavior, to evaluate 60,000 visits over three years to an educational website focused on ecology. Visits originating from search engine queries were a small proportion of the traffic, suggesting the need to actively promote websites to drive visitation. However, the number of visits referred to the website per social media post varied depending on the social …


Computational Methods For Historical Research On Wikipedia’s Archives, Jonathan Cohen Sep 2014

e-Research: A Journal of Undergraduate Work

This paper presents a novel study of geographic information implicit in the English Wikipedia archive. This project demonstrates a method to extract data from the archive with data mining, map the global distribution of Wikipedia editors through geocoding in GIS, and proceed with a spatial analysis of Wikipedia use in metropolitan cities.


Building A Computer Program To Support Children, Parents, And Distraction During Healthcare Procedures, Kirsten Hanrahan, Ann Marie McCarthy, Charmaine Kleiber, Kaan Ataman, W. Nick Street, M. Bridget Zimmerman, Annel L. Ersig Oct 2012

Business Faculty Articles and Research

This secondary data analysis used data mining methods to develop predictive models of child risk for distress during a healthcare procedure. Data used came from a study that predicted factors associated with children's responses to an intravenous catheter insertion while parents provided distraction coaching. From the 255 items used in the primary study, 44 predictive items were identified through automatic feature selection and used to build support vector machine regression models. Models were validated using multiple cross-validation tests and by comparing variables identified as explanatory in the traditional versus support vector machine regression. Rule-based approaches were applied to the model …


Identifying Social Influence In Networks Using Randomized Experiments, Sinan Aral, Dylan Walker Oct 2011

Business Faculty Articles and Research

The recent availability of massive amounts of networked data generated by email, instant messaging, mobile phone communications, micro blogs, and online social networks is enabling studies of population-level human interaction on scales orders of magnitude greater than what was previously possible [1,2]. One important goal of applying statistical inference techniques to large networked datasets is to understand how behavioral contagions spread in human social networks. More precisely, understanding how people influence or are influenced by their peers can help us understand the ebb and flow of market trends, product adoption and diffusion, the spread of health behaviors such as smoking and …


Structure Of The Information Base And Operations On The Entities In A System For Information Servicing Of Collectivities, Peter H. Barnev, Atanas Radenski Jan 1978

Mathematics, Physics, and Computer Science Faculty Articles and Research

This paper treats, from a user's point of view, the structure of the entities in the information base (IB) of a System for information servicing of collectivities (SISC) as well as the operations on these entities.