Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Computer Sciences

Data analysis

Articles 1 - 30 of 66

Full-Text Articles in Physical Sciences and Mathematics

Sentiment Analysis Of Public Perception Towards Elon Musk On Reddit (2008-2022), Daniel Maya Bonilla, Samuel Iradukunda, Pamela Thomas Sep 2023

The Cardinal Edge

As Elon Musk’s influence in technology and business continues to expand, it becomes crucial to comprehend public sentiment surrounding him in order to gauge the impact of his actions and statements. In this study, we conducted a comprehensive analysis of comments from various subreddits discussing Elon Musk over a 14-year period, from 2008 to 2022. Utilizing advanced sentiment analysis models and natural language processing techniques, we examined patterns and shifts in public sentiment towards Musk, identifying correlations with key events in his life and career. Our findings reveal that public sentiment is shaped by a multitude of factors, including his …


Digital DNA: The Ethical Implications Of Big Data As The World’s New-Age Commodity, Clark H. Dotson May 2023

Honors Theses

In the emerging digital world that we find ourselves in, it becomes apparent that data collection has become a staple of daily life, whether we like it or not. This research discussion aims to bring light to just how much one’s own digital identity is valued in the technologically-infused world of today, with distinct research and local examples to bring awareness to the ethical implications of your online presence. The paper in question examines anecdotal and research evidence of the collection of data, both through true and unjust means, as well as ethical implications of what this information truly represents. …


Beyond News Values On Twitter: Predicting Factors That Drive User Engagement In News, Zhiyan Zhong Apr 2023

Dartmouth College Master’s Theses

When deciding on what news stories to cover, traditional journalism determines news values by following several elements of newsworthiness, such as impact, timeliness, and prominence. However, these guidelines do not always seem to correspond with the success of content on social media. As people are increasingly turning to social media for news, our research aims to understand and predict factors that drive user engagement for news on social media. In this study, we analyze news content published on Twitter, and examine a diverse set of characteristics like metrics retrieved from the Twitter API and semantics by natural language processing, including …


Distinctive Features Of Nonverbal Behavior And Mimicry In Application Interviews Through Data Analysis And Machine Learning, Sanne Rogiers, Elias Corneillie, Filip Lievens, Frederik Anseel, Peter Veelaert, Wilfried Philips Sep 2022

Research Collection Lee Kong Chian School Of Business

This paper reveals the characteristics and effects of nonverbal behavior and human mimicry in the context of application interviews. It discloses a novel analyzation method for psychological research by utilizing machine learning. In comparison to traditional manual data analysis, machine learning proves to be able to analyze the data more deeply and to discover connections in the data invisible to the human eye. The paper describes an experiment to measure and analyze the reactions of evaluators to job applicants who adopt specific behaviors: mimicry, suppress, immediacy and natural behavior. First, evaluation of the applicant qualifications by the interviewer reveals …


State Prediction Of Poverty Alleviation Objects Based On HMM And Multidimensional Data, Jun He, Sunyan Hong, Yifang Zhou, Shikai Shen, Muquan Zou May 2022

Journal of System Simulation

Abstract: In order to solve the problems of inaccurate prediction of poverty, poverty reduction and poverty return, and the difficulty in identifying the key factors affecting the state transition, 8 key features and 22 observed states are extracted from the poverty reduction basic data and multi-industry data. The relationship between observed state and implied state is constructed, and the hidden Markov model (HMM) of poverty alleviation is established. Data of a deep poverty county for three consecutive years are used as samples for parameter training, test experiment and result verification. The results show that the method has a strong …


Big Data With Cloud Computing: Discussions And Challenges, Amanpreet Kaur Sandhu Mar 2022

Big Data Mining and Analytics

With the recent advancements in computer technologies, the amount of data available is increasing day by day. However, excessive amounts of data create great challenges for users. Meanwhile, cloud computing services provide a powerful environment to store large volumes of data. They eliminate various requirements, such as dedicated space and maintenance of expensive computer hardware and software. Handling big data is a time-consuming task that requires large computational clusters to ensure successful data storage and processing. In this work, the definition, classification, and characteristics of big data are discussed, along with various cloud services, such as Microsoft Azure, Google Cloud, …


ASCP-IoMT: AI-Enabled Lightweight Secure Communication Protocol For Internet Of Medical Things, Mohammad Wazid, Jaskaran Singh, Ashok Kumar Das, Sachin Shetty, Muhammad Khurram Khan, Joel J.P.C. Rodrigues Jan 2022

VMASC Publications

The Internet of Medical Things (IoMT) is a unification of smart healthcare devices, tools, and software, which connect various patients and other users to the healthcare information system through the networking technology. It further reduces unnecessary hospital visits and the burden on healthcare systems by connecting the patients to their healthcare experts (i.e., doctors) and allows secure transmission of healthcare data over an insecure channel (e.g., the Internet). Since Artificial Intelligence (AI) has a great impact on the performance and usability of an information system, it is important to include its modules in a healthcare information system, which will be …


Post-Quantum Secure Identity-Based Encryption Scheme Using Random Integer Lattices For IoT-Enabled AI Applications, Dharminder Dharminder, Ashok Kumar Das, Sourav Saha, Basudeb Bera, Athanasios V. Vasilakos Jan 2022

VMASC Publications

Identity-based encryption is an important cryptographic system that is employed to ensure confidentiality of a message in communication. This article presents a provably secure identity based encryption based on post quantum security assumption. The security of the proposed encryption is based on the hard problem, namely Learning with Errors on integer lattices. This construction is anonymous and produces pseudo random ciphers. Both public-key size and ciphertext-size have been reduced in the proposed encryption as compared to those for other relevant schemes without compromising the security. Next, we incorporate the constructed identity based encryption (IBE) for Internet of Things (IoT) applications, …


The State Of The Art Of Information Integration In Space Applications, Zhuming Bi, K. L. Yung, Andrew W.H. Ip, Yuk Ming Tang, Chris W.J. Zhang, Li Da Xu Jan 2022

Information Technology & Decision Sciences Faculty Publications

This paper aims to present a comprehensive survey on information integration (II) in space informatics. With an ever-increasing scale and dynamics of complex space systems, II has become essential in dealing with the complexity, changes, dynamics, and uncertainties of space systems. The applications of space II (SII) require addressing some distinctive functional requirements (FRs) of heterogeneity, networking, communication, security, latency, and resilience; while limited works are available to examine recent advances of SII thoroughly. This survey helps to gain the understanding of the state of the art of SII in sense that (1) technical drivers for SII are discussed and …


A Longitudinal Analysis Of Pathways To Computing Careers: Defining Broadening Participation In Computing (BPC) Success With A Rearview Lens, Mercy Jaiyeola Dec 2021

Theses and Dissertations

Efforts to increase the participation of groups historically underrepresented in computing studies, and in the computing workforce, are well documented. It is a national effort with funding from a variety of sources being allocated to research in broadening participation in computing (BPC). Many of the BPC efforts are funded by the National Science Foundation (NSF) but as existing literature shows, the growth in representation of traditionally underrepresented minorities and women is not commensurate to the efforts and resources that have been directed toward this aim.

Instead of attempting to tackle the barriers to increasing representation, this dissertation research tackles the …


Towards Machine Learning-Based Demand Response Forecasting Using Smart Grid Data, Matthew S. Johnson Aug 2021

Theses, Dissertations and Culminating Projects

Demand response is a valuable tool for improving the reliability, stability, and financial efficiency of smart grids. With the intention of altering customer power consumption patterns, utility companies often implement strategies such as time-of-use (TOU) programs. Although effective in some situations, TOU programs struggle to perform in highly developed countries due to the complexity of human behavior. In this study, we analyze power consumption readings from smart meters from 5567 households in London, UK from November 2011 to February 2014 to measure the success of the TOU program. We additionally consider the variability of weather conditions and customer demographics when …


A Machine Learning Approach To Understanding Emerging Markets, Namita Balani Jul 2021

Graduate Theses and Dissertations

Logistic providers have learned to efficiently serve their existing customer bases with optimized routes and transportation resource allocation. The problem arises when there is potential for logistics growth in an emerging market with no previous data. The purpose of this work is to use industry data for previously known and well-documented markets to apply data analytic techniques such as machine learning to investigate the uncertainty in a new market. The thesis looks into machine learning techniques to predict miles per stop given historical data. It mainly focuses on Random Forest Regression Analysis, but concludes that additional techniques, such as Polynomial …


DisMASTD: An Efficient Distributed Multi-Aspect Streaming Tensor Decomposition, Keyu Yang, Yunjun Gao, Yifeng Shen, Baihua Zheng, Lu Chen Apr 2021

Research Collection School Of Computing and Information Systems

Tensor decomposition is a fundamental multidimensional data analysis tool for many data-driven applications, such as social computing, computer vision, and bioinformatics, to name but a few. However, the rapidly increasing streaming data nowadays introduces new challenges to traditional static tensor decomposition. It requires an efficient distributed dynamic tensor decomposition without re-computing the whole tensor from scratch. In this paper, we propose DisMASTD, an efficient distributed multi-aspect streaming tensor decomposition. First, we prove the optimal tensor partitioning problem is NP-hard. Second, we present two heuristic tensor partitioning approaches to ensure the load balancing. Third, we develop a distributed multi-aspect streaming tensor …


Analysis Of Subtelomeric REXTAL Assemblies Using QUAST, Tunazzina Islam, Desh Ranjan, Mohammad Zubair, Eleanor Young, Ming Xiao, Harold Riethman Jan 2021

Computer Science Faculty Publications

Genomic regions of high segmental duplication content and/or structural variation have led to gaps and misassemblies in the human reference sequence, and are refractory to assembly from whole-genome short-read datasets. Human subtelomere regions are highly enriched in both segmental duplication content and structural variations, and as a consequence are both impossible to assemble accurately and highly variable from individual to individual. Recently, we developed a pipeline for improved region-specific assembly called Regional Extension of Assemblies Using Linked-Reads (REXTAL). In this study, we evaluate REXTAL and genome-wide assembly (Supernova) approaches on 10X Genomics linked-reads data sets partitioned and barcoded using the …


A Deep Machine Learning Approach For Predicting Freeway Work Zone Delay Using Big Data, Abdullah Shabarek Dec 2020

Dissertations

The introduction of deep learning and big data analytics may significantly elevate the performance of traffic speed prediction. Work zones become one of the most critical factors causing congestion impact, which reduces the mobility as well as traffic safety. A comprehensive literature review on existing work zone delay prediction models (i.e., parametric, simulation and non-parametric models) is conducted in this research. The research shows the limitations of each model. Moreover, most previous modeling approaches did not consider user delay for connected freeways when predicting traffic speed under work zone conditions. This research proposes Deep Artificial Neural Network (Deep ANN) and …


Dublin Smart City Data Integration, Analysis And Visualisation, Hammad Ul Ahad Nov 2020

Doctoral

Data is an important resource for any organisation, both for understanding its in-depth workings and for identifying unseen trends within the data. When this data is efficiently processed and analysed, it helps the authorities make appropriate decisions based on the derived insights and knowledge; through these decisions, service quality can be improved and the customer experience enhanced. Massive growth in data generation has been observed over the past two decades. A significant part of this data is generated by dumb and smart sensors. If this raw data is processed in an efficient manner it could uplift …


Assessing Topical Homogeneity With Word Embedding And Distance Matrices, Jeffrey M. Stanton, Yisi Sang Oct 2020

School of Information Studies - Faculty Scholarship

Researchers from many fields have used statistical tools to make sense of large bodies of text. Many tools support quantitative analysis of documents within a corpus, but relatively few studies have examined statistical characteristics of whole corpora. Statistical summaries of whole corpora and comparisons between corpora have potential application in the analysis of topically organized applications such as social media platforms. In this study, we created matrix representations of several corpora and examined several statistical tests to make comparisons between pairs of corpora with respect to the topical homogeneity of documents within each corpus. Results of three experiments suggested that a …


Email Data Breach Analysis And Prevention Using Hook And Eye System, Shubhankar Jayant Jathar Jul 2020

Electronic Theses, Projects, and Dissertations

Due to the recent COVID-19 outbreak, there were a lot of data leaks from the health sector. This project is about the increase in data breach incidents that are taking place. It presents an analysis of the different types of breaches that are found online and are practiced to steal valuable information. It discusses the different aspects that lead to data breaches and identifies the main sectors, or epicenters, of data leaks. The analysis shows that most data breaches are carried out using emails, and to overcome this limitation a system has been designed that will …


Forecasting Of Short-Term Power Load Of SecRPSO-SVM Based On Data-Driven, Hairong Sun, Bixia Xie, Tian Yao, Zhuoqun Li Jun 2020

Journal of System Simulation

Abstract: For the parameter selection of support vector machine in modeling, a particle swarm optimization algorithm based on second-order oscillation and repulsion factor was proposed to optimize the parameter of SVM. The algorithm employed the nonlinear decreasing weight to balance the global and local search ability. Second-order oscillation factor could maintain the population diversity. The repulsion factor was introduced to make the swarm even distribution in search space, which could avoid local optimum. For the complex characteristics of nonlinearity, time-varying and multifactorial of electric power load, a support vector machine forecasting model based on data was proposed, and the influence …


Randomized And Evolutionary Approaches To Dataset Characterization, Feature Weighting, And Sampling In K-Nearest Neighbors, Suryoday Basak May 2020

Computer Science and Engineering Theses

K-Nearest Neighbors (KNN) has remained one of the most popular methods for supervised machine learning tasks. However, its performance often depends on the characteristics of the dataset and on appropriate feature scaling. In this thesis, characteristics of a dataset that make it suitable for being used within KNN are explored. As part of this, two new measures for dataset dispersion, called mean neighborhood target variance (MNTV), and mean neighborhood target entropy (MNTE) are developed to help determine the performance we expect while using KNN regressors and classifiers, respectively. It is empirically demonstrated that these measures of dispersion can be indicative …


Securing The Emerging Technologies Of Autonomous And Connected Vehicles, Shahab Tayeb, Matin Pirouz Apr 2020

Mineta Transportation Institute

The Internet of Vehicles (IoV) aims to establish a network of autonomous and connected vehicles that communicate with one another through facilitation led by road-side units (RSUs) and a central trust authority (TA). Messages must be efficiently and securely disseminated to conserve resources and preserve network security. Currently, research in this area lacks consensus about security schemes and methods of disseminating messages. Furthermore, a current deficiency of information regarding resource optimization prevents further efficient development of this network. This paper takes an interdisciplinary approach to these issues by merging both cybersecurity and data science to optimize and secure the network. …


Algorithm Selection Framework: A Holistic Approach To The Algorithm Selection Problem, Marc W. Chalé Mar 2020

Theses and Dissertations

A holistic approach to the algorithm selection problem is presented. The “algorithm selection framework” uses a combination of user input and meta-data to streamline the algorithm selection for any data analysis task. The framework removes the conjecture of the common trial and error strategy and generates a preference ranked list of recommended analysis techniques. The framework is performed on nine analysis problems. Each of the recommended analysis techniques is implemented on the corresponding data sets. Algorithm performance is assessed using the primary metric of recall and the secondary metric of run time. In six of the problems, the recall of …


Empowering Qualitative Research Methods In Education With Artificial Intelligence, Luca Longo Jan 2020

Conference papers

Artificial Intelligence is one of the fastest growing disciplines, disrupting many sectors. Originally mainly for computer scientists and engineers, it has been expanding its horizons and empowering many other disciplines contributing to the development of many novel applications in many sectors. These include medicine and health care, business and finance, psychology and neuroscience, physics and biology to mention a few. However, one of the disciplines in which artificial intelligence has not been fully explored and exploited yet is education. In this discipline, many research methods are employed by scholars, lecturers and practitioners to investigate the impact of different instructional approaches …


Social Media Based Algorithmic Clinical Decision Support Learning From Behavioral Predispositions, Radhika V. Medury Jan 2020

Doctoral Dissertations

Behavioral disorders are disabilities characterized by an individual’s mood, thinking, and social interactions. The commonality of behavioral disorders amongst the United States population has increased in the last few years, with an estimated 50% of all Americans diagnosed with a behavioral disorder at some point in their lifetime. Attention-Deficit/Hyperactivity Disorder is one such behavioral disorder that is a severe public health concern because of its high prevalence, incurable nature, significant impact on domestic life, and peer relationships. Symptomatically, in theory, ADHD is characterized by inattention, hyperactivity, and impulsivity. Access to providers who can offer diagnosis and treat the disorder varies …


Disaster Damage Categorization Applying Satellite Images And Machine Learning Algorithm, Farinaz Sabz Ali Pour, Adrian Gheorghe Jan 2020

Engineering Management & Systems Engineering Faculty Publications

Special information has a significant role in disaster management. Land cover mapping can detect short- and long-term changes and monitor the vulnerable habitats. It is an effective evaluation to be included in the disaster management system to protect the conservation areas. The critical visual and statistical information presented to the decision-makers can help in mitigation or adaption before crossing a threshold. This paper aims to contribute in the academic and the practice aspects by offering a potential solution to enhance the disaster data source effectiveness. The key research question that the authors try to answer in this paper is how …


Identifying Regional Trends In Avatar Customization, Peter Mawhorter, Sercan Sengun, Haewoon Kwak, D. Fox Harrell Dec 2019

Research Collection School Of Computing and Information Systems

Since virtual identities such as social media profiles and avatars have become a common venue for self-expression, it has become important to consider the ways in which existing systems embed the values of their designers. In order to design virtual identity systems that reflect the needs and preferences of diverse users, understanding how the virtual identity construction differs between groups is important. This paper presents a new methodology that leverages deep learning and differential clustering for comparative analysis of profile images, with a case study of almost 100 000 avatars from a large online community using a popular avatar creation …


Understanding Water Consumption And Energy Trends In New York City, Wen Yong Huang, Johann Thiel May 2019

Publications and Research

In this study, we will be using the NYC Open Data website to examine publicly available data sets on water and energy consumption in New York City. In particular, we will use various scientific programming and machine learning modules in Python to analyze and visualize trends in water and energy usage within the five boroughs.


Patterns In Color Perception, Madeline Henson, Taimur Iftikhar Apr 2019

Student Symposium

Synesthesia is a neurological condition that forces individuals to process a lot of different senses at once. These different senses can be stimulated by anything; for example, if one hears some sounds, they might also perceive those sounds as colors and vice versa. Another form of Synesthesia, termed Grapheme-Color Synesthesia, can occur when one looks at different characters in a language and they see different colors generated in their brain. The amount of colors a person sees by looking at different characters varies. Our goal for our project was to figure out how different languages stimulate different neurological senses for …


CSCI 381/780 Data Analytics, Kumar Ramansenthil, NYC Tech-In-Residence Corps Apr 2019

Open Educational Resources

No abstract provided.


CSC 21700 Probability And Statistics For Computer Science, Evan Agovino, NYC Tech-In-Residence Corps Apr 2019

Open Educational Resources

No abstract provided.