Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 14 of 14

Full-Text Articles in Physical Sciences and Mathematics

Sentiment Analysis Of Public Perception Towards Elon Musk On Reddit (2008-2022), Daniel Maya Bonilla, Samuel Iradukunda, Pamela Thomas Sep 2023

The Cardinal Edge

As Elon Musk’s influence in technology and business continues to expand, it becomes crucial to comprehend public sentiment surrounding him in order to gauge the impact of his actions and statements. In this study, we conducted a comprehensive analysis of comments from various subreddits discussing Elon Musk over a 14-year period, from 2008 to 2022. Utilizing advanced sentiment analysis models and natural language processing techniques, we examined patterns and shifts in public sentiment towards Musk, identifying correlations with key events in his life and career. Our findings reveal that public sentiment is shaped by a multitude of factors, including his …


Digital DNA: The Ethical Implications Of Big Data As The World’s New-Age Commodity, Clark H. Dotson May 2023

Honors Theses

In the emerging digital world that we find ourselves in, it becomes apparent that data collection has become a staple of daily life, whether we like it or not. This research discussion aims to bring light to just how much one’s own digital identity is valued in the technologically-infused world of today, with distinct research and local examples to bring awareness to the ethical implications of your online presence. The paper in question examines anecdotal and research evidence of the collection of data, both through true and unjust means, as well as ethical implications of what this information truly represents. …


Distinctive Features Of Nonverbal Behavior And Mimicry In Application Interviews Through Data Analysis And Machine Learning, Sanne Rogiers, Elias Corneillie, Filip Lievens, Frederik Anseel, Peter Veelaert, Wilfried Philips Sep 2022

Research Collection Lee Kong Chian School Of Business

This paper reveals the characteristics and effects of nonverbal behavior and human mimicry in the context of application interviews. It discloses a novel analysis method for psychological research that utilizes machine learning. In comparison to traditional manual data analysis, machine learning proves able to analyze the data more deeply and to discover connections in the data that are invisible to the human eye. The paper describes an experiment to measure and analyze the reactions of evaluators to job applicants who adopt specific behaviors: mimicry, suppress, immediacy and natural behavior. First, evaluation of the applicant qualifications by the interviewer reveals …


The State Of The Art Of Information Integration In Space Applications, Zhuming Bi, K. L. Yung, Andrew W.H. Ip, Yuk Ming Tang, Chris W.J. Zhang, Li Da Xu Jan 2022

Information Technology & Decision Sciences Faculty Publications

This paper aims to present a comprehensive survey on information integration (II) in space informatics. With an ever-increasing scale and dynamics of complex space systems, II has become essential in dealing with the complexity, changes, dynamics, and uncertainties of space systems. The applications of space II (SII) require addressing some distinctive functional requirements (FRs) of heterogeneity, networking, communication, security, latency, and resilience, while limited works are available to examine recent advances of SII thoroughly. This survey helps to gain an understanding of the state of the art of SII in the sense that (1) technical drivers for SII are discussed and …


A Machine Learning Approach To Understanding Emerging Markets, Namita Balani Jul 2021

Graduate Theses and Dissertations

Logistic providers have learned to efficiently serve their existing customer bases with optimized routes and transportation resource allocation. The problem arises when there is potential for logistics growth in an emerging market with no previous data. The purpose of this work is to use industry data for previously known and well-documented markets to apply data analytic techniques such as machine learning to investigate the uncertainty in a new market. The thesis looks into machine learning techniques to predict miles per stop given historical data. It mainly focuses on Random Forest Regression Analysis, but concludes that additional techniques, such as Polynomial …


DisMASTD: An Efficient Distributed Multi-Aspect Streaming Tensor Decomposition, Keyu Yang, Yunjun Gao, Yifeng Shen, Baihua Zheng, Lu Chen Apr 2021

Research Collection School Of Computing and Information Systems

Tensor decomposition is a fundamental multidimensional data analysis tool for many data-driven applications, such as social computing, computer vision, and bioinformatics, to name but a few. However, the rapidly increasing streaming data nowadays introduces new challenges to traditional static tensor decomposition. It requires an efficient distributed dynamic tensor decomposition without re-computing the whole tensor from scratch. In this paper, we propose DisMASTD, an efficient distributed multi-aspect streaming tensor decomposition. First, we prove the optimal tensor partitioning problem is NP-hard. Second, we present two heuristic tensor partitioning approaches to ensure the load balancing. Third, we develop a distributed multi-aspect streaming tensor …


Assessing Topical Homogeneity With Word Embedding And Distance Matrices, Jeffrey M. Stanton, Yisi Sang Oct 2020

School of Information Studies - Faculty Scholarship

Researchers from many fields have used statistical tools to make sense of large bodies of text. Many tools support quantitative analysis of documents within a corpus, but relatively few studies have examined statistical characteristics of whole corpora. Statistical summaries of whole corpora and comparisons between corpora have potential application in the analysis of topically organized applications such as social media platforms. In this study, we created matrix representations of several corpora and examined several statistical tests to make comparisons between pairs of corpora with respect to the topical homogeneity of documents within each corpus. Results of three experiments suggested that a …
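
As a rough illustration of the matrix idea mentioned in the abstract above, the sketch below builds a pairwise cosine-distance matrix over document vectors and summarizes it with the mean pairwise distance. This is only an assumption about what a simple homogeneity summary could look like: the function names, the choice of cosine distance, and the mean as the summary statistic are hypothetical and are not taken from the paper, whose actual representations and statistical tests are not shown in this listing.

```python
import math
from typing import List, Sequence


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine distance (1 - cosine similarity) between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen for this sketch: a zero vector is maximally
        # distant from everything.
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def distance_matrix(vectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Full pairwise distance matrix for a list of document vectors."""
    return [[cosine_distance(u, v) for v in vectors] for u in vectors]


def mean_pairwise_distance(vectors: Sequence[Sequence[float]]) -> float:
    """Average distance over all unordered document pairs.

    Hypothetical homogeneity summary: lower values mean the documents
    point in more similar directions.
    """
    n = len(vectors)
    if n < 2:
        return 0.0
    matrix = distance_matrix(vectors)
    total = sum(matrix[i][j] for i in range(n) for j in range(i + 1, n))
    return total / (n * (n - 1) / 2)
```

Under these assumptions, a corpus whose document vectors all point the same way yields a mean distance near 0, while a corpus of mutually orthogonal vectors yields a value near 1.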


Identifying Regional Trends In Avatar Customization, Peter Mawhorter, Sercan Sengun, Haewoon Kwak, D. Fox Harrell Dec 2019

Research Collection School Of Computing and Information Systems

Since virtual identities such as social media profiles and avatars have become a common venue for self-expression, it has become important to consider the ways in which existing systems embed the values of their designers. In order to design virtual identity systems that reflect the needs and preferences of diverse users, understanding how the virtual identity construction differs between groups is important. This paper presents a new methodology that leverages deep learning and differential clustering for comparative analysis of profile images, with a case study of almost 100 000 avatars from a large online community using a popular avatar creation …


Smartphone Sensing Meets Transport Data: A Collaborative Framework For Transportation Service Analytics, Yu Lu, Archan Misra, Wen Sun, Huayu Wu Aug 2017

Research Collection School Of Computing and Information Systems

We advocate for and introduce TRANSense, a framework for urban transportation service analytics that combines participatory smartphone sensing data with city-scale transportation-related transactional data (taxis, trains etc.). Our work is driven by the observed limitations of using each data type in isolation: (a) commonly-used anonymous city-scale datasets (such as taxi bookings and GPS trajectories) provide insights into the aggregate behavior of transport infrastructure, but fail to reveal individual-specific transport experiences (e.g., wait times in taxi queues); while (b) mobile sensing data can capture individual-specific commuting-related activities, but suffers from accuracy and energy overhead challenges due to usage artefacts and lack …


Harnessing The Power Of Text Mining For The Detection Of Abusive Content In Social Media, Hao Chen, Susan McKeever, Sarah Jane Delany Jan 2016

Conference papers

The issues of cyberbullying and online harassment have gained considerable coverage in the last number of years. Social media providers need to be able to detect abusive content both accurately and efficiently in order to protect their users. Our aim is to investigate the application of core text mining techniques for the automatic detection of abusive content across a range of social media sources including blogs, forums, media-sharing, Q&A and chat - using datasets from Twitter, YouTube, MySpace, Kongregate, Formspring and Slashdot. Using supervised machine learning, we compare alternative text representations and dimension reduction approaches, including feature selection and …


An Approach To Nearest Neighboring Search For Multi-Dimensional Data, Yong Shi, Li Zhang, Lei Zhu Mar 2011

Faculty Articles

Finding nearest neighbors in large multi-dimensional data has always been one of the research interests in the data mining field. In this paper, we present our continuing research on similarity search problems. Previously we have worked on exploring the meaning of K nearest neighbors from a new perspective in PanKNN [20]. It redefines the distances between data points and a given query point Q, efficiently and effectively selecting data points which are closest to Q. It can be applied in various data mining fields. A large number of real data sets contain irrelevant or obstacle information which greatly affects the effectiveness …
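
For readers unfamiliar with the baseline problem the abstract above refers to, the following is a minimal sketch of plain brute-force K-nearest-neighbor search using Euclidean distance. It is not PanKNN and does not reproduce the redefined distances described by the authors; the function name `k_nearest` and the sample points are hypothetical, chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def k_nearest(points: Sequence[Point], query: Point, k: int) -> List[Point]:
    """Return the k points closest to `query` by Euclidean distance.

    Brute force: every point is compared with the query, so the cost grows
    linearly with the data size (plus the sort). Ties keep input order
    because Python's sort is stable.
    """
    if k <= 0:
        return []
    ranked = sorted(points, key=lambda p: math.dist(p, query))
    return ranked[:k]


# Example usage with made-up 2-D data and a query point Q.
data = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (2.0, 2.0)]
Q = (0.0, 0.0)
nearest_two = k_nearest(data, Q, 2)
```

The sketch treats every dimension equally, which is exactly the situation in which irrelevant dimensions can distort the ranking.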


SGPM: Static Group Pattern Mining Using Apriori-Like Sliding Window, John Goh, David Taniar, Ee Peng Lim Apr 2006

Research Collection School Of Computing and Information Systems

Mobile user data mining is a field that focuses on extracting interesting patterns and knowledge from data generated by mobile users. Group pattern mining is a type of mobile user data mining method. In group pattern mining, group patterns are found in a given user movement database based on spatio-temporal distances. In this paper, we propose an efficiency improvement that uses an area method for locating mobile users and a sliding window for static group pattern mining. This reduces the complexity of the valid group pattern mining problem. We support the use of the static method, which uses areas and sliding windows instead …


FISA: Feature-Based Instance Selection For Imbalanced Text Classification, Aixin Sun, Ee Peng Lim, Boualem Benatallah, Mahbub Hassan Apr 2006

Research Collection School Of Computing and Information Systems

Support Vector Machine (SVM) classifiers are widely used in text classification tasks, and these tasks often involve imbalanced training. In this paper, we specifically address the cases where negative training documents significantly outnumber the positive ones. A generic algorithm known as FISA (Feature-based Instance Selection Algorithm) is proposed to select only a subset of negative training documents for training an SVM classifier. With a smaller, carefully selected training set, an SVM classifier can be more efficiently trained while delivering comparable or better classification accuracy. In our experiments on the 20-Newsgroups dataset, using only 35% negative training examples and 60% learning …


Keynote: The Use Of Meta-Heuristic Algorithms For Data Mining, Dr. Beatrize De La Iglesia, A. Reynolds Aug 2005

International Conference on Information and Communication Technologies

In this paper we explore the application of powerful optimisers known as metaheuristic algorithms to problems within the data mining domain. We introduce some well-known data mining problems, and show how they can be formulated as optimisation problems. We then review the use of metaheuristics in this context. In particular, we focus on the task of partial classification and show how multi-objective metaheuristics have produced results that are comparable to the best known techniques but more scalable to large databases. We conclude by reinforcing the importance of research on the areas of metaheuristics for optimisation and data mining. The combination …