Open Access. Powered by Scholars. Published by Universities.®
Articles 1 - 23 of 23
Full-Text Articles in Computer Engineering
Digitalization Of Construction Project Requirements Using Natural Language Processing (NLP) Techniques, Fahad Ul Hassan
All Dissertations
Contract documents are a critical legal component of a construction project that specify all wishes and expectations of the owner toward the design, construction, and handover of a project. A single contract package, especially of a design-build (DB) project, comprises hundreds of documents including thousands of requirements. Precise comprehension and management of the requirements are critical to ensure that all important explicit and implicit requirements of the project scope are captured, managed, and completed. Since requirements are mainly written in a natural human language, the current manual methods impose a significant burden on practitioners to process and restructure them into …
Data-Driven Operational And Safety Analysis Of Emerging Shared Electric Scooter Systems, Qingyu Ma
Computational Modeling & Simulation Engineering Theses & Dissertations
The rapid rise of shared electric scooter (E-Scooter) systems offers many urban areas a new micro-mobility solution. The portable and flexible characteristics have made E-Scooters a competitive mode for short-distance trips. Compared to other modes such as bikes, E-Scooters allow riders to freely ride on different facilities such as streets, sidewalks, and bike lanes. However, sharing lanes with vehicles and other users tends to cause safety issues for riding E-Scooters. Conventional methods are often not applicable for analyzing such safety issues because well-archived historical crash records are not commonly available for emerging E-Scooters.
Perceiving the growth of such a micro-mobility …
Data Mining Of Unstructured Textual Information In Transportation Safety Domain: Exploring Methods, Opportunities And Limitations, Keneth Morgan Kwayu
Dissertations
The unprecedented increase in volume and influx of structured and unstructured data has overwhelmed conventional data management system capabilities in organizing, analyzing, and procuring useful information in a timely fashion. Structured data sources have a pre-defined pattern that makes data preprocessing and information retrieval tasks relatively easy for current technologies, which have been designed to handle structured and repeatable data. Unlike structured data, unstructured data usually exists in an unorganized format that offers little or no insight unless indexed and stored in an organized fashion. The inherent format of unstructured data exacerbates difficulties in data preprocessing and information extraction. …
Hybrid Deep Neural Networks For Mining Heterogeneous Data, Xiurui Hou
Dissertations
In the era of big data, the rapidly growing flood of data represents an immense opportunity. New computational methods are desired to fully leverage the potential that exists within massive structured and unstructured data. However, decision-makers are often confronted with multiple diverse heterogeneous data sources. The heterogeneity includes different data types, different granularities, and different dimensions, posing a fundamental challenge in many applications. This dissertation focuses on designing hybrid deep neural networks for modeling various kinds of data heterogeneity.
The first part of this dissertation concerns modeling diverse data types, the first kind of data heterogeneity. Specifically, image data and …
On I/O Performance And Cost Efficiency Of Cloud Storage: A Client's Perspective, Binbing Hou
LSU Doctoral Dissertations
Cloud storage has gained increasing popularity in the past few years. In cloud storage, data are stored in the service provider’s data centers; users access data via the network and pay fees based on service usage. For such a new storage model, our prior wisdom and optimization schemes for conventional storage may no longer be valid or applicable.
In this dissertation, we focus on understanding and optimizing the I/O performance and cost efficiency of cloud storage from a client’s perspective. We first conduct a comprehensive study to gain insight into the I/O performance behaviors …
Feature Space Modeling For Accurate And Efficient Learning From Non-Stationary Data, Ayesha Akter
Doctoral Dissertations
A non-stationary dataset is one whose statistical properties, such as the mean, variance, correlation, and probability distribution, change over a specific interval of time. In contrast, a stationary dataset is one whose statistical properties remain constant over time. Apart from the volatile statistical properties, non-stationary data poses other challenges, such as time and memory management under limited computational resources, mostly caused by recent advancements in data collection technologies that generate a variety of data at an alarming pace and volume. Additionally, when the collected data is complex, managing data complexity, emerging from its dimensionality and …
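The stationary versus non-stationary distinction above can be illustrated with a minimal sketch that compares the mean and variance of consecutive windows of a series. This is only an illustration of the definition, not the method of the dissertation; the function names, the non-overlapping windows, and the `tol` threshold are all assumptions made for the example.

```python
from statistics import mean, pvariance


def window_stats(series, window):
    """Return (mean, variance) for each consecutive non-overlapping window."""
    return [
        (mean(series[i:i + window]), pvariance(series[i:i + window]))
        for i in range(0, len(series) - window + 1, window)
    ]


def looks_non_stationary(series, window, tol):
    """Flag the series if window means or variances spread by more than tol.

    tol is a hypothetical threshold chosen only for illustration.
    """
    stats = window_stats(series, window)
    means = [m for m, _ in stats]
    variances = [v for _, v in stats]
    return (max(means) - min(means) > tol) or (max(variances) - min(variances) > tol)


# Same statistics in both halves versus a shift in the mean halfway through.
stationary = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0]
drifting = [1.0, 2.0, 1.0, 2.0, 6.0, 7.0, 6.0, 7.0]

print(looks_non_stationary(stationary, window=4, tol=1.0))
print(looks_non_stationary(drifting, window=4, tol=1.0))
```

A fixed threshold on window statistics is the simplest possible check; formal statistical tests would be the usual choice in practice.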
Analyzing Evolution Of Rare Events Through Social Media Data, Xiaoyu Lu
Dissertations
Recently, some researchers have attempted to find a relationship between the evolution of rare events and temporal-spatial patterns of social media activities. Their studies verify that the relationship exists in both time and spatial domains. However, few of those studies can accurately deduce a time point when social media activities are most highly affected by a rare event because producing an accurate temporal pattern of social media during the evolution of a rare event is very difficult. This work expands the current studies along three directions. Firstly, we focus on the intensity of information volume and propose an innovative clustering …
Towards Efficient Intrusion Detection Using Hybrid Data Mining Techniques, Fadi Salo
Electronic Thesis and Dissertation Repository
The enormous development in the connectivity among different types of networks poses significant concerns in terms of privacy and security. At the same time, the exponential expansion in the deployment of cloud technology has produced a massive amount of data from a variety of applications, resources, and platforms. In turn, the rapid rate and volume of high-dimensional data creation have begun to pose significant challenges for data management and security. Handling redundant and irrelevant features in high-dimensional space has been a long-standing challenge for network anomaly detection. Eliminating such features with spectral information not only speeds up the classification process, but …
Analyzing And Modeling Users In Multiple Online Social Platforms, Roy Lee Ka Wei
Dissertations and Theses Collection (Open Access)
This dissertation addresses the empirical analysis of user-generated data from multiple online social platforms (OSPs) and the modeling of latent user factors in a multiple-OSP setting.
In the first part of this dissertation, we conducted cross-platform empirical studies to better understand users' social and work activities in multiple OSPs. In particular, we proposed new methodologies to analyze users' friendship maintenance and collaborative activities in multiple OSPs. We also applied the proposed methodologies to real-world OSP datasets, and the findings from our empirical studies have provided us with a better understanding of users' social and work activities that were not previously uncovered …
Horse Racing Prediction Using Graph-Based Features., Mehmet Akif Gulum
Electronic Theses and Dissertations
This thesis presents an applied approach to horse racing prediction using graph-based features on a set of horse race data. We trained and tested artificial neural network and logistic regression models to make predictions both without and with graph-based features. The thesis consists of four main parts: (1) collect data from a horse racing website for races held from 2015 to 2017; (2) train predictive models on the data and make predictions; (3) create a global directed graph of horses and extract graph-based features (the core part); (4) add the graph-based features to the basic features and train using the same …
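As a rough sketch of what the graph step could look like: the abstract does not say how edges are defined or which features are extracted, so the code below assumes an edge from each horse to every horse that finished ahead of it in the same race, and uses in-degree and out-degree as example features. The function names and the sample races are hypothetical.

```python
from collections import defaultdict


def build_win_graph(races):
    """Build a directed graph from finishing orders (first element = winner).

    Assumed edge definition: loser -> winner for every pair in a race.
    Returns two adjacency maps so both edge directions are easy to query.
    """
    beat = defaultdict(set)        # horse -> horses it finished ahead of
    beaten_by = defaultdict(set)   # horse -> horses that finished ahead of it
    for order in races:
        for i, ahead in enumerate(order):
            for behind in order[i + 1:]:
                beat[ahead].add(behind)
                beaten_by[behind].add(ahead)
    return beat, beaten_by


def degree_features(horse, beat, beaten_by):
    """In-degree = distinct horses beaten; out-degree = distinct horses lost to."""
    return {
        "in_degree": len(beat.get(horse, ())),
        "out_degree": len(beaten_by.get(horse, ())),
    }


# Hypothetical finishing orders for two races.
races = [["A", "B", "C"], ["B", "A", "D"]]
beat, beaten_by = build_win_graph(races)
features = {h: degree_features(h, beat, beaten_by) for h in "ABCD"}
print(features)
```

Features like these could then be appended to each horse's basic feature vector before training, in the spirit of the fourth step described above.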
Maintainability Analysis Of Mining Trucks With Data Analytics., Abdulgani Kahraman
Electronic Theses and Dissertations
The mining industry is one of the largest and most capital-intensive industries, and current global economic challenges force the industry to reduce its production expenses. One of the biggest expenditures is maintenance. Thanks to data mining techniques, available historical records of machines’ alarms and signals might be used to predict machine failures. This is crucial because repairing machines after failures is not as efficient as utilizing predictive maintenance. In this case study, the reasons for failures seem to be related to the order of signals or alarms, called events, which come from trucks. The …
Distributed Knowledge Discovery For Diverse Data, Hossein Hamooni
Computer Science ETDs
In the era of new technologies, computer scientists deal with massive datasets hundreds of terabytes in size. Smart cities, social networks, health care systems, large sensor networks, etc. are constantly generating new data. It is non-trivial to extract knowledge from big datasets because traditional data mining algorithms are impractical to run on them. Distributed systems help address this problem, but they introduce new challenges in designing scalable algorithms. The transition from traditional algorithms to ones that can run on a distributed platform should be done carefully. Researchers should design the modern distributed algorithms based on the …
Peeking Into The Other Half Of The Glass : Handling Polarization In Recommender Systems., Mahsa Badami
Electronic Theses and Dissertations
This dissertation is about filtering and discovering information online while using recommender systems. In the first part of our research, we study the phenomenon of polarization and its impact on filtering and discovering information. Polarization is a social phenomenon with serious consequences in real life, particularly on social media. Thus it is important to understand how machine learning algorithms, especially recommender systems, behave in polarized environments. We study polarization within the context of the users' interactions with a space of items and how this affects recommender systems. We first formalize the concept of polarization based on item ratings and then relate …
Network Analysis With Stochastic Grammars, Alan C. Lin
Theses and Dissertations
Digital forensics requires significant manual effort to identify items of evidentiary interest from the ever-increasing volume of data in modern computing systems. One of the tasks digital forensic examiners conduct is mentally extracting and constructing insights from unstructured sequences of events. This research assists examiners with the association and individualization analysis processes that make up this task by developing a Stochastic Context-Free Grammar (SCFG) knowledge representation for digital forensics analysis of computer network traffic. SCFG is leveraged to provide context to the low-level data collected as evidence and to build behavior profiles. Upon discovering patterns, the analyst …
Text Stylometry For Chat Bot Identification And Intelligence Estimation., Nawaf Ali
Electronic Theses and Dissertations
Authorship identification is a technique used to identify the author of an unclaimed document by attempting to find traits that match those of the original author. Authorship identification has great potential for applications in forensics. It can also be used to identify chat bots, a form of intelligent software created to mimic human conversation, by their unique style. The online criminal community is utilizing chat bots as a new way to steal private information and commit fraud and identity theft. The need for identifying chat bots by their style is becoming essential to overcome the danger of …
Rank Based Anomaly Detection Algorithms, Huaming Huang
Electrical Engineering and Computer Science - Dissertations
Anomaly or outlier detection problems are of considerable importance, arising frequently in diverse real-world applications such as finance and cyber-security. Several algorithms have been formulated for such problems, usually based on formulating a problem-dependent heuristic or distance metric. This dissertation proposes anomaly detection algorithms that exploit the notion of "rank," expressing relative outlierness of different points in the relevant space, and exploiting asymmetry in nearest neighbor relations between points: a data point is "more anomalous" if it is not the nearest neighbor of its nearest neighbors. Although rank is computed using distance, it is a more robust and higher-level …
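The asymmetry idea in the abstract can be sketched as follows: for each of a point's k nearest neighbors, look up how highly that neighbor ranks the point in its own neighbor ordering, then average those ranks. This is a simplified illustration under assumptions (Euclidean distance, ties broken by index, a plain average), not necessarily the exact algorithms proposed in the dissertation; the function names and sample points are hypothetical.

```python
import math


def neighbors_by_distance(q, points):
    """Indices of all other points, ordered from nearest to farthest from q."""
    return sorted(
        (i for i in range(len(points)) if i != q),
        key=lambda i: (math.dist(points[q], points[i]), i),  # index breaks ties
    )


def rank_of(p, q, points):
    """Rank of point p in q's neighbor ordering (1 = q's nearest neighbor)."""
    return neighbors_by_distance(q, points).index(p) + 1


def rank_outlierness(p, points, k):
    """Average rank that p receives from its own k nearest neighbors.

    A large value means p's neighbors do not regard p as a near neighbor.
    """
    knn = neighbors_by_distance(p, points)[:k]
    return sum(rank_of(p, q, points) for q in knn) / k


# Four clustered points and one distant point (illustrative data).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
scores = [rank_outlierness(p, points, k=2) for p in range(len(points))]
print(scores)
```

In this toy data the distant point receives the largest score, because each of its nearest neighbors has several closer neighbors of its own.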
An Efficient Algorithm To Solve High-Dimensional Data Clustering: Candidate Subspace Clustering Algorithm, Chin-Chieh Kao
Theses Digitization Project
For this project, a comprehensive literature review on high dimensional data clustering is conducted and a novel density-algorithm to perform high dimensional data clustering is developed.
Semi-Automatic Simulation Initialization By Mining Structured And Unstructured Data Formats From Local And Web Data Sources, Olcay Sahin
Computational Modeling & Simulation Engineering Theses & Dissertations
Initialization is one of the most important processes for obtaining successful results from a simulation. However, initialization is a challenge when 1) a simulation requires hundreds or even thousands of input parameters or 2) the simulation must be re-initialized due to different initial conditions or runtime errors. These challenges lead to the modeler spending more time initializing a simulation and may lead to errors due to poor input data.
This thesis proposes two semi-automatic simulation initialization approaches that provide initialization using data mining from structured and unstructured data formats from local and web data sources. First, the System Initialization with Retrieval (SIR) …
Measuring MERCI: Exploring Data Mining Techniques For Examining Surgical Outcomes Of Stroke Patients, Matthew Ronald Mcnabb
Masters Theses and Doctoral Dissertations
Mechanical Embolus Removal in Cerebral Ischemia (MERCI) has been supported by medical trials as an improved method of treating ischemic stroke past the safe window of time for administering clot-busting drugs, and was released for medical use in 2004. Analyzing real-world data collected from MERCI clinical trials is key to providing insights into the effectiveness of MERCI. Most of the existing data analysis on MERCI results has thus far employed conventional statistical analysis techniques. To the best of the knowledge acquired in preliminary research, advanced data analytics and data mining techniques have not yet been systematically applied. …
An Interactive Visualization Model For Analyzing Data Storage System Workloads, Steven Charubhat Pungdumri
Master's Theses
The performance of hard disks has become increasingly important as the volume of data storage increases. At the bottom level of large-scale storage networks is the hard disk. Despite the importance of hard drives in a storage network, it is often difficult to analyze the performance of hard disks due to the sheer size of the datasets seen by hard disks. Additionally, hard drive workloads can have several multi-dimensional characteristics, such as access time, queue depth and block-address space. The result is that hard drive workloads are extremely diverse and large, making extracting meaningful information from hard drive workloads very …
Data Mining Based Learning Algorithms For Semi-Supervised Object Identification And Tracking, Michael P. Dessauer
Doctoral Dissertations
Sensor exploitation (SE) is the crucial step in surveillance applications such as airport security and search and rescue operations. It allows localization and identification of movement in urban settings and can significantly boost knowledge gathering, interpretation and action. Data mining techniques offer the promise of precise and accurate knowledge acquisition in high-dimensional data domains (diminishing the “curse of dimensionality” prevalent in such datasets), coupled with algorithmic design in feature extraction, discriminative ranking, feature fusion and supervised learning (classification). Consequently, data mining techniques and algorithms can be used to refine and process captured data and to detect, recognize, classify, …
Dynamic Application Level Security Sensors, Christopher Thomas Rathgeb
Masters Theses
The battle for cyber supremacy is a cat and mouse game: evolving threats from internal and external sources make it difficult to protect critical systems. With the diverse and high risk nature of these threats, there is a need for robust techniques that can quickly adapt and address this evolution. Existing tools such as Splunk, Snort, and Bro help IT administrators defend their networks by actively parsing through network traffic or system log data. These tools have been thoroughly developed and have proven to be a formidable defense against many cyberattacks. However, they are vulnerable to zero-day attacks, slow attacks, …
Multivariate Discretization Of Continuous Valued Attributes., Ehab Ahmed El Sayed Ahmed 1978-
Electronic Theses and Dissertations
The area of knowledge discovery and data mining is growing rapidly. Feature discretization is a crucial issue in Knowledge Discovery in Databases (KDD), or data mining, because most data sets used in real-world applications have features with continuous values. Discretization is performed as a preprocessing step to make data mining techniques useful for these data sets. This thesis addresses the discretization issue by proposing a multivariate discretization (MVD) algorithm. It begins with a number of common discretization algorithms such as equal width discretization, equal frequency discretization, naïve, entropy-based discretization, chi-square discretization, and orthogonal hyperplanes. After …
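Two of the common univariate baselines named above, equal width and equal frequency discretization, can be sketched in a few lines. This is a minimal illustration of those two standard techniques, not the thesis's multivariate (MVD) algorithm; the function names and the sample data are assumptions made for the example.

```python
def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so everything falls into a single bin.
        return [0 for _ in values]
    # The maximum value would compute to index n_bins, so clamp it
    # into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def equal_frequency_bins(values, n_bins):
    """Assign values to n_bins holding roughly the same number of values.

    Note: this simple version may place equal values in different bins.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for position, index in enumerate(order):
        bins[index] = position * n_bins // len(values)
    return bins


# Illustrative data with one extreme value.
data = [1, 2, 3, 4, 5, 100]
print(equal_width_bins(data, 3))      # [0, 0, 0, 0, 0, 2]
print(equal_frequency_bins(data, 3))  # [0, 0, 1, 1, 2, 2]
```

The example shows why the choice matters: a single extreme value pushes almost all points into one equal-width bin, while equal-frequency binning keeps the bins balanced.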