Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

41,036 Full-Text Articles 45,816 Authors 15,325,331 Downloads 332 Institutions

All Articles in Computer Sciences

Exposing And Fixing Causes Of Inconsistency And Nondeterminism In Clustering Implementations, Xin Yin 2021 New Jersey Institute of Technology

Dissertations

Cluster analysis, also known as clustering, is used in myriad applications, including high-stakes domains, by millions of users. Clustering users should be able to assume that clustering implementations are correct, reliable, and for a given algorithm, interchangeable. Based on observations in a wide range of real-world clustering implementations, this dissertation challenges the aforementioned assumptions. This dissertation introduces an approach named SmokeOut that uses differential clustering to show that clustering implementations suffer from nondeterminism and inconsistency: on a given input dataset and using a given clustering algorithm, clustering outcomes and accuracy vary widely between (1) successive runs of the same toolkit, i.e., nondeterminism ...


Enterprise Environment Modeling For Penetration Testing On The OpenStack Virtualization Platform, Vincent Karovič Jr., Jakub Bartaloš, Vincent Karovič, Michal Greguš 2021 Comenius University

Journal of Global Business Insights

The article presents the design of a model environment for penetration testing of an organization using virtualization. The need for this model was based on the constantly increasing requirements for the security of information systems, both in legal terms and in accordance with international security standards. The model was created based on a specific team from an unnamed company. The virtual working environment offered the same functions as the physical environment. The virtual working environment was created in OpenStack and tested with the Kali Linux distribution. We demonstrated that the virtual environment is functional and its security testable. Virtualizing ...


Modelling Customers Credit Card Behaviour Using Bidirectional LSTM Neural Networks, Maher Ala’raj, Maysam F. Abbod, Munir Majdalawieh 2021 Zayed University

All Works

With the rapid growth of consumer credit and the huge amount of financial data, developing effective credit scoring models is crucial. Researchers have developed complex credit scoring models using statistical and artificial intelligence (AI) techniques to help banks and financial institutions support their financial decisions. Neural networks are considered one of the most widely used techniques in finance and business applications. Thus, the main aim of this paper is to help bank management in scoring credit card clients using machine learning by modelling and predicting the consumer behaviour with respect to two aspects: the probability of single and consecutive ...


QLens: Visual Analytics Of Multi-Step Problem-Solving Behaviors For Improving Question Design, Meng XIA, Reshika P. VELUMANI, Yong WANG, Huamin QU, Xiaojuan MA 2021 Singapore Management University

Research Collection School Of Computing and Information Systems

With the rapid development of online education in recent years, there has been an increasing number of learning platforms that provide students with multi-step questions to cultivate their problem-solving skills. To guarantee the high quality of such learning materials, question designers need to inspect how students’ problem-solving processes unfold step by step to infer whether students’ problem-solving logic matches their design intent. They also need to compare the behaviors of different groups (e.g., students from different grades) to distribute questions to students with the right level of knowledge. The availability of fine-grained interaction data, such as mouse movement trajectories ...


TaxThemis: Interactive Mining And Exploration Of Suspicious Tax Evasion Group, Yating LIN, Kamkwai WONG, Yong WANG, Rong ZHANG, Bo DONG, Huamin QU, Qinghua ZHENG 2021 Singapore Management University

Research Collection School Of Computing and Information Systems

Tax evasion is a serious economic problem for many countries, as it can undermine the government’s tax system and lead to an unfair business competition environment. Recent research has applied data analytics techniques to analyze and detect tax evasion behaviors of individual taxpayers. However, they have failed to support the analysis and exploration of the related party transaction tax evasion (RPTTE) behaviors (e.g., transfer pricing), where a group of taxpayers is involved. In this paper, we present TaxThemis, an interactive visual analytics system to help tax officers mine and explore suspicious tax evasion groups through analyzing heterogeneous tax-related ...


Visual Analysis Of Discrimination In Machine Learning, Qianwen WANG, Zhenghua XU, Zhutian CHEN, Yong WANG, Yong WANG, Huamin Qu 2021 Singapore Management University

Research Collection School Of Computing and Information Systems

The growing use of automated decision-making in critical applications, such as crime prediction and college admission, has raised questions about fairness in machine learning. How can we decide whether different treatments are reasonable or discriminatory? In this paper, we investigate discrimination in machine learning from a visual analytics perspective and propose an interactive visualization tool, DiscriLens, to support a more comprehensive analysis. To reveal detailed information on algorithmic discrimination, DiscriLens identifies a collection of potentially discriminatory itemsets based on causal modeling and classification rules mining. By combining an extended Euler diagram with a matrix-based visualization, we develop a novel set ...


TradAO: A Visual Analytics System For Trading Algorithm Optimization, Ka Wing TSANG, Haotian LI, Fuk Ming LAM, Yifan MU, Yong WANG, Huamin QU 2021 Singapore Management University

Research Collection School Of Computing and Information Systems

With the wide applications of algorithmic trading, it has become critical for traders to build a winning trading algorithm to beat the market. However, due to the lack of efficient tools, traders mainly rely on their memory to manually compare the algorithm instances of a trading algorithm and further select the best trading algorithm instance for the real trading deployment. We work closely with industry practitioners to discover and consolidate user requirements and develop an interactive visual analytics system for trading algorithm optimization. Structured expert interviews are conducted to evaluate TradAO, and a representative case study is documented for illustrating the system ...


Automatic Cerebrovascular Segmentation Methods - A Review, Fatma Taher, Neema Prakash 2021 Zayed University

All Works

Cerebrovascular diseases, which affect the blood vessels and blood supply to the brain, are one of the serious causes of the increase in mortality rate worldwide. In order to diagnose and study abnormalities in the cerebrovascular system, accurate segmentation methods can be used. The shape, direction and distribution of blood vessels can be studied using automatic segmentation. This will help doctors envisage the cerebrovascular system. Due to the complex shape and topology, automatic segmentation remains a challenge for clinicians. In this paper, some of the latest approaches used for segmentation of magnetic resonance angiography ...


SPLASH: Learnable Activation Functions For Improving Accuracy And Adversarial Robustness, Mohammadamin Tavakoli, Forest Agostinelli, Pierre Baldi 2021 University of California, Irvine

Publications

We introduce SPLASH units, a class of learnable activation functions shown to simultaneously improve the accuracy of deep neural networks while also improving their robustness to adversarial attacks. SPLASH units have a simple parameterization and maintain the ability to approximate a wide range of non-linear functions. SPLASH units (1) are continuous; (2) are grounded (f(0)=0); (3) use symmetric hinges; and (4) have hinges placed at fixed locations that are derived from the data (i.e., no learning required). Compared to nine other learned and fixed activation functions, including ReLU and its variants, SPLASH units show superior performance ...
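The four properties listed in the abstract can be illustrated with a small piecewise-linear function. This is only a sketch under stated assumptions: the abstract does not give the exact parameterization, so the form below (a sum of hinge terms mirrored around zero), the function name `splash_like`, and the parameter names `hinges`, `a_pos`, and `a_neg` are hypothetical choices that merely satisfy continuity, f(0)=0, and symmetric fixed hinge locations.

```python
def splash_like(x, hinges, a_pos, a_neg):
    """Piecewise-linear activation with symmetric hinges (illustrative only).

    hinges: fixed, non-negative hinge locations b_s (assumed given, not learned)
    a_pos:  slopes applied to the right-hand hinges max(0, x - b_s)
    a_neg:  slopes applied to the mirrored hinges max(0, -x - b_s)
    """
    total = 0.0
    for b, ap, an in zip(hinges, a_pos, a_neg):
        # Each term is continuous, and both terms vanish at x = 0 when b >= 0,
        # so the sum is continuous and grounded (f(0) = 0).
        total += ap * max(0.0, x - b) + an * max(0.0, -x - b)
    return total


# With a single hinge at 0, slope 1 on the right and 0 on the left,
# the function reduces to ReLU.
relu_at_2 = splash_like(2.0, [0.0], [1.0], [0.0])
relu_at_minus_2 = splash_like(-2.0, [0.0], [1.0], [0.0])
```

In this sketch only the slopes would be trainable; the hinge locations are passed in as constants, matching the abstract's statement that they are fixed.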


Spatial Analyses Of Gray Fossil Site Vertebrate Remains: Implications For Depositional Setting And Site Formation Processes, David Carney 2021 East Tennessee State University

Electronic Theses and Dissertations

This project uses exploratory 3D geospatial analyses to assess the taphonomy of the Gray Fossil Site (GFS). During the Pliocene, the GFS was a forested, inundated sinkhole that accumulated biological materials between 4.9 and 4.5 mya. This deposit contains fossils exhibiting different preservation modes: from low energy lacustrine settings to high energy colluvial deposits. All macro-paleontological materials have been mapped in situ using survey-grade instrumentation. Vertebrate skeletal material from the site is well-preserved, but the degree of skeletal articulation varies spatially within the deposit. This analysis uses geographic information systems (GIS) to analyze the distribution of mapped specimens at different ...


Algorithms For Covering Barrier Points By Mobile Sensors With Line Constraint, Princy Jain 2021 Utah State University

All Graduate Theses and Dissertations

In this thesis, we develop efficient algorithms for the problem of covering barrier points by mobile sensors. Each sensor is represented by a point in the plane with the same covering range r so that any point within distance r from the sensor can be covered by the sensor. Given a set B of m points (called “barrier points”) and a set S of n points (representing the “sensors”) in the plane, the problem is to move the sensors so that each barrier point is covered by at least one sensor and the maximum movement of all sensors is minimized ...
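The problem statement above can be made concrete with a small feasibility check. The sketch below is not the thesis's algorithm and ignores the line constraint named in the title; it only tests whether a candidate placement covers every barrier point and reports the objective value (the maximum distance any sensor moved). The function names and sample coordinates are hypothetical.

```python
import math


def all_covered(barrier_points, sensors, r):
    """True if every barrier point lies within distance r of some sensor."""
    return all(
        any(math.dist(b, s) <= r for s in sensors)
        for b in barrier_points
    )


def max_movement(original, moved):
    """Objective value: the largest distance travelled by any single sensor."""
    return max(math.dist(a, b) for a, b in zip(original, moved))


# Hypothetical instance: two barrier points, two sensors, covering range r = 1.
barrier = [(0.0, 0.0), (4.0, 0.0)]
start = [(0.0, 3.0), (4.0, 3.0)]
candidate = [(0.0, 1.0), (4.0, 0.5)]

covered_before = all_covered(barrier, start, 1.0)
covered_after = all_covered(barrier, candidate, 1.0)
cost = max_movement(start, candidate)
```

An optimization algorithm for this problem would search over placements that pass `all_covered` and pick one with the smallest `max_movement`; the check here only evaluates a single given placement.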


LogBERT: Log Anomaly Detection Via BERT, Haixuan Guo 2021 Utah State University

All Graduate Theses and Dissertations

When systems break down, administrators usually check the produced logs to diagnose the failures. Nowadays, systems grow larger and more complicated. It is labor-intensive to manually detect abnormal behaviors in logs. Therefore, it is necessary to develop automated anomaly detection for system logs. Automated anomaly detection not only identifies malicious patterns promptly but also requires no prior domain knowledge. Many existing log anomaly detection approaches apply natural language models such as recurrent neural networks (RNNs) to log analysis since both are based on sequential data. The proposed model, LogBERT, a BERT-based neural network, can capture the contextual information in ...


Fixed Pattern Noise Non-Uniformity Correction Through K-Means Clustering, Andres Imperial 2021 Utah State University

All Graduate Theses and Dissertations

Imagery obtained with poorly calibrated sensors is often corrupted with fixed pattern noise. Fixed pattern noise presents itself through a non-uniform distribution and therefore is hard to target in noise removal. Traditional noise removal techniques assume that the noise is uniformly distributed and subsequently produce inadequate corrections. Noise correction methods that target fixed pattern noise rely on dynamically identifying present noise and adjusting correction values appropriately using nearby information or general assumptions about the image’s composition. If noise identification is not accurate, the correction values will also suffer from low accuracy. Inaccurate correction values can affect the imagery’s ...


Comparative Study Of Machine Learning Models On Solar Flare Prediction Problem, Nikhil Sai Kurivella 2021 Utah State University

All Graduate Theses and Dissertations

Solar flare events are explosions of energy and radiation from the Sun’s surface. These events occur due to the tangling and twisting of magnetic fields associated with sunspots. When coronal mass ejections accompany solar flares, solar storms could travel towards Earth at very high speeds, disrupting all earthly technologies and posing radiation hazards to astronauts. For this reason, the prediction of solar flares has become a crucial aspect of forecasting space weather. Our thesis utilized the time-series data consisting of active solar region magnetic field parameters acquired from SDO that span more than eight years. The classification models take ...


Breast Ultrasound Image Segmentation Based On Uncertainty Reduction And Context Information, Kuan Huang 2021 Utah State University

All Graduate Theses and Dissertations

Breast cancer occurs frequently in women all over the world. It was one of the most serious diseases and the second most common cancer among women in 2019. The survival rate of stages 0 and 1 of breast cancer is close to 100%. It is urgent to develop an approach that can detect breast cancer in the early stages. Breast ultrasound (BUS) imaging is low-cost, portable, and effective; therefore, it has become the most crucial approach for breast cancer diagnosis. However, BUS images are of poor quality, low contrast, and uncertain. The computer-aided diagnosis (CAD) system is developed for breast cancer to prevent ...


A BERT-Based Two-Stage Model For Chinese Chengyu Recommendation, Minghuan TAN, Jing JIANG, Bingtian DAI 2021 Singapore Management University

Research Collection School Of Computing and Information Systems

In Chinese, Chengyu are fixed phrases consisting of four characters. As a type of idiom, their meanings usually cannot be derived from their component characters. In this paper, we study the task of recommending a Chengyu given a textual context. Observing some of the limitations of existing work, we propose a two-stage model, where during the first stage we re-train a Chinese BERT model by masking out Chengyu from a large Chinese corpus with a wide coverage of Chengyu. During the second stage, we fine-tune the retrained, Chengyu-oriented BERT on a specific Chengyu recommendation dataset. We evaluate this method on ...


Hierarchical Mapping For Crosslingual Word Embedding Alignment, Ion Madrazo Azpiazu, Maria Soledad Pera 2021 Boise State University

Computer Science Faculty Publications and Presentations

The alignment of word embedding spaces in different languages into a common crosslingual space has recently been in vogue. Strategies that do so compute pairwise alignments and then map multiple languages to a single pivot language (most often English). These strategies, however, are biased towards the choice of the pivot language, given that language proximity and the linguistic characteristics of the target language can strongly impact the resultant crosslingual space to the detriment of topologically distant languages. We present a strategy that eliminates the need for a pivot language by learning the mappings across languages in a hierarchical way. Experiments demonstrate that ...


Performance Evaluation Of Byzantine Fault Detection In Primary/Backup Systems, Sushant Mane 2021 San Jose State University

Master's Projects

ZooKeeper masks crash failures of servers to provide a highly available, distributed coordination kernel; however, in production, not all failures are crash failures. Bugs in underlying software systems and hardware can corrupt the ZooKeeper replicas, leading to data loss. Since ZooKeeper is used as a ‘source of truth’ for mission-critical applications, it should handle such arbitrary faults to safeguard reliability. Byzantine fault-tolerant (BFT) protocols were developed to handle such faults. However, these protocols are not suitable for building practical systems as they are expensive in all important dimensions: development, deployment, complexity, and performance. ZooKeeper takes an alternative approach that ...


Improving The Security And Performance Of Web Applications Running On The Distributed IPFS, Vu Le 2021 San Jose State University

Master's Projects

While cloud computing is gaining widespread adoption these days, some challenges are emerging around security, performance, and reliability of centralized cloud resources. Decentralized services are introduced as an effective way to overcome the limitations of cloud services. Blockchain technology with its associated decentralization is used to develop decentralized application platforms. The interplanetary file system (IPFS) is built on top of a distributed system consisting of a group of nodes that shares the data and also takes advantage of blockchain to permanently store the data. The IPFS is very useful in transferring data between people. This project focuses on blockchain technology ...


Analyzing Public Sentiment On COVID-19 Pandemic, Pradeepika Gedupudi 2021 San Jose State University

Master's Projects

Sentiment analysis is a method of understanding the user sentiment expressed in the form of text. Social media is the best place to capture the public's opinion regarding how they feel about current events. The Corona Virus Disease-2019 (COVID-19) is one of the worst pandemics we have experienced so far. An important observation is that this pandemic has not only affected the public's physical health but has also taken a toll on their mental health. Reddit is a social news discussion site where people discuss topics around current affairs in smaller groups called subreddits. The project's primary focus ...

