Open Access. Powered by Scholars. Published by Universities.®

Computer Engineering Commons

Articles 1 - 9 of 9

Full-Text Articles in Computer Engineering

Distributed Knowledge Discovery For Diverse Data, Hossein Hamooni Jul 2017

Computer Science ETDs

In the era of new technologies, computer scientists deal with massive datasets hundreds of terabytes in size. Smart cities, social networks, health care systems, large sensor networks, and similar sources are constantly generating new data. Extracting knowledge from big datasets is non-trivial because traditional data mining algorithms become impractical to run at that scale. Distributed systems help address this problem, but they introduce new challenges in designing scalable algorithms. The transition from traditional algorithms to ones that can run on a distributed platform should be done carefully. Researchers should design the modern distributed algorithms based on the …


Mining Capstone Project Wikis For Knowledge Discovery, Swapna Gottipati, Venky Shankararaman, Melvrivk Goh Jul 2017

Research Collection School Of Computing and Information Systems

Wikis are widely used collaborative environments that serve as sources of information and knowledge. They help students collaborate and share information among team members, and they enable collaborative learning. In particular, Wikis play an important role in capstone projects: they support various project-related tasks and help organize and share information. Mining project Wikis is critical to understanding students' learning and the latest trends in industry. It is also useful to educationists and academicians for decision-making about how to modify the educational environment to improve students' learning. The main challenge is that the content or data in project Wikis …


Development Of An Enhanced Generic Data Mining Life Cycle (DMLC), Markus Hofmann, Brendan Tierney May 2017

The ITB Journal

Data mining projects are complex and have a high failure rate. A well-defined life cycle is vital to improving the project management and success rates of such projects. This paper reports on a research project concerned with life cycle development for large-scale data mining projects. It provides a detailed view of the design and development of a generic data mining life cycle called DMLC. The life cycle aims to support all members of data mining project teams as well as IT managers and academic researchers and may improve …


Peeking Into The Other Half Of The Glass: Handling Polarization In Recommender Systems, Mahsa Badami May 2017

Electronic Theses and Dissertations

This dissertation is about filtering and discovering information online while using recommender systems. In the first part of our research, we study the phenomenon of polarization and its impact on filtering and discovering information. Polarization is a social phenomenon with serious real-life consequences, particularly on social media. Thus, it is important to understand how machine learning algorithms, especially recommender systems, behave in polarized environments. We study polarization within the context of users' interactions with a space of items and how this affects recommender systems. We first formalize the concept of polarization based on item ratings and then relate …


Mining Sequences Of Developer Interactions In Visual Studio For Usage Smells, Kostadin Damevski, David C. Shepherd, Johannes Schneider, Lori Pollock Jan 2017

Computer Science Publications

In this paper, we present a semi-automatic approach for mining a large-scale dataset of IDE interactions to extract usage smells, i.e., inefficient IDE usage patterns exhibited by developers in the field. The approach first mines frequent IDE usage patterns, which are filtered via a set of thresholds and by the authors and are subsequently supported (or disputed) using a developer survey in order to form usage smells. In contrast with conventional mining of IDE usage data, our approach identifies time-ordered sequences of developer actions that are exhibited by many developers in the field. This pattern mining workflow is …


DTreeSim: A New Approach To Compute Decision Tree Similarity Using Re-Mining, Gözde Bakirli, Derya Birant Jan 2017

Turkish Journal of Electrical Engineering and Computer Sciences

A number of recent studies have used a decision tree approach as a data mining technique; some of them needed to evaluate the similarity of decision trees to compare the knowledge reflected in different trees or datasets. There have been multiple perspectives and multiple calculation techniques to measure the similarity of two decision trees, such as using a simple formula or an entropy measure. The main objective of this study is to compute the similarity of decision trees using data mining techniques. This study proposes DTreeSim, a new approach that applies multiple data mining techniques (classification, sequential pattern mining, and …


Discovering The Relationships Between Yarn And Fabric Properties Using Association Rule Mining, Pelin Yildirim, Derya Birant, Tuba Alpyildiz Jan 2017

Turkish Journal of Electrical Engineering and Computer Sciences

Investigating the effects of yarn parameters on fabric quality and identifying the parameters that matter for achieving desired fabric properties are important issues in the design process, which aims to meet the complex and specific functionality requirements of the textile industry and the consumer. Despite many statistical and mathematical studies that predict and reveal specific properties of the yarn and fabric materials used, a number of challenges remain from many perspectives, such as discovering complex relationships among material properties in data. Data mining plays an important role in discovering hidden patterns from fabric data …


An Ant Colony Optimization Algorithm-Based Classification For The Diagnosis Of Primary Headaches Using A Website Questionnaire Expert System, Ufuk Çelik, Nilüfer Yurtay Jan 2017

Turkish Journal of Electrical Engineering and Computer Sciences

The purpose of this research was to evaluate the classification accuracy of the ant colony optimization algorithm for the diagnosis of primary headaches using a website questionnaire expert system that was completed by patients. This cross-sectional study was conducted on 850 headache patients who randomly applied to hospitals in three cities in Turkey, with the assistance of a neurologist in each city. The patients filled in a detailed web-based headache questionnaire. Finally, the neurologists' diagnoses were compared with the classification results of an ant colony optimization-based classification algorithm. The ant colony algorithm for diagnosis classified patients with 96.9412% overall accuracy. …


Proposing A New Clustering Method To Detect Phishing Websites, Morteza Arab, Mohammad Karim Sohrabi Jan 2017

Turkish Journal of Electrical Engineering and Computer Sciences

Phishing websites are fake sites developed by ill-intentioned people to imitate real, legitimate websites. Most such pages closely resemble the sites they imitate in order to deceive victims. The victims of phishing websites may give their bank account details, passwords, credit card numbers, and other important information to the designers and owners of phishing websites. The increasing number of phishing websites has become a great challenge in e-business in general and in electronic banking specifically. In the present study, a novel framework based on model-based clustering is introduced to fight against phishing websites. First, a model is …