Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

Big Data With Cloud Computing: Discussions And Challenges, Amanpreet Kaur Sandhu Mar 2022

Big Data Mining and Analytics

With the recent advancements in computer technologies, the amount of data available is increasing day by day. However, excessive amounts of data create great challenges for users. Meanwhile, cloud computing services provide a powerful environment to store large volumes of data. They eliminate various requirements, such as dedicated space and maintenance of expensive computer hardware and software. Handling big data is a time-consuming task that requires large computational clusters to ensure successful data storage and processing. In this work, the definition, classification, and characteristics of big data are discussed, along with various cloud services, such as Microsoft Azure, Google Cloud, …


Randomized And Evolutionary Approaches To Dataset Characterization, Feature Weighting, And Sampling In K-Nearest Neighbors, Suryoday Basak May 2020

Computer Science and Engineering Theses

K-Nearest Neighbors (KNN) has remained one of the most popular methods for supervised machine learning tasks. However, its performance often depends on the characteristics of the dataset and on appropriate feature scaling. In this thesis, the characteristics of a dataset that make it suitable for use with KNN are explored. As part of this, two new measures of dataset dispersion, called mean neighborhood target variance (MNTV) and mean neighborhood target entropy (MNTE), are developed to help determine the performance we can expect from KNN regressors and classifiers, respectively. It is empirically demonstrated that these measures of dispersion can be indicative …
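
The abstract above names MNTV and MNTE but is cut off before defining them. The following is only a sketch of one plausible reading of the names, not the thesis's definition: for each point, take the targets of its k nearest neighbors, compute their variance (regression) or Shannon entropy (classification), and average over the dataset. The function names, the Euclidean distance, the exclusion of the point itself, and the use of population variance are all assumptions made for illustration.

```python
import math
from collections import Counter
from statistics import pvariance


def k_nearest_indices(X, i, k):
    """Indices of the k points closest to X[i] (Euclidean), excluding i itself.

    Assumption: brute-force search and Euclidean distance; ties are broken
    by index order.
    """
    dists = sorted((math.dist(X[i], X[j]), j) for j in range(len(X)) if j != i)
    return [j for _, j in dists[:k]]


def shannon_entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())


def mean_neighborhood_target_variance(X, y, k):
    """Hypothetical MNTV: mean over points of the (population) variance
    of the targets found in each point's k-neighborhood."""
    per_point = [
        pvariance([y[j] for j in k_nearest_indices(X, i, k)])
        for i in range(len(X))
    ]
    return sum(per_point) / len(per_point)


def mean_neighborhood_target_entropy(X, y, k):
    """Hypothetical MNTE: mean over points of the entropy of the class
    labels found in each point's k-neighborhood."""
    per_point = [
        shannon_entropy([y[j] for j in k_nearest_indices(X, i, k)])
        for i in range(len(X))
    ]
    return sum(per_point) / len(per_point)


if __name__ == "__main__":
    # Toy data: two well-separated groups on a line.
    X = [(0,), (1,), (2,), (10,), (11,), (12,)]
    print(mean_neighborhood_target_entropy(X, [0, 0, 0, 1, 1, 1], k=2))
    print(mean_neighborhood_target_entropy(X, [0, 1, 0, 1, 0, 1], k=2))
```

Under this reading, a dataset whose neighbors tend to share similar targets gives a low value, which is the situation in which KNN would be expected to do well; the thesis itself should be consulted for the actual definitions and results.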