Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Databases and Information Systems

Singapore Management University

2021

Machine learning

Articles 1 - 4 of 4

Full-Text Articles in Physical Sciences and Mathematics

Building Legal Datasets, Jerrold Soh Nov 2021

Research Collection Yong Pung How School Of Law

Data-centric AI calls for better, not just bigger, datasets. As data protection laws with extra-territorial reach proliferate worldwide, ensuring datasets are legal is an increasingly crucial yet overlooked component of “better”. To help dataset builders become more willing and able to navigate this complex legal space, this paper reviews key legal obligations surrounding ML datasets, examines the practical impact of data laws on ML pipelines, and offers a framework for building legal datasets.


Binary Classifiers For Noisy Datasets: A Comparative Study Of Existing Quantum Machine Learning Frameworks And Some New Approaches, Nikolaos Schetakis, Davit Aghamalyan, Paul Robert Griffin, Michael Boguslavsky Nov 2021

Research Collection School Of Computing and Information Systems

This technology offer is a quantum machine learning algorithm applied to binary classification models for noisy datasets, which are prevalent in finance and other domains. By combining hybrid neural networks, quantum parametric circuits, and data re-uploading, we have improved the classification of non-convex 2-dimensional figures by understanding learning stability as noise increases in the dataset. The metric we use for assessing the performance of our quantum classifiers is the area under the receiver operating characteristic curve (ROC AUC). We are interested in collaborating with partners with use cases for binary classification of noisy data. Also, as quantum technology is still insufficient for …


Measuring Data Collection Diligence For Community Healthcare, Galawala Ramesha Samurdhi Karunasena, M. S. Ambiya, Arunesh Sinha, R. Nagar, S. Dalal, Abdullah H., D. Thakkar, D. Narayanan, M. Tambe Oct 2021

Research Collection School Of Computing and Information Systems

Data analytics has tremendous potential to provide targeted benefit in low-resource communities; however, the availability of high-quality public health data is a significant challenge in developing countries, primarily due to non-diligent data collection by community health workers (CHWs). Our use of the word non-diligence here is to emphasize that poor data collection is often not a deliberate action by CHWs but arises due to a myriad of factors, sometimes beyond the control of the CHW. In this work, we define and test a data collection diligence score. This challenging unlabeled data problem is handled by building upon domain expert’s guidance …


Revman: Revenue-Aware Multi-Task Online Insurance Recommendation, Yu Li, Yi Zhang, Lu Gan, Gengwei Hong, Zimu Zhou, Qiang Li Feb 2021

Research Collection School Of Computing and Information Systems

Online insurance is a new type of e-commerce with exponential growth. An effective recommendation model that maximizes the total revenue of insurance products listed in multiple customized sales scenarios is crucial for the success of online insurance business. Prior recommendation models are ineffective because they fail to characterize the complex relatedness of insurance products in multiple sales scenarios and maximize the overall conversion rate rather than the total revenue. Even worse, it is impractical to collect training data online for total revenue maximization due to the business logic of online insurance. We propose RevMan, a Revenue-aware Multi-task Network for online …