Open Access. Powered by Scholars. Published by Universities.®

Mathematics Commons

Departmental Technical Reports (CS)

Machine learning

Articles 1 - 2 of 2

Full-Text Articles in Mathematics

Why Softmax? Because It Is The Only Consistent Approach To Probability-Based Classification, Anatole Lokshin, Vladik Kreinovich Jun 2023


In many practical problems, the most effective classification techniques are based on deep learning. In this approach, once the neural network generates values corresponding to different classes, these values are transformed into probabilities by using the softmax formula. Researchers have tried other transformations, but none of them worked as well as softmax. A natural question is: why is softmax so effective? In this paper, we provide a possible explanation for this effectiveness: namely, we prove that softmax is the only consistent approach to probability-based classification. In precise terms, it is the only approach for which two reasonable probability-based ideas -- Least …
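The softmax transformation the abstract refers to can be sketched in a few lines. This is a minimal stand-alone illustration, not code from the paper; the function name and the example scores are our own.

```python
import math

def softmax(logits):
    """Turn a neural network's raw class scores (logits) into probabilities.

    Subtracting the maximum score before exponentiating is the standard
    numerical-stability trick; it leaves the result unchanged because
    softmax is invariant under adding a constant to every score.
    """
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Three class scores -> three probabilities that sum to 1,
# with the largest score receiving the largest probability.
probs = softmax([2.0, 1.0, 0.1])
```

Note that softmax preserves the ordering of the scores, which is one reason it is a natural candidate for probability-based classification.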


Fast -- Asymptotically Optimal -- Methods For Determining The Optimal Number Of Features, Saied Tizpaz-Niari, Luc Longpré, Olga Kosheleva, Vladik Kreinovich May 2023


In machine learning -- and in data processing in general -- it is very important to select the proper number of features. If we select too few, we miss important information and do not get good results; if we select too many, we include many irrelevant features that only add noise and thus again worsen the results. The usual method of selecting the proper number of features is to add features one by one until the quality stops improving and starts deteriorating. This method works, but it often takes too much time. In this paper, we propose …
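The usual incremental method described above can be sketched as follows. This illustrates the slow baseline the paper improves on, not the paper's proposed fast method; the `quality` callback and the toy quality curve are hypothetical.

```python
def select_num_features(quality, max_features):
    """Add features one at a time; stop once quality starts to deteriorate.

    `quality(k)` is a hypothetical callback that returns the model's
    quality when trained on the first k features (higher is better).
    Each call typically requires retraining the model, which is why
    this simple method is often too slow in practice.
    """
    best_k, best_q = 1, quality(1)
    for k in range(2, max_features + 1):
        q = quality(k)
        if q < best_q:
            break  # quality started deteriorating: stop adding features
        best_k, best_q = k, q
    return best_k

# Toy quality curve: improves up to 4 features, then degrades.
toy = [0.0, 0.60, 0.70, 0.78, 0.80, 0.79, 0.75]
k = select_num_features(lambda n: toy[n], 6)  # → 4
```

The cost of this baseline is one full training run per candidate feature count, which motivates the asymptotically optimal methods in the paper's title.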