Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 30 of 37

Full-Text Articles in Physical Sciences and Mathematics

Feature Selection Optimization With Filtering And Wrapper Methods: Two Disease Classification Cases, Serhat Atik, Tuğba Dalyan Nov 2023

Turkish Journal of Electrical Engineering and Computer Sciences

Discarding the less informative and redundant features helps to reduce the time required to train a learning algorithm and the amount of storage required, improving the learning accuracy as well as the quality of results. In this study, we present different feature selection approaches to address the problem of disease classification based on the Parkinson and Cardiac Arrhythmia datasets. For this purpose, first we utilize three filtering algorithms including the Pearson correlation coefficient, Spearman correlation coefficient, and relief. Second, metaheuristic algorithms are compared to find the most informative subset of the features to obtain better classification accuracy. As a final …
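As a rough illustration of the filtering step mentioned in this abstract, the sketch below ranks features by the absolute Pearson correlation between each feature column and the class label. The function names (`pearson`, `rank_features`) and the toy data are hypothetical and are not taken from the study; the authors' actual pipeline may differ.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant column is treated as uninformative
    return cov / math.sqrt(var_x * var_y)


def rank_features(rows, labels):
    """Return feature indices sorted by |r| with the label, strongest first."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    # Sort by descending score; ties are broken by the lower feature index.
    scores.sort(key=lambda item: (-item[0], item[1]))
    return [j for _, j in scores]


# Hypothetical toy data: feature 0 tracks the label, feature 1 is constant,
# and feature 2 is only weakly related to the label.
rows = [[1.0, 5.0, 0.3], [2.0, 5.0, 0.9], [3.0, 5.0, 0.1], [4.0, 5.0, 0.7]]
labels = [0, 0, 1, 1]
print(rank_features(rows, labels))
```

A filter of this kind scores each feature independently of any classifier, which is what distinguishes it from the wrapper (metaheuristic) stage the abstract describes next.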


Crow Search Algorithm With Time Varying Flight Length Strategies For Feature Selection, Mohammed Abdullahi, Abdulhameed Adamu, Ibrahim Hayatu Hassan Jan 2023

Future Computing and Informatics Journal

Feature selection (FS) is an efficient technique used to remove irrelevant, redundant, and noisy attributes from high-dimensional datasets while increasing the efficacy of machine learning classification. The crow search algorithm (CSA) is a simple and efficient metaheuristic algorithm that has been used to address several FS problems. The flight length (fl) parameter in CSA governs the crows' search ability. In CSA, fl is set to a fixed value. As a result, the CSA is prone to becoming trapped in local minima. This article suggests a remedy to this issue by bringing five new concepts of time dependent fl …
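To make the idea of a time-dependent flight length concrete, here is a minimal sketch of one possible schedule. The linear decay, the bounds `fl_max` and `fl_min`, and the function names are assumptions chosen for illustration; the excerpt does not specify the five strategies proposed in the article, and this is not claimed to be one of them. The `follow` function shows the usual CSA move toward another crow's remembered position, scaled by a random factor and fl.

```python
import random


def linear_flight_length(t, max_iter, fl_max=2.5, fl_min=0.5):
    """Hypothetical schedule: fl shrinks linearly from fl_max to fl_min."""
    return fl_max - (fl_max - fl_min) * t / max_iter


def follow(position, target_memory, fl, rng):
    """One CSA-style move toward another crow's remembered best position."""
    r = rng.random()  # uniform random factor in [0, 1)
    return [x + r * fl * (m - x) for x, m in zip(position, target_memory)]


rng = random.Random(0)  # fixed seed so the run is repeatable
position = [0.0, 0.0]
target_memory = [1.0, 1.0]
for t in (0, 50, 100):
    fl = linear_flight_length(t, 100)
    print(t, fl, follow(position, target_memory, fl, rng))
```

With a large fl early on, a crow can overshoot the target and explore; as fl shrinks, moves stay closer to the remembered position, which favors local refinement.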


Load2Load: Day-Ahead Load Forecasting At Aggregated Level, Mustafa Berkay Yilmaz Nov 2022

Turkish Journal of Electrical Engineering and Computer Sciences

Reliable and accurate short-term load forecasting (STLF) helps utilities and energy providers deal with the challenges posed by supply and demand balance, higher penetration of renewable energies, and the development of electricity markets with increasingly complex pricing strategies in future smart grids. Recent advances in deep learning have been successfully applied to STLF. However, no study has conclusively evaluated the performance of different STLF methods at an aggregated level on different datasets with different numbers of daily measurements. In this study, a deep learning STLF architecture called Load2Load is proposed for day-ahead forecasting. Different forecasting methods have been …


Development Of A Hybrid System Based On ABC Algorithm For Selection Of Appropriate Parameters For Disease Diagnosis From ECG Signals, Ersin Ersoy, Gazi Erkan Bostanci, Mehmet Serdar Güzel Jul 2022

Turkish Journal of Electrical Engineering and Computer Sciences

The number of people who die due to cardiovascular diseases is quite high. In our study, ECG (electrocardiogram) signals were divided into segments and waves based on temporal boundaries. Signal similarity methods such as convolution, correlation, covariance, signal peak to noise ratio (PNRS), structural similarity index (SSIM), one of the basic statistical parameters, arithmetic mean and entropy were applied to each of these sections. In addition, a square error-based new approach was applied and the difference of the signs from the mean sign was taken and used as a feature vector. The obtained feature vectors are used in the artificial …


Toward Intelligent Financial Advisors For Identifying Potential Clients: A Multitask Perspective, Qixiang Shao, Runlong Yu, Hongke Zhao, Chunli Liu, Mengyi Zhang, Hongmei Song, Qi Liu Mar 2022

Big Data Mining and Analytics

Intelligent Financial Advisors (IFAs) in online financial applications (apps) have brought new life to personal investment by providing appropriate and high-quality portfolios for users. In real-world scenarios, identifying potential clients is a crucial issue for IFAs, i.e., identifying users who are willing to purchase the portfolios. Thus, extracting useful information from various characteristics of users and further predicting their purchase inclination are urgent. However, two critical problems encountered in real practice make this prediction task challenging, i.e., sample selection bias and data sparsity. In this study, we formalize a potential conversion relationship, i.e., user→activated user→client and decompose this relationship into …


Performance Analysis And Feature Selection For Network-Based Intrusion Detection With Deep Learning, Serhat Caner, Nesli Erdoğmuş, Yusuf Murat Erten Mar 2022

Turkish Journal of Electrical Engineering and Computer Sciences

An intrusion detection system is an automated monitoring tool that analyzes network traffic and detects malicious activities by looking out either for known patterns of attacks or for an anomaly. In this study, intrusion detection and classification performances of different deep learning based systems are examined. For this purpose, 24 deep neural networks with four different architectures are trained and evaluated on CICIDS2017 dataset. Furthermore, the best performing model is utilized to inspect raw network traffic features and rank them with respect to their contributions to success rates. By selecting features with respect to their ranks, sets of varying size …


Application Of Entropy Method For Estimating Factor Weights In Mining-Method Selection For Development Of Novel Mining-Method Selection System, Elsa Pansilvania Andre Manjate, Mahdi Saadat, Hisatoshi Toriya, Fumiaki Inagaki, Youhei Kawamura Jan 2022

Journal of Sustainable Mining

Mining-method selection (MMS) is one of the most critical and complex decision-making processes in mine planning. Therefore, it has been the subject of several studies for many years, culminating in the development of different systems. However, there is still more to be done to improve and/or create more efficient systems and deal with the complexity caused by many influencing factors. This study introduces the application of the entropy method for feature selection, i.e., selecting the most critical factors in MMS. The entropy method is applied to assess the relative importance of the factors influencing MMS by estimating their objective weights …
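The entropy method for objective weights is commonly formulated as follows: normalize each factor column into proportions, compute its Shannon entropy scaled to [0, 1], and weight each factor by its divergence (one minus entropy). The sketch below follows that common textbook formulation; it is not taken from the study, and the function name `entropy_weights` and the toy score matrix are hypothetical. It assumes nonnegative values and a positive sum in every column.

```python
import math


def entropy_weights(matrix):
    """Rows are alternatives, columns are factors; returns weights summing to 1."""
    m = len(matrix)
    n = len(matrix[0])
    if m < 2:
        raise ValueError("need at least two alternatives")
    k = 1.0 / math.log(m)  # scales the entropy into [0, 1]
    divergences = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)  # assumed positive
        entropy = 0.0
        for value in column:
            p = value / total
            if p > 0:  # 0 * log(0) is taken as 0
                entropy -= k * p * math.log(p)
        # Clamp tiny negative values caused by floating-point rounding.
        divergences.append(max(0.0, 1.0 - entropy))
    total_divergence = sum(divergences)
    if total_divergence == 0:
        return [1.0 / n] * n  # every factor is uniform: fall back to equal weights
    return [d / total_divergence for d in divergences]


# Hypothetical scores: factor 0 is identical across alternatives,
# factor 1 varies strongly and therefore receives most of the weight.
scores = [[1.0, 1.0], [1.0, 2.0], [1.0, 9.0]]
weights = entropy_weights(scores)
print(weights)
```

A factor whose values barely differ across alternatives has entropy close to 1 and so contributes almost no weight, which is the sense in which the method screens out uninformative factors.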


Decomposition Furnace Outlet Temperature Prediction Based On ElasticNet And LSTM, Guangyu Yu, Xueping Dong, Xiangmin Wang, Gan Min Jun 2021

Journal of System Simulation

Abstract: The outlet temperature of the decomposition furnace is a key indicator in the cement production process. Aiming at the problem that traditional prediction methods only consider the influence of wind, coal, and materials, a temperature prediction model of ElasticNet combined with Long Short-Term Memory (LSTM) neural network is proposed. The ElasticNet-LSTM export temperature prediction model is constructed by using the ElasticNet method to estimate the parameters of different variables, fully considering the influencing factors and realizing the variable screening, and analyzing the influence of the number of hidden layers and nodes on the accuracy of the neural network. Simulation …


VIF-Regression Screening Ultrahigh Dimensional Feature Space, Hassan S. Uraibi Jun 2021

Journal of Modern Applied Statistical Methods

Iterative Sure Independent Screening (ISIS) was proposed for the problem of variable selection with ultrahigh dimensional feature space. Unfortunately, the ISIS method transforms the dimensionality of features from ultrahigh to ultra-low and may result in unreliable inference, particularly when the number of important variables is greater than the screening threshold. The proposed method transforms the ultrahigh dimensionality of features to a high dimension space in order to remedy the loss of information caused by the ISIS method. The proposed method is compared with the ISIS method using real data and simulation. The results show this method is more efficient and more reliable …


Gene Expression Data Classification Using Genetic Algorithm-Based Feature Selection, Öznur Sinem Sönmez, Mustafa Dağtekin, Tolga Ensari Jan 2021

Turkish Journal of Electrical Engineering and Computer Sciences

In this study, hybrid methods are proposed for feature selection and classification of gene expression datasets. In the proposed genetic algorithm/support vector machine (GA-SVM) and genetic algorithm/k nearest neighbor (GA-KNN) hybrid methods, genetic algorithm is improved using Pearson's correlation coefficient, Relief-F, or mutual information. Crossover and selection operations of the genetic algorithm are specialized. Eight different gene expression datasets are used for classification process. The classification performances of the proposed methods are compared with the traditional GA-KNN and GA-SVM wrapper methods and other studies in the literature. Classification results demonstrate that higher accuracy rates are obtained with the proposed methods …


SAR Object Recognition Based On Multi-Band And Multi-Polarization Simulation Image, Gu Yu, Zhang Qin, Xu Ying Jun 2020

Journal of System Simulation

Abstract: The object model was built based on Creator, and object texture-material mapping was performed by the Vega TMM tool. The multi-band and multi-polarization SAR image database was built by visual simulation technology. A hybrid intelligent optimization algorithm was designed to optimize the combination of band and polarization by genetic algorithm and binary particle optimization. Zernike moment features, Gabor wavelet coefficients, etc. were extracted from the original image and rectified image to make up the feature candidates, and the feature selection experiments were carried out by using multi-band and multi-polarization SAR images. Simulation results demonstrate that, building SAR image database through simulation …


Event-Based Summarization Of News Articles, Feride Savaroğlu Tabak, Vesile Evrim Jan 2020

Turkish Journal of Electrical Engineering and Computer Sciences

In recent years, with the increase of available digital information on the Web, the time needed to find relevant information has also increased. Therefore, to reduce the time spent on searching, research on automatic text summarization has gained importance. The proposed summarization process is based on event extraction methods and is called event-based extractive single-document summarization. In this method, the important features of event extraction and summarization methods are analyzed and combined to extract the summaries from single-source news documents. Among the tested features, six features are found to be the most effective in constructing good summaries. The …


On A Yearly Basis Prediction Of Soil Water Content Utilizing SAR Data: A Machine Learning And Feature Selection Approach, Emrullah Acar, Mehmet Siraç Özerdem Jan 2020

Turkish Journal of Electrical Engineering and Computer Sciences

Soil water content (SWC) performs an important role in many areas including agriculture, drought cases, usage of water resources, hydrology, crop diseases and aerology. However, the measurement of the SWC over large terrains with standard computational techniques is very hard. In order to overcome this situation, remote sensing tools are preferred, which can produce much more successful results in less time than standard calculation techniques. Among all remote sensing tools, synthetic aperture radar (SAR) has a significant impact on determining SWC over large terrains. The main objective of this study is to predict SWC on a yearly basis over the …


Towards Human Activity Recognition For Ubiquitous Health Care Using Data From A Waist-Mounted Smartphone, Umar Zia, Wajeeha Khalil, Salabat Khan, Iftikhar Ahmad, Naeem Khatak Jan 2020

Turkish Journal of Electrical Engineering and Computer Sciences

Understanding human activities is a newly emerging paradigm that is greatly involved in developing ubiquitous health care (u-Health) systems. The aim of these systems is to seamlessly gather knowledge about the patient's health and, after collecting knowledge, make suggestions to the patient according to his/her health profile. For this purpose, one of the most important ubiquitous communication trends is the smartphone, which has drawn the attention of both professionals and caregivers for monitoring the aging population, childcare, fall detection, and cognitive impairment. Recognizing human actions in a ubiquitous environment is very challenging and researchers have extensively investigated different methods to …


Noise Clipping Algorithm Based On Relative Contribution Rate, Shuoyu Liu, Yueming Dai Dec 2019

Journal of System Simulation

Abstract: This paper presents a class noise cutting (CNC) algorithm based on relative contribution rate. The algorithm calculates the relative contribution rate of features to the theme. The most valuable feature set is selected by using the features' distinguishing rating. The corresponding candidate categories for each feature are selected to reduce the candidate category set, improve the classification accuracy, and speed up the response of the classifier. Compared with another noise cutting algorithm, ECN (Eliminating the class whose), CNC has higher accuracy and because of its simpler feature dimension dictionary and better candidate category set, the response …


Coal Mine Water Inrush Prediction Based On LSTM Neural Network, Dong Lili, Fei Cheng, Zhang Xiang, Cao Chaofan Feb 2019

Coal Geology & Exploration

To predict water inrush from the coal seam floor, and based on a summary of existing water inrush prediction methods and theories, a feature selection experiment shows that water pressure, distance from the working surface, sandstone section thickness, coal thickness, coal seam inclination, fault throw, fissure zone, mining area, mining height and strike length are the main factors affecting the occurrence of water inrush. These factors are complex and non-linear. A water inrush prediction model based on a long short-term memory (LSTM) neural network was proposed. The data of the coal mine water inrush case was used as sample data to …


Improving Anomaly Detection In BGP Time-Series Data By New Guide Features And Moderated Feature Selection Algorithm, Mahmoud Hashem, Ahmed Bashandy, Samir Shaheen Jan 2019

Turkish Journal of Electrical Engineering and Computer Sciences

The Internet infrastructure relies on the Border Gateway Protocol (BGP) to provide essential routing information where abnormal routing behavior impairs global Internet connectivity and stability. Hence, employing anomaly detection algorithms is important for improving the performance of BGP routing protocol. In this paper, we propose two algorithms; the first is the guide feature generator (GFG), which generates guide features from traditional features in BGP time-series data using moving regression in combination with smoothed moving average. The second is a modified random forest feature selection algorithm which is employed to automatically select the most dominant features (ASMDF). Our mechanism shows that …


Optimal Set Of EEG Features In Infant Sleep Stage Classification, Maja Cic, Mario Milicevic, Igor Mazic Jan 2019

Turkish Journal of Electrical Engineering and Computer Sciences

This paper evaluates six classification algorithms to assess the importance of individual EEG rhythms in the context of automatic classification of infant sleep. EEG features were obtained by Fourier transform and by a novel technique based on the empirical mode decomposition and generalized zero crossing method. Of six evaluated classification algorithms, the best classification results were obtained with the support vector machine for the combination of all presented features from four EEG channels. Three methods of attribute ranking were assessed: relief, principal component analysis, and wrapper-based optimized attribute weights. The outcomes revealed that the optimal selection of features requires one …


A Novel Hybrid Teaching-Learning-Based Optimization Algorithm For The Classification Of Data By Using Extreme Learning Machines, Ender Sevinç, Tansel Dökeroğlu Jan 2019

Turkish Journal of Electrical Engineering and Computer Sciences

Data classification is the process of organizing data by relevant categories. In this way, the data can be understood and used more efficiently by scientists. Numerous studies have been proposed in the literature for the problem of data classification. However, with recently introduced metaheuristics, it has continued to be riveting to revisit this classical problem and investigate the efficiency of new techniques. Teaching-learning-based optimization (TLBO) is a recent metaheuristic that has been reported to be very effective for combinatorial optimization problems. In this study, we propose a novel hybrid TLBO algorithm with extreme learning machines (ELM) for the solution of …


Combined Feature Compression Encoding In Image Retrieval, Lu Huo, Leijie Zhang Jan 2019

Turkish Journal of Electrical Engineering and Computer Sciences

Recently, features extracted by convolutional neural networks (CNNs) are popularly used for image retrieval. In CNN representation, high-level features are usually chosen to represent the images in coarse-grained datasets, while mid-level features are successfully applied to describe the images for fine-grained datasets. In this paper, we combine these different levels of features as a joint feature to propose a robust representation that is suitable for both coarse-grained and fine-grained image retrieval datasets. In addition, in order to solve the problem that the efficiency of image retrieval is influenced by the dimensionality of indexing, a unified subspace learning model named spectral …


Toxicity Prediction Of Small Drug Molecules Of Aryl Hydrocarbon Receptor Using A Proposed Ensemble Model, Vishan Kumar Gupta, Prashant Singh Rana Jan 2019

Turkish Journal of Electrical Engineering and Computer Sciences

Quantitative structure-activity relationships and quantitative structure-property relationships have proved their usefulness for predicting toxicities of drug molecules regarding their biological activities. In silico toxicity prediction techniques are essential for reducing testing on rodents (in vivo) and for a less time-consuming and more cost-efficient alternative for the identification of toxic effects at an early stage of drug development. The authors aim to build a prediction model for better assessment of toxicity to quickly and efficiently test whether certain chemical compounds have the potential to disrupt the processes in the human body that may adversely affect human health. Here, we have proposed …


A Hybrid Feature-Selection Approach For Finding The Digital Evidence Of Web Application Attacks, Mohammed Babiker, Enis Karaarslan, Yaşar Hoşcan Jan 2019

Turkish Journal of Electrical Engineering and Computer Sciences

The most critical challenge of web attack forensic investigations is the sheer amount of data and level of complexity. Machine learning technology might be an efficient solution for web attack analysis and investigation. Consequently, machine learning applications have been applied in various areas of information security and digital forensics, and have improved over time. Moreover, feature selection is a crucial step in machine learning; in fact, selecting an optimal feature subset could enhance the accuracy and performance of the predictive model. To date, there has not been an adequate approach to select optimal features for the evidence of web attack. …


An Improved Tree Model Based On Ensemble Feature Selection For Classification, Chandralekha M, Shenbagavadivu N Jan 2019

Turkish Journal of Electrical Engineering and Computer Sciences

Researchers train and build specific models to classify the presence and absence of a disease and the accuracy of such classification models is continuously improved. The process of building a model and training depends on the medical data utilized. Various machine learning techniques and tools are used to handle different data with respect to disease types and their clinical conditions. Classification is the most widely used technique to classify disease and the accuracy of the classifier largely depends on the attributes. The choice of the attribute largely affects the diagnosis and performance of the classifier. Due to growing large volumes …


Effect Of Intuitionistic Fuzzy Normalization In Microarray Gene Selection, Prema Ramasamy, Premalatha Kandhasamy Jan 2018

Turkish Journal of Electrical Engineering and Computer Sciences

Analysis of gene expression data is essential in microarray gene expression in order to retrieve the required information. Gene expression data generally contain a large number of genes but a small number of samples. The complicated relations among the different genes make analysis more difficult, and removing irrelevant genes improves the quality of results. This paper presents two fuzzy preprocessing techniques, using a fuzzy set (FS) and intuitionistic fuzzy set (IFS), to normalize datasets. In the feature selection part, four statistical methods were used. Using three publicly available gene expression datasets, the fuzzy normalization techniques were compared with two standard …


Feature Selection Algorithm For No-Reference Image Quality Assessment Using Natural Scene Statistics, Imran Fareed Nizami, Muhammad Majid, Khawar Khurshid Jan 2018

Turkish Journal of Electrical Engineering and Computer Sciences

Images play an essential part in our daily lives and the performance of various imaging applications is dependent on the user's quality of experience. No-reference image quality assessment (NR-IQA) has gained importance to assess the perceived quality, without using any prior information of the nondistorted version of the image. Different NR-IQA techniques that utilize natural scene statistics classify the distortion type based on groups of features and then these features are used for estimating the image quality score. However, every type of distortion has a different impact on certain sets of features. In this paper, a new feature selection algorithm …


An Online Approach For Feature Selection For Classification In Big Data, Nasrin Banu Nazar, Radha Senthilkumar Jan 2017

Turkish Journal of Electrical Engineering and Computer Sciences

Feature selection (FS), also known as attribute selection, is a process of selection of a subset of relevant features used in model construction. This process or method improves the classification accuracy by removing irrelevant and noisy features. FS is implemented using either batch learning or online learning. Currently, the FS methods are executed in batch learning. Nevertheless, these techniques take longer execution time and require larger storage space to process the entire dataset. Due to the lack of scalability, the batch learning process cannot be used for large data. In the present study, a scalable efficient Online Feature Selection (OFS) …


A Fast Feature Selection Approach Based On Extreme Learning Machine And Coefficient Of Variation, Ömer Faruk Ertuğrul, Mehmet Emin Tağluk Jan 2017

Turkish Journal of Electrical Engineering and Computer Sciences

Feature selection is the method of reducing the size of data without degrading their accuracy. In this study, we propose a novel feature selection approach, based on extreme learning machines (ELMs) and the coefficient of variation (CV). In the proposed approach, the most relevant features are identified by ranking each feature with the coefficient obtained through ELM divided by CV. The achieved accuracies and computational costs, obtained with the use of features selected via the proposed approach in 9 classification and 26 regression benchmark data sets, were compared to those obtained with all features, as well as those obtained with …
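The ranking rule stated in this abstract (a per-feature coefficient divided by the coefficient of variation) can be sketched as follows. How the coefficients are obtained from the ELM is not described in the excerpt, so the `coefficients` list below is a hypothetical placeholder. Using the absolute value of the coefficient, treating a higher score as more relevant, and scoring constant features as 0 are also assumptions made only for this illustration.

```python
import statistics


def coefficient_of_variation(values):
    """CV = population standard deviation divided by |mean|."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined for a zero-mean feature")
    return statistics.pstdev(values) / abs(mean)


def rank_by_coefficient_over_cv(columns, coefficients):
    """Rank feature indices by |coefficient| / CV, highest score first."""
    scores = []
    for j, (column, coef) in enumerate(zip(columns, coefficients)):
        cv = coefficient_of_variation(column)
        # Assumption: a constant feature (CV == 0) gets the lowest score.
        score = abs(coef) / cv if cv > 0 else 0.0
        scores.append((score, j))
    scores.sort(key=lambda item: (-item[0], item[1]))
    return [j for _, j in scores]


# Hypothetical feature columns and placeholder coefficients.
columns = [[1.0, 2.0, 3.0], [10.0, 10.5, 9.5]]
coefficients = [0.5, 0.5]
print(rank_by_coefficient_over_cv(columns, coefficients))
```

Because CV is scale-free, dividing by it lets features measured in different units be compared on the same footing.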


Stock Daily Return Prediction Using Expanded Features And Feature Selection, Hakan Gündüz, Zehra Çataltepe, Yusuf Yaslan Jan 2017

Turkish Journal of Electrical Engineering and Computer Sciences

Stock market prediction is a very noisy problem and the use of any additional information to increase accuracy is necessary. In this paper, for the stock daily return prediction problem, the set of features is expanded to include indicators not only for the stock to be predicted itself but also a set of other stocks and currencies. Afterwards, different feature selection and classification methods are utilized for prediction. The daily close returns of the 3 most traded stocks (GARAN, THYAO, and ISCTR) in Borsa İstanbul (BIST) are predicted using indicators computed on those stocks, indicators for all the other stocks …


Exploring Feature Sets For Turkish Word Sense Disambiguation, Bahar İlgen, Eşref Adali, Ahmet Cüneyd Tantuğ Jan 2016

Turkish Journal of Electrical Engineering and Computer Sciences

This paper presents an exploration and evaluation of a diverse set of features that influence word-sense disambiguation (WSD) performance. WSD has the potential to improve many natural language processing (NLP) tasks, as it is one of the most crucial steps in the area. It is known that exploiting effective features and removing redundant ones helps improve the results. There are two groups of feature sets to disambiguate senses and select the most appropriate ones among a set of candidates: collocational and bag-of-words (BoW) features. We introduce the effects of using these two feature sets on the Turkish Lexical Sample Dataset (TLSD), …


Feature Selection For Movie Recommendation, Zehra Çataltepe, Mahiye Uluyağmur, Esengül Tayfur Jan 2016

Turkish Journal of Electrical Engineering and Computer Sciences

TV users have an abundance of different movies they could choose from, and with the quantity and quality of data available both on user behavior and content, better recommenders are possible. In this paper, we evaluate and combine different content-based and collaborative recommendation methods for a Turkish movie recommendation system. Our recommendation methods can make use of user behavior, different types of content features, and other users' behavior to predict movie ratings. We gather different types of data on movies, such as the description, actors, directors, year, and genre. We use natural language processing methods to convert the Turkish movie …