Open Access. Powered by Scholars. Published by Universities.®

Computer Engineering Commons

Articles 1 - 30 of 35

Full-Text Articles in Computer Engineering

Text-To-SQL: A Methodical Review Of Challenges And Models, Ali Buğra Kanburoğlu, Faik Boray Tek May 2024

Turkish Journal of Electrical Engineering and Computer Sciences

This survey focuses on Text-to-SQL, automated translation of natural language queries into SQL queries. Initially, we describe the problem and its main challenges. Then, by following the PRISMA systematic review methodology, we survey the existing Text-to-SQL review papers in the literature. We apply the same method to extract proposed Text-to-SQL models and classify them with respect to used evaluation metrics and benchmarks. We highlight the accuracies achieved by various models on Text-to-SQL datasets and discuss execution-guided evaluation strategies. We present insights into model training times and implementations of different models. We also explore the availability of Text-to-SQL datasets in non-English …


Exploring The Impact Of Training Datasets On Turkish Stance Detection, Muhammed Said Zengin, Berk Utku Yenisey, Mücahid Kutlu Nov 2023

Turkish Journal of Electrical Engineering and Computer Sciences

Stance detection has garnered considerable attention from researchers due to its broad range of applications, including fact-checking and social computing. While state-of-the-art stance detection models are usually based on supervised machine learning methods, their effectiveness is heavily reliant on the quality of training data. This problem is more prevalent in the stance detection task because the stance of a text is intimately tied to the target under consideration. While numerous datasets exist for stance detection, determining their suitability for a specific target can be challenging. In this work, we focus on Turkish stance detection and explore the impact of training data …


Behind Derogatory Migrants' Terms For Venezuelan Migrants: Xenophobia And Sexism Identification With Twitter Data And NLP, Joseph Martínez, Melissa Miller-Felton, Jose Padilla, Erika Frydenlund Apr 2023

Modeling, Simulation and Visualization Student Capstone Conference

The sudden arrival of many migrants can present new challenges for host communities and create negative attitudes that reflect that tension. In the case of Colombia, with the influx of over 2.5 million Venezuelan migrants, such tensions arose. Our research objective is to investigate how those sentiments arise in social media. We focused on monitoring derogatory terms for Venezuelans, specifically veneco and veneca. Using a dataset of 5.7 million tweets from Colombian users between 2015 and 2021, we determined the proportion of tweets containing those terms. We observed a high prevalence of xenophobic and defamatory language correlated with the …


Solving Turkish Math Word Problems By Sequence-To-Sequence Encoder-Decoder Models, Esin Gedik, Tunga Güngör Mar 2023

Turkish Journal of Electrical Engineering and Computer Sciences

Solving math word problems (MWP) is a challenging task due to the semantic gap between natural language texts and mathematical equations. The main purpose of the task is to take a written math problem as input and produce a proper equation as output for solving that problem. This paper describes a sequence-to-sequence (seq2seq) neural model for automatically solving Turkish MWPs based on their semantic meanings in the text. It comprises a bidirectional encoder to comprehend the semantics of the problem by encoding the input sequence and a decoder with attention to extract the equation by tracking the semantic meanings of …


A Structure-Aware Generative Adversarial Network For Bilingual Lexicon Induction, Bocheng Han, Qian Tao, Lusi Li, Zhihao Xiong Jan 2023

Computer Science Faculty Publications

Bilingual lexicon induction (BLI) is the task of inducing word translations with a learned mapping function that aligns monolingual word embedding spaces in two different languages. However, most previous methods treat word embeddings as isolated entities and fail to jointly consider both the intra-space and inter-space topological relations between words. This limitation makes it challenging to align words from embedding spaces with distinct topological structures, especially when the assumption of isomorphism may not hold. To this end, we propose a novel approach called the Structure-Aware Generative Adversarial Network (SA-GAN) model to explicitly capture multiple topological structure information to achieve accurate …


A Structured Narrative Prompt For Prompting Narratives From Large Language Models: Sentiment Assessment Of ChatGPT-Generated Narratives And Real Tweets, Christopher J. Lynch, Erik J. Jensen, Virginia Zamponi, Kevin O'Brien, Erika Frydenlund, Ross Gore Jan 2023

VMASC Publications

Large language models (LLMs) excel in providing natural language responses that sound authoritative, reflect knowledge of the context area, and can present from a range of varied perspectives. Agent-based models and simulations consist of simulated agents that interact within a simulated environment to explore societal, social, and ethical problems, among others. Simulated agents generate large volumes of data, and discerning useful and relevant content is an onerous task. LLMs can help in communicating agents' perspectives on key life events by providing natural language narratives. However, these narratives should be factual, transparent, and reproducible. Therefore, we present a structured narrative prompt …


Data-Driven Strategies For Disease Management In Patients Admitted For Heart Failure, Ankita Agarwal Jan 2023

Browse all Theses and Dissertations

Heart failure is a syndrome that adversely affects a patient’s quality of life. It can be caused by different underlying conditions or abnormalities and involves both cardiovascular and non-cardiovascular comorbidities. Heart failure cannot be cured, but a patient’s quality of life can be improved by effective treatment through medicines and surgery, and by lifestyle management. As effective treatment of heart failure incurs costs for patients and requires resource allocation by hospitals, predicting the length of stay of these patients during each hospitalization becomes important. Heart failure can be classified into two types: left-sided heart failure and right-sided heart failure. …


Comparative Adjudication Of Noisy And Subjective Data Annotation Disagreements For Deep Learning, Scott David Williams Jan 2023

Browse all Theses and Dissertations

Obtaining accurate inferences from deep neural networks is difficult when models are trained on instances with conflicting labels. Algorithmic recognition of online hate speech illustrates this. No human annotator is perfectly reliable, so multiple annotators evaluate and label online posts in a corpus. Labeling scheme limitations, differences in annotators' beliefs, and limits to annotators' honesty and carefulness cause some labels to disagree. Consequently, decisive and accurate inferences become less likely. Some practical applications such as social research can tolerate some indecisiveness. However, an online platform using an indecisive classifier for automated content moderation could create more problems than it solves. …


Identification Of Factors Contributing To Traffic Crashes By Analysis Of Text Narratives, Cristian D. Arteaga-Sanchez Dec 2022

UNLV Theses, Dissertations, Professional Papers, and Capstones

The fatalities, injuries, and property damage that result from traffic crashes impose a significant burden on society. Current research and practice in traffic safety rely on analysis of quantitative data from crash reports to understand crash severity contributors and develop countermeasures. Despite advances from this effort, quantitative crash data suffers from drawbacks, such as the limited ability to capture all the information relevant to the crashes and the potential errors introduced during data collection. Crash narratives can help address these limitations, as they contain detailed descriptions of the context and sequence of events of the crash. However, the unstructured nature …


Diacritics Correction In Turkish With Context-Aware Sequence To Sequence Modeling, Asiye Tuba Özge, Özge Bozal, Umut Özge Sep 2022

Turkish Journal of Electrical Engineering and Computer Sciences

Digital texts in many languages have examples of missing or misused diacritics which makes it hard for natural language processing applications to disambiguate the meaning of words. Therefore, diacritics restoration is a crucial step in natural language processing applications for many languages. In this study we approach this problem as bidirectional transformation of diacritical letters and their ASCII counterparts, rather than unidirectional diacritic restoration. We propose a context-aware character-level sequence to sequence model for this transformation. The model is language independent in the sense that no language-specific feature extraction is necessary other than the utilization of word embeddings and is …


Applied Deep Learning: Case Studies In Computer Vision And Natural Language Processing, Md Reshad Ul Hoque Aug 2022

Electrical & Computer Engineering Theses & Dissertations

Deep learning has proved to be successful for many computer vision and natural language processing applications. In this dissertation, three studies have been conducted to show the efficacy of deep learning models for computer vision and natural language processing. In the first study, an efficient deep learning model was proposed for seagrass scar detection in multispectral images which produced robust, accurate scars mappings. In the second study, an arithmetic deep learning model was developed to fuse multi-spectral images collected at different times with different resolutions to generate high-resolution images for downstream tasks including change detection, object detection, and land cover …


Event-Related Microblog Retrieval In Turkish, Çağri Toraman Mar 2022

Turkish Journal of Electrical Engineering and Computer Sciences

Microblogs, such as tweets, are short messages in which users are able to share any opinion and information. Microblogs are mostly related to real-life events reported in news articles. Finding event-related microblogs is important to analyze online social networks and understand public opinion on events. However, finding such microblogs is a challenging task due to the dynamic nature of microblogs and their limited length. In this study, assuming that news articles are given as queries and microblogs as documents, we find event-related microblogs in Turkish. In order to represent news articles and microblogs, we examine encoding methods, namely traditional bag-of-words …


Evaluating Similarity Of Cross-Architecture Basic Blocks, Elijah L. Meyer Jan 2022

Browse all Theses and Dissertations

Vulnerabilities in source code can be compiled for multiple processor architectures and make their way into several different devices. Security researchers frequently have no way to obtain this source code to analyze for vulnerabilities. Therefore, the ability to effectively analyze binary code is essential. Similarity detection is one facet of binary code analysis. Because source code can be compiled for different architectures, the need can arise for detecting code similarity across architectures. This need is especially apparent when analyzing firmware from embedded computing environments such as Internet of Things devices, where the processor architecture is dependent on the product and …


Computer Enabled Interventions To Communication And Behavioral Problems In Collaborative Work Environments, Ashutosh Shivakumar Jan 2022

Browse all Theses and Dissertations

Task success in co-located and distributed collaborative work settings is characterized by clear and efficient communication between participating members. Communication issues such as 1) unwanted interruptions and 2) delayed feedback in collaborative-work-based distributed scenarios have the potential to impede task coordination and significantly decrease the probability of accomplishing the task objective. Research shows that 1) interrupting tasks at random moments can cause users to take up to 30% longer to resume tasks, commit up to twice the errors, and experience up to twice the negative effect than when interrupted at boundaries, and 2) skill retention in collaborative learning tasks improves with …


Semantically Meaningful Sentence Embeddings, Rojina Deuja Dec 2021

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Text embedding is an approach used in Natural Language Processing (NLP) to represent words, phrases, sentences, and documents. It is the process of obtaining numeric representations of text to feed into machine learning models as vectors (arrays of numbers). One of the biggest challenges in text embedding is representing longer text segments like sentences. These representations should capture the meaning of the segment and the semantic relationship between its constituents. Such representations are known as semantically meaningful embeddings. In this thesis, we seek to improve upon the quality of sentence embeddings that capture semantic information.

The current state-of-the-art models are …


Exploiting BERT And RoBERTa To Improve Performance For Aspect Based Sentiment Analysis, Gagan Reddy Narayanaswamy Jan 2021

Dissertations

Sentiment analysis, also known as opinion mining, is a type of text research that analyses people’s opinions expressed in written language. Sentiment analysis brings together various research areas such as Natural Language Processing (NLP), Data Mining, and Text Mining, and is fast becoming of major importance to companies and organizations as it has started to incorporate online commerce data for analysis. Often the data on which sentiment analysis is performed will be reviews. The data can range from reviews of a small product to a big multinational corporation. The goal of performing sentiment analysis is to extract information from those …


Evaluating The Performance Of Transformer Architecture Over Attention Architecture On Image Captioning, Deepti Balasubramaniam Jan 2021

Dissertations

Over the last few decades, computer vision and natural language processing have shown tremendous improvement in tasks such as image captioning, video captioning, and machine translation using deep learning models. However, there has not been much research on image captioning based on transformers and how it outperforms other models implemented for image captioning. In this study, we design a simple encoder-decoder model, an attention model, and a transformer model for image captioning using the Flickr8K dataset, and we discuss the hyperparameters of the models, the type of pre-trained model used, and how long each model was trained. …


Human-AI Teaming For Dynamic Interpersonal Skill Training, Xavian Alexander Ogletree Jan 2021

Browse all Theses and Dissertations

In almost every field, there is a need for strong interpersonal skills. This is especially true in fields such as medicine, psychology, and education. For instance, healthcare providers need to show understanding and compassion for LGBTQ+ and BIPOC (Black, Indigenous, and People of Color), or individuals with unique developmental or mental health needs. Improving interpersonal skills often requires first-person experience with expert evaluation and guidance to achieve proficiency. However, due to limited availability of assessment capabilities, professional standardized patients and instructional experts, students and professionals currently have inadequate opportunities for expert-guided training sessions. Therefore, this research aims to demonstrate leveraging …


Detecting And Correcting Automatic Speech Recognition Errors With A New Model, Recep Sinan Arslan, Necaattin Barişçi, Nursal Arici, Sabri Koçer Jan 2021

Turkish Journal of Electrical Engineering and Computer Sciences

The purpose of automatic speech recognition (ASR) systems is to recognize speech signals obtained from people and convert them into text so that they can be processed by a computer. Although many ASR applications are versatile and widely used in the real world, they still generate relatively inaccurate results. They tend to generate spelling errors in recognized words, especially in noisy environments, in situations where the vocabulary size is increased, and at times when the input speech is of poor quality. The permanent presence of errors in ASR systems has led to the need to find alternative methods for automatic …


Survey On Deep Neural Networks In Speech And Vision Systems, M. Alam, Manar D. Samad, Lasitha Vidyaratne, Alexander Glandon, Khan M. Iftekharuddin Dec 2020

Computer Science Faculty Research

This survey presents a review of state-of-the-art deep neural network architectures, algorithms, and systems in speech and vision applications. Recent advances in deep artificial neural network algorithms and architectures have spurred rapid innovation and development of intelligent speech and vision systems. With availability of vast amounts of sensor data and cloud computing for processing and training of deep neural networks, and with increased sophistication in mobile and embedded technology, the next-generation intelligent systems are poised to revolutionize personal and commercial computing. This survey begins by providing background and evolution of some of the most successful deep learning models for intelligent …


Efficient Turkish Tweet Classification System For Crisis Response, Saed Alqaraleh, Merve Işik Jan 2020

Turkish Journal of Electrical Engineering and Computer Sciences

This paper presents a convolutional neural network-based Turkish tweet classification system for crisis response. This system has the ability to classify the present information before or during any crisis. In addition, a preprocessing model was also implemented and integrated as a part of the developed system. This paper presents the first ever Turkish tweet dataset for crisis response, which can be widely used and improve similar studies. This dataset has been carefully preprocessed, annotated, and well organized. It is suitable to be used by all the well-known natural language processing tools. Extensive experimental work, using our produced Turkish tweet dataset …


Automated Labeling Of Terms In Medical Reports In Serbian, Aldina Avdic, Ulfeta Marovac, Dragan Jankovic Jan 2020

Turkish Journal of Electrical Engineering and Computer Sciences

Nowadays, many electronic health reports (EHRs) are stored daily. They consist of the structured part and of an unstructured section written in natural language. Due to the limited time for medical examination, EHRs are short reports which often contain errors and abbreviations. Therefore it is a challenge to process an EHR and extract knowledge from this part of the text for different purposes. This paper compares the results of three proposed methods for automatic labeling of medical terms in unstructured parts of EHRs. All words are categorized as words within the medical domain (symptoms, diagnoses, therapies, anatomy, specialties etc.) and …


A Data Driven Approach To Identify Journalistic 5Ws From Text Documents, Venkata Krishna Mohan Sunkara Jun 2019

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Textual understanding is the process of automatically extracting accurate high-quality information from text. The amount of textual data available from different sources such as news, blogs and social media is growing exponentially. These data encode significant latent information which if extracted accurately can be valuable in a variety of applications such as medical report analyses, news understanding and societal studies. Natural language processing techniques are often employed to develop customized algorithms to extract such latent information from text.

Journalistic 5Ws refer to the basic information in news articles that describes an event and include where, when, who, what and why …


Automatic Concept Identification Of Software Requirements In Turkish, Fatma Bozyiğit, Özlem Aktaş, Deniz Kilinç Jan 2019

Turkish Journal of Electrical Engineering and Computer Sciences

Software requirements include description of the features for the target system and express the expectations of users. In the analysis phase, requirements are transformed into easy-to-understand conceptual models that facilitate communication between stakeholders. Although creating conceptual models using requirements is mostly implemented manually by analysts, the number of models that automate this process has increased recently. Most of the models and tools are developed to analyze requirements in English, and there is no study for agglutinative languages such as Turkish or Finnish. In this study, we propose an automatic concept identification model that transforms Turkish requirements into Unified Modeling Language …


A Hybrid Sentiment Analysis Method For Turkish, Buket Erşahin, Özlem Aktaş, Deniz Kilinç, Mustafa Erşahin Jan 2019

Turkish Journal of Electrical Engineering and Computer Sciences

This paper presents a hybrid methodology for Turkish sentiment analysis, which combines the lexicon-based and machine learning (ML)-based approaches. On the lexicon-based side, we use a sentiment dictionary that is extended with a synonyms lexicon. Besides this, we tackle the classification problem with three supervised classifiers, naive Bayes, support vector machines, and J48, on the ML side. Our hybrid methodology combines these two approaches by generating a new lexicon-based value according to our feature generation algorithm and feeds it as one of the features to machine learning classifiers. Despite the linguistic challenges caused by the morphological structure of Turkish, the …


A Transfer Learning Approach For Sentiment Classification, Omar Abdelwahab Dec 2018

Electronic Theses and Dissertations

The idea of developing machine learning systems or Artificial Intelligence agents that learn from different tasks and accumulate that knowledge over time, so that they function successfully on a new task they have not seen before, is a research area that is still being explored. In this work, we lay out an algorithm that allows a machine learning system or an AI agent to learn from k different domains and then use some or no data from the new task to perform strongly on that new task. In order …


Measuring Goal Similarity Using Concept, Context And Task Features, Vahid Eyorokon Jan 2018

Browse all Theses and Dissertations

Goals can be described as the user's desired state of the agent and the world and are satisfied when the agent and the world are altered in such a way that the present state matches the desired state. Physical agents must act in the world to alter it through a series of individual atomic actions. Traditionally, agents use planning to create a chain of actions, each of which alters the current world state and yields a new one, until the final action yields the desired goal state. Once this goal state has been achieved, the goal is said …


Implementing Universal Dependency, Morphology, And Multiword Expression Annotation Standards For Turkish Language Processing, Umut Sulubacak, Gülşen Eryiğit Jan 2018

Turkish Journal of Electrical Engineering and Computer Sciences

Released only a year ago as the outputs of a research project ("Parsing Web 2.0 Sentences", supported in part by a TÜBİTAK 1001 grant (No. 112E276) and a part of the ICT COST Action PARSEME (IC1207)), IMST and IWT are currently the most comprehensive Turkish dependency treebanks in the literature. This article introduces the final states of our treebanks, as well as a newly integrated hierarchical categorization of the multiheaded dependencies and their organization in an exclusive deep dependency layer in the treebanks. It also presents the adaptation of recent studies on standardizing multiword expression and named entity annotation schemes …


Relation Extraction Via One-Shot Dependency Parsing On Intersentential, Higher-Order, And Nested Relations, Gözde Gül Şahin, Erdem Emekligil, Seçil Arslan, Onur Ağin, Gülşen Eryiğit Jan 2018

Turkish Journal of Electrical Engineering and Computer Sciences

Despite the emergence of digitalization, people still interact with institutions via traditional means such as submitting free formatted petitions, orders, or applications. These noisy documents generally consist of complex relations that are nested, higher-order, and intersentential. Most of the current approaches address extraction of only sentence-level and binary relations from grammatically correct text and generally require high-level linguistic features coming from preprocessors such as a parts-of-speech tagger, chunker, or syntactic parser. In this article, we focus on extracting complex relations in order to automate the task of understanding user intentions. We propose a novel language-agnostic and noise-immune approach that does …


Unsupervised Learning Of Allomorphs In Turkish, Burcu Can Jan 2017

Turkish Journal of Electrical Engineering and Computer Sciences

One morpheme may have several surface forms that correspond to allomorphs. In English, ed and d are surface forms of the past tense morpheme, and s, es, and ies are surface forms of the plural or present tense morpheme. Turkish has a large number of allomorphs due to its morphophonemic processes. One morpheme can have tens of different surface forms in Turkish. This leads to a sparsity problem in natural language processing tasks in Turkish. Detection of allomorphs has not been studied much because of its difficulty. For example, tü and di are Turkish allomorphs (i.e. past tense morpheme), but …