Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Articles 1 - 30 of 33

Full-Text Articles in Engineering

Machine Learning Prediction Of HEA Properties, Nicholas J. Beaver, Nathaniel Melisso, Travis Murphy Oct 2023

College of Engineering Summer Undergraduate Research Program

High-entropy alloys (HEAs) are a very new development in the field of metallurgical materials. Unlike traditional alloys, they are made up of multiple principal elements, which contributes to their high configurational entropy. The microstructure and properties of HEAs are not well predicted by the models developed for more common engineering alloys, and there is not enough data available on HEAs to fully represent the complex behavior of these alloys. To that end, we explore how machine learning models can be used to model the complex, high-dimensional behavior in the HEA composition space. Based on our …


Generalization Through Diversity: Improving Unsupervised Environment Design, Wenjun Li, Pradeep Varakantham, Dexun Li Aug 2023

Research Collection School Of Computing and Information Systems

Agent decision making using Reinforcement Learning (RL) heavily relies on either a model or simulator of the environment (e.g., moving in an 8x8 maze with three rooms, playing Chess on an 8x8 board). Due to this dependence, small changes in the environment (e.g., positions of obstacles in the maze, size of the board) can severely affect the effectiveness of the policy learned by the agent. To that end, existing work has proposed training RL agents on an adaptive curriculum of environments (generated automatically) to improve performance on out-of-distribution (OOD) test scenarios. Specifically, existing research has employed the potential for the …


ChatGPT As Metamorphosis Designer For The Future Of Artificial Intelligence (AI): A Conceptual Investigation, Amarjit Kumar Singh (Library Assistant), Dr. Pankaj Mathur (Deputy Librarian) Mar 2023

Library Philosophy and Practice (e-journal)

Abstract

Purpose: The purpose of this research paper is to explore ChatGPT’s potential as an innovative designer tool for the future development of artificial intelligence. Specifically, this conceptual investigation aims to analyze ChatGPT’s capabilities as a tool for designing and developing near-human intelligent systems for future use in the field of Artificial Intelligence (AI). The paper also analyzes the strengths and weaknesses of ChatGPT as a tool and identifies possible areas for improvement in its development and implementation. This investigation focused on the various features and functions of ChatGPT that …


Drone Detection Using YOLOv5, Burchan Aydin, Subroto Singha Feb 2023

Faculty Publications

The rapidly increasing number of drones in the national airspace, including those for recreational and commercial applications, has raised concerns regarding misuse. Autonomous drone detection systems offer a possible solution to the issue of potential drone misuse, such as drug smuggling, violating people’s privacy, etc. Detecting drones can be difficult, due to similar objects in the sky, such as airplanes and birds. In addition, automated drone detection systems need to be trained with ample amounts of data to provide high accuracy. Real-time detection is also necessary, but this requires highly configured devices such as a graphics processing unit (GPU). …


Machine Learning Predictions Of Electricity Capacity, Marcus Harris, Elizabeth Kirby, Ameeta Agrawal, Rhitabrat Pokharel, Francis Puyleart, Martin Zwick Jan 2023

Systems Science Faculty Publications and Presentations

This research applies machine learning methods to build predictive models of Net Load Imbalance for the Resource Sufficiency Flexible Ramping Requirement in the Western Energy Imbalance Market. Several methods are used in this research, including Reconstructability Analysis, developed in the systems community, and more well-known methods such as Bayesian Networks, Support Vector Regression, and Neural Networks. The aims of the research are to identify predictive variables and obtain a new stand-alone model that improves prediction accuracy and reduces the INC (ability to increase generation) and DEC (ability to decrease generation) Resource Sufficiency Requirements for Western Energy Imbalance Market participants. This …


Analyzing Ground Motion Records With CVI Fuzzy ART, Dustin Tanksley, Xinzhe Yuan, Genda Chen, Donald C. Wunsch Jan 2023

Civil, Architectural and Environmental Engineering Faculty Research & Creative Works

This paper explores using Cluster Validity Indices Fuzzy Adaptive Resonance Theory (CVI Fuzzy ART) to cluster ground motion records (GMRs). Clustering the features extracted from a supervised network trained for predicting the structure damage results in less overfitting from the trained network. Using Cluster Validity Indices (CVIs) to evaluate the clustering gives feedback on how well the data is being classified, allowing further separation of the data. By using CVI Fuzzy ART in combination with features extracted from a trained Convolutional Neural Network (CNN), we were able to form additional clusters in the data. Within the primary clusters, accuracy was …


LearnFCA: A Fuzzy FCA And Probability Based Approach For Learning And Classification, Suraj Ketan Samal Dec 2022

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Formal concept analysis (FCA) is a mathematical theory based on lattice and order theory used for data analysis and knowledge representation. Over the past several years, many of its extensions have been proposed and applied in several domains including data mining, machine learning, knowledge management, semantic web, software development, chemistry, biology, medicine, data analytics, and ontology engineering.

This thesis reviews the state of the art of the theory of Formal Concept Analysis (FCA) and its various extensions that have been developed and well studied in the past several years. We discuss their historical roots and reproduce the original definitions and derivations with illustrative examples. Further, we provide …


A Learning And Optimization Framework For Collaborative Urban Delivery Problems With Alliances, Jingfeng Yang, Hoong Chuin Lau Sep 2021

Research Collection School Of Computing and Information Systems

The emergence of e-Commerce imposes a tremendous strain on urban logistics which in turn raises concerns on environmental sustainability if not performed efficiently. While large logistics service providers (LSPs) can perform fulfillment sustainably as they operate extensive logistic networks, last-mile logistics are typically performed by small LSPs who need to form alliances to reduce delivery costs and improve efficiency, and to compete with large players. In this paper, we consider a multi-alliance multi-depot pickup and delivery problem with time windows (MAD-PDPTW) and formulate it as a mixed-integer programming (MIP) model. To cope with large-scale problem instances, we propose a two-stage …


Using Contextual Bandits To Improve Traffic Performance In Edge Network, Aziza Al Zadjali Aug 2021

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Edge computing networks are a great candidate to reduce latency and enhance the performance of the Internet. The flexibility afforded by Edge computing to handle data creates an exciting range of possibilities. However, Edge servers have some limitations, since Edge computing processes and analyzes partial sets of information. It is challenging to allocate computing and network resources rationally to satisfy the requirements of mobile devices under an uncertain wireless network while also meeting the constraints of datacenter servers. To combat these issues, this dissertation proposes smart multi-armed bandit algorithms that decide the appropriate connection setup for multiple network access technologies on the …


Real-Time Monitoring Of FDM 3D Printer For Fault Detection Using Machine Learning: A Bibliometric Study, Vaibhav Kisan Kadam, Satish Kumar, Arunkumar Bongale May 2021

Library Philosophy and Practice (e-journal)

Additive Manufacturing (AM) has a wide range of applications, including healthcare, fashion, manufacturing, prototypes, tooling, etc. AM techniques are subject to various defects, which may be printing defects or anomalies in the machine. There is a gap between current AM techniques and smart manufacturing, since current AM lacks the built-in sensors necessary for process monitoring and fault detection. Both of these issues can be solved by incorporating real-time monitoring into AM. This study is therefore carried out to identify recent work done in AM to improve the current system. For this bibliometric study the Scopus database is used, and the study is limited to the years 2010-2021 and English …


Characterizing Students’ Engineering Design Strategies Using Energy3D, Jasmine Singh, Viranga Perera, Alejandra Magana, Brittany Newell Apr 2021

Discovery Undergraduate Interdisciplinary Research Internship

The goals of this study are to characterize design actions that students performed when solving a design challenge, and to create a machine learning model to help future students make better engineering design choices. We analyze data from an introductory engineering course where students used Energy3D, an open source computer-aided design software, to design a zero-energy home (i.e. a home that consumes no net energy over a period of a year). Student design actions within the software were recorded into text files. Using a sample of over 300 students, we first identify patterns in the data to assess how students …


Representational Learning Approach For Predicting Developer Expertise Using Eye Movements, Sumeet Maan Dec 2020

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

The thesis analyzes an existing eye-tracking dataset collected while software developers were solving bug-fixing tasks in an open-source system. The analysis is performed using a representational learning approach, namely a Multi-layer Perceptron (MLP). The novel aspect of the analysis is the introduction of a new feature engineering method based on the eye-tracking data. This is then used to predict developer expertise on the data. The dataset used in this thesis is inherently more complex because it is collected in a very dynamic environment, i.e., the Eclipse IDE using an eye-tracking plugin, iTrace. Previous work in this area only worked on …


Defense By Deception Against Stealthy Attacks In Power Grids, Md Hasan Shahriar Nov 2020

FIU Electronic Theses and Dissertations

Cyber-physical Systems (CPSs) and the Internet of Things (IoT) are converging towards a hybrid platform that is becoming ubiquitous in all modern infrastructures. The integration of the complex and heterogeneous systems creates enormous space for the adversaries to get into the network and inject cleverly crafted false data into measurements, misleading the control center to make erroneous decisions. Besides, the attacker can make a critical part of the system unavailable by compromising the sensor data availability. To obfuscate and mislead the attackers, we propose DDAF, a deceptive data acquisition framework for CPSs' hierarchical communication network. Each switch in the hierarchical …


Forecasting Vegetation Health In The MENA Region By Predicting Vegetation Indicators With Machine Learning Models, Sachi Perera, Wenzhao Li, Erik Linstead, Hesham El-Askary Sep 2020

Mathematics, Physics, and Computer Science Faculty Articles and Research

Machine learning (ML) techniques can be applied to predict and monitor drought conditions due to climate change. Predicting future vegetation health indicators (such as EVI, NDVI, and LAI) is one approach to forecast drought events for hotspots (e.g. Middle East and North Africa (MENA) regions). Recently, ML models were implemented to predict EVI values using parameters such as land types, time series, historical vegetation indices, land surface temperature, soil moisture, evapotranspiration etc. In this work, we collected the MODIS atmospherically corrected surface spectral reflectance imagery with multiple vegetation related indices for modeling and evaluation of drought conditions in the MENA …


Evaluation Of Standard And Semantically-Augmented Distance Metrics For Neurology Patients, Daniel B. Hier, Jonathan Kopel, Steven U. Brint, Donald C. Wunsch, Gayla R. Olbricht, Sima Azizi, Blaine Allen Aug 2020

Electrical and Computer Engineering Faculty Research & Creative Works

Background: Patient distances can be calculated based on signs and symptoms derived from an ontological hierarchy. There is controversy as to whether patient distance metrics that consider the semantic similarity between concepts can outperform standard patient distance metrics that are agnostic to concept similarity. The choice of distance metric can dominate the performance of classification or clustering algorithms. Our objective was to determine if semantically augmented distance metrics would outperform standard metrics on machine learning tasks.

Methods: We converted the neurological findings from 382 published neurology cases into sets of concepts with corresponding machine-readable codes. We calculated patient distances by …


Routing Optimization In Heterogeneous Wireless Networks For Space And Mission-Driven Internet Of Things (IoT) Environments, Sara El Alaoui Aug 2020

Department of Electrical and Computer Engineering: Dissertations, Theses, and Student Research

As technological advances have made it possible to build cheap devices with more processing power and storage, and that are capable of continuously generating large amounts of data, the network has to undergo significant changes as well. The rising number of vendors and variety in platforms and wireless communication technologies have introduced heterogeneity to networks compromising the efficiency of existing routing algorithms. Furthermore, most of the existing solutions assume and require connection to the backbone network and involve changes to the infrastructures, which are not always possible -- a 2018 report by the Federal Communications Commission shows that over 31% …


Advanced Techniques To Detect Complex Android Malware, Zhiqiang Li Apr 2020

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Android is currently the most popular operating system for mobile devices in the world. However, its openness is the main reason that the majority of malware targets Android devices. Various approaches have been developed to detect malware.

Unfortunately, new breeds of malware utilize sophisticated techniques to defeat malware detectors. For example, to defeat signature-based detectors, malware authors change the malware’s signatures to avoid detection. As such, a more effective approach to detect malware is by leveraging malware’s behavioral characteristics. However, if a behavior-based detector is based on static analysis, its reported results may contain a large number of …


An Examination Of The SMOTE And Other SMOTE-Based Techniques That Use Synthetic Data To Oversample The Minority Class In The Context Of Credit-Card Fraud Classification, Eduardo Parkinson De Castro Jan 2020

Dissertations

This research project seeks to investigate some of the different sampling techniques that generate and use synthetic data to oversample the minority class as a means of handling the imbalanced distribution between non-fraudulent (majority class) and fraudulent (minority class) classes in a credit-card fraud dataset. The purpose of the research project is to assess the effectiveness of these techniques in the context of fraud detection, where the data are highly imbalanced and cost-sensitive. Machine learning tasks that require learning from highly unbalanced datasets are difficult, since many of the traditional learning algorithms are not designed to cope …


Machine Learning Assisted Gait Analysis For The Determination Of Handedness In Able-Bodied People, Hugh Gallagher Jan 2020

Dissertations

This study has investigated the potential application of machine learning for video analysis, with a view to creating a system which can determine a person’s hand laterality (handedness) from the way that they walk (their gait). To this end, the convolutional neural network model VGG16 underwent transfer learning in order to classify videos under two ‘activities’: “walking left-handed” and “walking right-handed”. This saw varying degrees of success across five transfer learning trained models: Everything – the entire dataset; FiftyFifty – the dataset with enough right-handed samples removed to produce a set with parity between activities; Female – only the female …


Using Machine Learning Classification Methods To Detect The Presence Of Heart Disease, Nestor Pereira Dec 2019

Dissertations

Cardiovascular disease (CVD) is the most common cause of death in Ireland and probably worldwide. According to the Health Service Executive (HSE), cardiovascular disease accounts for 36% of all deaths and, notably, 22% of premature deaths (under age 65) are from CVD.

Using data from the Heart Disease UCI Data Set (UCI Machine Learning), we use machine learning techniques to detect the presence or absence of heart disease in the patient according to the 14 features provided in this dataset. The different results are compared based on accuracy performance, confusion matrix and area under the Receiver Operating Characteristics (ROC) …


Factor Analysis Of Mixed Data (FAMD) And Multiple Linear Regression In R, Nestor Pereira Dec 2019

Dissertations

Previous projects statistically analyzed the factors that affect the scores in Mathematics and Portuguese for several groups of secondary school students in Portugal.

This project is interested in finding a model, hypothetically a multiple linear regression, to predict the student's final score (the dependent variable G3) according to features divided into two groups. The first group analyzes the features or predictors affecting the final score that are more related to student performance, that is, variables like study time or past failures. The second …


Sensor Emulation With Physiological Data In Immersive Virtual Reality Driving Simulator, Jungsu Pak, Oliver Mathias, Ariane Guirguis, Uri Maoz Dec 2019

Student Scholar Symposium Abstracts and Posters

Can we enhance the safety and comfort of AVs by training them with physiological data from human drivers? We will train and compare AV algorithms with and without physiological data.


Development Of An Autonomous Aerial Toolset For Agricultural Applications, Terrance Life Oct 2019

Mahurin Honors College Capstone Experience/Thesis Projects

According to the United Nations, the world population is expected to grow from its current 7 billion to 9.7 billion by the year 2050. During this time, global food demand is also expected to increase by between 59% and 98% due to the population increase, accompanied by an increasing demand for protein due to a rising standard of living throughout developing countries. [1] Meeting this increase in required food production using present agricultural practices would necessitate a similar increase in farmland; a resource which does not exist in abundance. Therefore, in order to meet growing food demands, new methods will …


Exploring Age-Related Metamemory Differences Using Modified Brier Scores And Hierarchical Clustering, Chelsea Parlett-Pelleriti, Grace C. Lin, Masha R. Jones, Erik Linstead, Susanne M. Jaeggi Jan 2019

Engineering Faculty Articles and Research

Older adults (OAs) typically experience memory failures as they age. However, with some exceptions, studies of OAs’ ability to assess their own memory functions—Metamemory (MM)— find little evidence that this function is susceptible to age-related decline. Our study examines OAs’ and young adults’ (YAs) MM performance and strategy use. Groups of YAs (N = 138) and OAs (N = 79) performed a MM task that required participants to place bets on how likely they were to remember words in a list. Our analytical approach includes hierarchical clustering, and we introduce a new measure of MM—the modified Brier—in order to adjust …


absO2luteU-Net: Tissue Oxygenation Calculation Using Photoacoustic Imaging And Convolutional Neural Networks, Kevin Hoffer-Hawlik, Geoffrey P. Luke Jan 2019

ENGS 88 Honors Thesis (AB Students)

Photoacoustic (PA) imaging uses incident light to generate ultrasound signals within tissues. Using PA imaging to accurately measure hemoglobin concentration and calculate oxygenation (sO2) requires prior tissue knowledge and costly computational methods. However, this thesis shows that machine learning algorithms can accurately and quickly estimate sO2. absO2luteU-Net, a convolutional neural network, was trained on Monte Carlo simulated multispectral PA data and predicted sO2 with higher accuracy compared to simple linear unmixing, suggesting machine learning can solve the fluence estimation problem. This project was funded by the Kaminsky Family Fund and the Neukom Institute.


Predicting Violent Crime Reports From Geospatial And Temporal Attributes Of US 911 Emergency Call Data, Vincent Corcoran Jan 2019

Dissertations

The aim of this study is to create a model to predict which 911 calls will result in crime reports of a violent nature. Such a prediction model could be used by the police to prioritise calls which are most likely to lead to violent crime reports. The model will use geospatial and temporal attributes of the call to predict whether a crime report will be generated. To create this model, a dataset of characteristics relating to the neighbourhood where the 911 call originated will be created and combined with characteristics related to the time of the 911 call. Geospatial …


Performance Comparison Of Hybrid CNN-SVM And CNN-XGBoost Models In Concrete Crack Detection, Sahana Thiyagarajan Jan 2019

Dissertations

Crack detection is an essential step in the visual inspection involved in construction engineering, as concrete is a commonly used building material and cracks in it are an early sign of degradation. It is hard to find cracks in massive structures by visual checks, so the development of crack detection systems has been a critical issue. The use of contextual image processing in crack detection is constrained, because image data taken under real-world conditions vary widely and because it involves the complex modelling of cracks and the extraction of handcrafted features. Therefore the …


A Comprehensive Framework To Replicate Process-Level Concurrency Faults, Supat Rattanasuksun Nov 2018

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Concurrency faults are one of the most damaging types of faults that can affect the dependability of today’s computer systems. Currently, concurrency faults such as process-level races, order violations, and atomicity violations represent the largest class of faults that has been reported to various Linux bug repositories. Clearly, existing approaches for testing such faults during software development processes are not adequate as these faults escape in-house testing efforts and are discovered during deployment and must be debugged.

The main reason concurrency faults are hard to test is that the conditions that allow these to occur can be difficult to replicate, …


Game-Theoretic And Machine-Learning Techniques For Cyber-Physical Security And Resilience In Smart Grid, Longfei Wei Oct 2018

FIU Electronic Theses and Dissertations

The smart grid is the next-generation electrical infrastructure utilizing Information and Communication Technologies (ICTs), whose architecture is evolving from a utility-centric structure to a distributed Cyber-Physical System (CPS) integrated with large-scale renewable energy resources. However, meeting reliability objectives in the smart grid becomes increasingly challenging owing to the high penetration of renewable resources and changing weather conditions. Moreover, cyber-physical attacks targeting the smart grid have become a major threat because millions of electronic devices interconnected via communication networks expose unprecedented vulnerabilities, thereby increasing the potential attack surface. This dissertation is aimed at developing novel game-theoretic and …


Semantic Visualization For Short Texts With Word Embeddings, Van Minh Tuan Le, Hady W. Lauw Aug 2017

Research Collection School Of Computing and Information Systems

Semantic visualization integrates topic modeling and visualization, such that every document is associated with a topic distribution as well as visualization coordinates on a low-dimensional Euclidean space. We address the problem of semantic visualization for short texts. Such documents are increasingly common, including tweets, search snippets, news headlines, or status updates. Due to their short lengths, it is difficult to model semantics as the word co-occurrences in such a corpus are very sparse. Our approach is to incorporate auxiliary information, such as word embeddings from a larger corpus, to supplement the lack of co-occurrences. This requires the development of a …