Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

2020

Master's Projects

Word2Vec

Articles 1 - 2 of 2

Full-Text Articles in Physical Sciences and Mathematics

Word Embedding Techniques For Malware Classification, Aniket Chandak May 2020

Master's Projects

Word embeddings are often used in natural language processing as a means to quantify relationships between words. More generally, these same word embedding techniques can be used to quantify relationships between features. In this paper, we conduct a series of experiments designed to determine the effectiveness of word embeddings in the context of malware classification. First, we conduct experiments where hidden Markov models (HMMs) are applied directly to opcode sequences. These results establish a baseline for comparison with our subsequent word embedding experiments. We then experiment with word embedding vectors derived from HMMs, a technique that …


Comparison Of Word2vec With Hash2vec For Machine Translation, Neha Gaikwad May 2020

Master's Projects

Machine translation is the study of using computers to translate text written in one human language into text in a different language. Within this field, a word embedding is a mapping from terms in a language to low-dimensional vectors that can be processed using mathematical operations. Two traditional word embedding approaches are word2vec, which uses a neural network, and hash2vec, which is based on a simpler hashing algorithm. In this project, we explored the relative suitability of each approach for sequence-to-sequence text translation using a recurrent neural network (RNN). We also carried out experiments to test …
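
As a rough illustration of the idea that a word embedding maps terms to short numeric vectors, and that hashing alone can produce such vectors, here is a minimal Python sketch. It is not the hash2vec algorithm evaluated in the project, whose details the truncated abstract does not give; it is a generic feature-hashing scheme chosen for illustration. The names `bucket_and_sign`, `hashed_context_embeddings`, `cosine`, the dimensionality `DIM`, and the window size are all assumptions.

```python
import hashlib
import math
from collections import defaultdict

DIM = 16  # assumed toy dimensionality; real embeddings are usually larger


def bucket_and_sign(token, dim=DIM):
    """Map a token to a (bucket index, +1/-1 sign) pair with a stable hash."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % dim
    sign = 1 if digest[4] % 2 == 0 else -1
    return index, sign


def hashed_context_embeddings(sentences, window=2, dim=DIM):
    """Build one vector per word by hashing its neighbouring words into buckets."""
    vectors = defaultdict(lambda: [0.0] * dim)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j == i:
                    continue  # a word is not its own context
                index, sign = bucket_and_sign(tokens[j], dim)
                vectors[word][index] += sign
    return dict(vectors)


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical two-sentence corpus: "cat" and "dog" share the same contexts.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
vectors = hashed_context_embeddings(corpus)
```

In this toy corpus, `vectors["cat"]` and `vectors["dog"]` come out identical because the two words appear with exactly the same neighbours. Unlike word2vec, nothing here is trained: the vectors follow directly from the hash function and the co-occurrence counts.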