Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

Nonparametric Bayesian Deep Learning For Scientific Data Analysis, Devanshu Agrawal Dec 2020

Doctoral Dissertations

Deep learning (DL) has emerged as the leading paradigm for predictive modeling in a variety of domains, especially those involving large volumes of high-dimensional spatio-temporal data such as images and text. With the rise of big data in scientific and engineering problems, there is now considerable interest in the research and development of DL for scientific applications. The scientific domain, however, poses unique challenges for DL, including special emphasis on interpretability and robustness. In particular, a priority of the Department of Energy (DOE) is the research and development of probabilistic ML methods that are robust to overfitting and offer reliable …


Generating Adversarial Examples For Recruitment Ranking Algorithms, Anahita Samadi Dec 2020

Computer Science and Engineering Theses

There is no doubt that the recruitment process plays an important role for both employers and applicants. Given the huge number of job candidates and open vacancies, the recruitment process is expensive, time-consuming, and stressful for both applicants and companies. Today, many recruitment processes rely on machine learning techniques, so it is very important to ensure the security of these algorithms. Adversarial examples have been proposed to examine the vulnerability of machine learning algorithms. Many research studies have evaluated the resistance of artificial intelligence-based systems, in computer vision and text classification, against adversarial examples. However, to the …


Exploring Explicit And Implicit Feature Spaces In Natural Language Processing Using Self-Enrichment And Vector Space Analysis, Vincent Sippola Oct 2020

Electronic Thesis and Dissertation Repository

Machine Learning in Natural Language Processing (NLP) deals directly with distributed representations of words and sentences. Words are transformed into vectors of real values, called embeddings, and used as the inputs to machine learning models. These architectures are then used to solve NLP tasks such as Sentiment Analysis and Natural Language Inference. While solving these tasks many models will create word embeddings and sentence embeddings as outputs. We are interested in how we can transform and analyze these output embeddings and modify our models, to both improve the task result and give us an understanding of the spaces. To this …


Enrichment Of Ontologies Using Machine Learning And Summarization, Hao Liu Aug 2020

Dissertations

Biomedical ontologies are structured knowledge systems in biomedicine. They play a major role in enabling precise communication in support of healthcare applications, e.g., Electronic Healthcare Records (EHR) systems. Biomedical ontologies are used in many different contexts to facilitate information and knowledge management. The most widely used clinical ontology is SNOMED CT. Placing a new concept into its proper position in an ontology is a fundamental task in its lifecycle of curation and enrichment.

A large biomedical ontology, which typically consists of many tens of thousands of concepts and relationships, can be viewed as a complex network with concepts as …


Mind Maps And Machine Learning: An Automation Framework For Qualitative Research In Entrepreneurship Education, Yasser Farha Aug 2020

Dissertations

Entrepreneurship Education researchers often measure the entrepreneurial motivation of college students. It is important for stakeholders, such as policymakers and educators, to ascertain whether entrepreneurship education can encourage students to become entrepreneurs, as well as to understand the factors that influence entrepreneurial motivation. For that purpose, researchers have used different methods and instruments to measure students' entrepreneurial motivation. Most of these methods are quantitative, e.g., closed-ended surveys, whereas qualitative methods, e.g., open-ended surveys, are rarely used.

Mind maps are an attractive qualitative survey tool because they capture the individual's reflections, thoughts, and experiences. For Entrepreneurship Education, mind maps can be utilized to …


Transfer Learning: Bridging The Gap Between Deep Learning And Domain-Specific Text Mining, Chaoran Cheng May 2020

Dissertations

Inspired by the success of deep learning techniques in Natural Language Processing (NLP), this dissertation tackles domain-specific text mining problems for which generic deep learning approaches would fail. More specifically, the domain-specific problems are: (1) success prediction in crowdfunding, (2) variant identification in biomedical literature, and (3) text data augmentation for low-resource domains.

In the first part, transfer learning from a multimodal perspective is used to predict project success in crowdfunding applications. Even though the information in a project profile can be of different modalities such as text, images, and metadata, most existing …


Cross Language Information Transfer Between Modern Standard Arabic And Its Dialects – A Framework For Automatic Speech Recognition System Language Model, Tiba Zaki Abdulhameed Apr 2020

Dissertations

Significant advances have been made with Modern Standard Arabic (MSA) Automatic Speech Recognition (ASR) applications. Yet, dialectal conversation ASR is still trailing behind due to limited language resources. As is the case in most cultures, the formal Modern Standard Arabic language is not used in daily life. Instead, a variety of regional dialects are spoken, which creates a pressing need for dialect ASR systems. Processing MSA naturally poses considerable challenges that are passed on to the processing of its derived dialects. In dialects, many words have gradually morphed from their MSA pronunciations and often have different usages. Also, …


Learning Latent Characteristics Of Data And Models Using Item Response Theory, John P. Lalor Mar 2020

Doctoral Dissertations

A supervised machine learning model is trained with a large set of labeled training data, and evaluated on a smaller but still large set of test data. Especially with deep neural networks (DNNs), the complexity of the model requires that an extremely large data set be collected to prevent overfitting. These models often do not take into account specific attributes of the training set examples, but instead treat each example equally during model training. This is because it is difficult to model latent traits of individual examples at the …


Automated Change Detection In Privacy Policies, Andrick Adhikari Jan 2020

Electronic Theses and Dissertations

Privacy policies notify Internet users about the privacy practices of websites, mobile apps, and other products and services. However, users rarely read them and struggle to understand their contents. Also, the entities that provide these policies are sometimes unmotivated to make them comprehensible. Because these documents are complicated, it becomes even harder for users to understand and take note of any changes of interest or concern when the policies are revised.

With recent developments in machine learning and natural language processing, tools that can automatically annotate sentences of policies have been developed. These annotations can …


Using Natural Language Processing To Categorize Fictional Literature In An Unsupervised Manner, Dalton J. Crutchfield Jan 2020

Electronic Theses and Dissertations

When following a plot in a story, categorization is something that humans do without even thinking; whether this is simple classification like “This is science fiction” or more complex trope recognition like recognizing a Chekhov's gun or a rags-to-riches storyline, humans group stories with other similar stories. Research has been done to categorize basic plots and acknowledge common story tropes on the literary side; however, there is no formula or set way to determine these plots in a storyline automatically. This paper explores multiple natural language processing techniques in an attempt to automatically compare and cluster …


Text Mining Methods For Analyzing Online Health Information And Communication, Sifei Han Jan 2020

Theses and Dissertations--Computer Science

The Internet provides an alternative way to share health information. Specifically, social network systems such as Twitter, Facebook, Reddit, and disease-specific online support forums are increasingly being used to share information on health-related topics. This could take the form of disclosing personal health information to seek suggestions, or of answering other patients' questions based on one's own history. This social media uptake gives a new angle for improving the current health communication landscape with consumer-generated content from social platforms. With these online modes of communication, health providers can offer more immediate support to the people seeking advice. Non-profit …


Pseudo-Data Generation For Improving Clinical Named Entity Recognition, Jeffrey T. Smith Jan 2020

Theses and Dissertations

One of the primary challenges for clinical Named Entity Recognition (NER) is the availability of annotated training data. Technical and legal hurdles prevent the creation and release of corpora related to electronic health records (EHRs). In this work, we look at the impact of pseudo-data generation on clinical NER using gazetteering and thresholding with a neural network model. We report that gazetteers can result in the inclusion of proper terms with the exclusion of determiners and pronouns in preceding and middle positions. Gazetteers that had higher numbers of terms inclusive to the original dataset had a higher impact. We also …