Open Access. Powered by Scholars. Published by Universities.®
Social and Behavioral Sciences Commons™
- Institution
- Syracuse University (10)
- Selected Works (5)
- Western University (4)
- City University of New York (CUNY) (1)
- Gonzaga University (1)
- Loyola University Chicago (1)
- Old Dominion University (1)
- SelectedWorks (1)
- United Arab Emirates University (1)
- University at Albany, State University of New York (1)
- University of Nebraska - Lincoln (1)
- University of South Carolina (1)
- University of Vermont (1)
- University of Wisconsin Milwaukee (1)
- Publication
- School of Information Studies - Faculty Scholarship (9)
- Victoria Rubin (5)
- FIMS Publications (4)
- Computer Science Faculty Publications (1)
- Foley Library Scholarship (1)
- Graduate College Dissertations and Theses (1)
- Legacy Theses & Dissertations (2009 - 2024) (1)
- Library Philosophy and Practice (e-journal) (1)
- Mary G Kosta (1)
- Publications (1)
- Publications and Research (1)
- School of Business: Faculty Publications and Other Works (1)
- School of Information Studies - Dissertations (1)
- The Journal of Electronic Theses and Dissertations (1)
- Theses and Dissertations (1)
Articles 1 - 30 of 30
Full-Text Articles in Social and Behavioral Sciences
Can ChatGPT Accurately Answer A PICOT Question? Assessing AI Responses To A Clinical Question, Candise Branum, Martin Schiavenato
Foley Library Scholarship
Background:
ChatGPT, an artificial intelligence (AI) text generator trained to predict correct words, can provide answers to questions but has shown mixed results in answering medical questions.
Purpose:
To assess the reliability and accuracy of ChatGPT in providing answers to a complex clinical question.
Methods:
A Population, Intervention, Comparison, Outcome, and Time (PICOT) formatted question was queried, along with a request for references. Full-text articles were reviewed to verify the accuracy of the evidence summary provided by the chatbot.
Results:
ChatGPT was unable to provide a certifiable response to a PICOT question. The references cited as evidence included incorrect journal …
Creating Data From Unstructured Text With Context Rule Assisted Machine Learning (CRAML), Stephen Meisenbacher, Peter Norlander
School of Business: Faculty Publications and Other Works
Popular approaches to building data from unstructured text come with limitations, such as scalability, interpretability, replicability, and real-world applicability. These can be overcome with Context Rule Assisted Machine Learning (CRAML), a method and no-code suite of software tools that builds structured, labeled datasets which are accurate and reproducible. CRAML enables domain experts to access uncommon constructs within a document corpus in a low-resource, transparent, and flexible manner. CRAML produces document-level datasets for quantitative research and makes qualitative classification schemes scalable over large volumes of text. We demonstrate that the method is useful for bibliographic analysis, transparent analysis of proprietary data, …
Implementing A Chatbot On A Library Website, Michelle Ehrenpreis, John P. Delooper
Publications and Research
A library’s website is a virtual point of contact for interacting with its patrons. Ensuring a library’s website has easily findable content is critical for providing access to library resources and highlighting services and events. One tool for assisting with content findability is a chatbot, a form of artificial intelligence software. In this case study, Lehman College's Leonard Lief Library implemented Ivy, a proprietary educational software chatbot on its website, the first of its kind for an academic library. This chatbot functioned as a new tool that assisted users seeking information and provided insight to librarians about the kinds of …
Defining And Detecting Toxicity On Social Media: Context And Knowledge Are Key, Amit Sheth, Valerie Shalin, Ugur Kursuncu
Publications
As the role of online platforms has become increasingly prominent for communication, toxic behaviors, such as cyberbullying and harassment, have been rampant in the last decade. At the same time, online toxicity is multi-dimensional and sensitive in nature, which makes its detection challenging. As exposure to online toxicity can have serious implications for individuals and communities, reliable models and algorithms are required for detecting and understanding such communications. In this paper, we define toxicity to provide a foundation, drawing on social theories. Then, we provide an approach that identifies multiple dimensions of toxicity and incorporates explicit knowledge …
Bibliometric Analysis Of Named Entity Recognition For Chemoinformatics And Biomedical Information Extraction Of Ovarian Cancer, Vijayshri Khedkar, Charlotte Fernandes, Devshi Desai, Mansi R, Gurunath Chavan Dr, Sonali Tidke Dr., M. Karthikeyan Dr.
Library Philosophy and Practice (e-journal)
With the massive amount of data that has been generated in the form of unstructured text documents, Biomedical Named Entity Recognition (BioNER) is becoming increasingly important in the field of biomedical research. Since there is currently no automatic archiving of the obtained results, much of this information remains hidden in the textual details and is not easily accessible for further analysis. Hence, text mining methods and natural language processing techniques are used to extract information from such publications. Named entity recognition is a subtask of information extraction that focuses on finding and categorizing specific …
Building And Using Digital Libraries For ETDs, Edward A. Fox
The Journal of Electronic Theses and Dissertations
Despite the high value of electronic theses and dissertations (ETDs), the global collection has seen limited use. To extend such use, a new approach to building digital libraries (DLs) is needed. Fortunately, recent decades have seen that a vast amount of “gray literature” has become available through a diverse set of institutional repositories as well as regional and national libraries and archives. Most of the works in those collections include ETDs and are often freely available in keeping with the open-access movement, but such access is limited by the services of supporting information systems. As explained through a set of …
Quantifying Language Changes Surrounding Mental Health On Twitter, Anne Marie Stupinski
Graduate College Dissertations and Theses
Mental health challenges are thought to afflict around 10% of the global population each year, with many going untreated due to stigma and limited access to services. Here, we explore trends in words and phrases related to mental health through a collection of 1-, 2-, and 3-grams parsed from a data stream of roughly 10% of all English tweets since 2012. We examine temporal dynamics of mental health language, finding that the popularity of the phrase ‘mental health’ increased by nearly two orders of magnitude between 2012 and 2018. We observe that mentions of ‘mental health’ spike annually and …
Examining The Notion Of The Boundary Object In Information Systems: The Transdisciplinary Oeuvre Of Cognitive Science, Laura Elien Ridenour
Theses and Dissertations
This study examined the transdisciplinary area of cognitive science, and was framed around the sociological notion of the boundary object. Harmonizing theoretical and technical approaches, methods introduced in this work moved beyond qualitative study practices traditional to boundary object theory work to a mixed-methods data-driven approach. Bibliometric Web of Science data, enriched with National Science Foundation (NSF) journal classifications, formed the foundation from which a seed-and-expand dataset was created from journals containing the string cogni* and their cited articles for the years 2006-2016. This two-tiered dataset allowed for the analysis of boundary-spanning interdisciplinary concepts, as identified by noun phrases, and …
Automatic Slide Generation For Scientific Papers, Athar Sefid, Jian Wu, Prasenjit Mitra, C. Lee Giles
Computer Science Faculty Publications
We describe our approach for automatically generating presentation slides for scientific papers using deep neural networks. Such slides can give authors a starting point for their slide generation process. Extractive summarization techniques are applied to rank and select important sentences from the original document. Previous work identified important sentences based only on a limited number of features that were extracted from the position and structure of sentences in the paper. Our method extends previous work by (1) extracting a more comprehensive list of surface features, (2) considering the semantics or meaning of the sentence, and (3) using context around the …
Pragmatic And Cultural Considerations For Deception Detection In Asian Languages, Victoria L. Rubin
Victoria Rubin
In hopes of sparking a discussion, I argue for much needed research on automated deception detection in Asian languages. The task of discerning truthful texts from deceptive ones is challenging, but a logical sequel to opinion mining. I suggest that applied computational linguists pursue broader interdisciplinary research on cultural differences and pragmatic use of language in Asian cultures, before turning to detection methods based on a primarily Western (English-centric) worldview. Deception is fundamentally human, but how do various cultures interpret and judge deceptive behavior?
Comparative Stylistic Fanfiction Analysis: Popular And Unpopular Fics Across Eleven Fandoms, Victoria L. Rubin, Vanessa Girouard
Victoria Rubin
Abstract: This study analyses 545 sample fanfiction stories (fics) in their stylistic feature variation by popularity and across eleven ‘fandoms’ in creative writing forums. Lexical richness and average sentence and paragraph lengths are isolated as promising measures for a text classifier to use in predicting a fic’s likely popularity in its fandom. Résumé (translated from French): This study analyses a sample of 545 fanfiction chapters (fics) according to their stylistic variation and their popularity across eleven different ‘fandoms’. Lexical richness, average sentence length, and average paragraph length were chosen as stylistic features suited to differentiating popular fics from the …
Discerning Truth From Deception: Human Judgments And Automation Efforts, Victoria L. Rubin, Niall Conroy
Victoria Rubin
Recent improvements in the effectiveness and accuracy of the emerging field of automated deception detection, and the associated potential of language technologies, have triggered increased interest in the mass media and the general public. Computational tools capable of alerting users to potentially deceptive content in computer–mediated messages are invaluable for supporting undisrupted computer–mediated communication and information practices, credibility assessment, and decision–making. The goal of this ongoing research is to inform the creation of such automated capabilities. In this study we elicit a sample of 90 computer–mediated personal stories with varying levels of deception. Each story has 10 associated human deception level judgments, confidence scores, …
Bioont: Improving Knowledge Organization And Representation In The Domain Of Biometric Authentication, Stephen Bryan Buerle
Legacy Theses & Dissertations (2009 - 2024)
This dissertation explores some of the fundamental challenges facing the information assurance community as it relates to knowledge categorization, organization, and representation within the field of information security, and more specifically within the domain of biometric authentication. A primary objective of this research is the development of a biometric authentication corpus and an empirically derived ontological prototype, which aids and promotes further research into the fundamental ontological structure of the field of biometric authentication. In doing so, this research explores the use of automated and semi-supervised ontological engineering, corpus analysis, and natural language processing techniques in the development of this …
Discerning Emotions In Texts, Victoria Rubin, Jeffrey Stanton, Elizabeth Liddy
Victoria Rubin
We present an empirically verified model of discernable emotions, Watson and Tellegen’s Circumplex Theory of Affect from social and personality psychology, and suggest its usefulness in NLP as a potential model for an automation of an eight-fold categorization of emotions in written English texts. We developed a data collection tool based on the model, collected 287 responses from 110 non-expert informants based on 50 emotional excerpts (min=12, max=348, average=86 words), and analyzed the inter-coder agreement per category and per strength of ratings per sub-category. The respondents achieved an average 70.7% agreement in the most commonly identified emotion categories per text. …
Certainty Identification In Texts: Categorization Model And Manual Tagging Results, Elizabeth Liddy, Victoria Rubin, Noriko Kando
Victoria Rubin
This chapter presents a theoretical framework and preliminary results for manual categorization of explicit certainty information in 32 English newspaper articles. Our contribution is in a proposed categorization model and analytical framework for certainty identification. Certainty is presented as a type of subjective information available in texts. Statements with explicit certainty markers were identified and categorized according to four hypothesized dimensions – level, perspective, focus, and time of certainty. The preliminary results reveal an overall promising picture of the presence of certainty information in texts, and establish its susceptibility to manual identification within the proposed four-dimensional certainty categorization analytical framework. …
Pragmatic And Cultural Considerations For Deception Detection In Asian Languages, Victoria L. Rubin
FIMS Publications
In hopes of sparking a discussion, I argue for much needed research on automated deception detection in Asian languages. The task of discerning truthful texts from deceptive ones is challenging, but a logical sequel to opinion mining. I suggest that applied computational linguists pursue broader interdisciplinary research on cultural differences and pragmatic use of language in Asian cultures, before turning to detection methods based on a primarily Western (English-centric) worldview. Deception is fundamentally human, but how do various cultures interpret and judge deceptive behavior?
Veracity Roadmap: Is Big Data Objective, Truthful And Credible?, Victoria Rubin, Tatiana Lukoianova
FIMS Publications
This paper argues that big data can possess different characteristics, which affect its quality. Depending on its origin, data processing technologies, and the methodologies used for data collection and scientific discoveries, big data can have biases, ambiguities, and inaccuracies which need to be identified and accounted for to reduce inference errors and improve the accuracy of generated insights. Big data veracity is now being recognized as a necessary property for its utilization, complementing the three previously established quality dimensions (volume, variety, and velocity), but there has been little discussion of the concept of veracity thus far. This paper provides a roadmap …
Using Ontology-Based Approaches To Representing Speech Transcripts For Automated Speech Scoring, Miao Chen
School of Information Studies - Dissertations
Text representation is a process of transforming text into some format that computer systems can use for subsequent information-related tasks such as text classification. Representing text faces two main challenges: meaningfulness of representation and unknown terms. Research has shown evidence that these challenges can be resolved by using the rich semantics in ontologies. This study aims to address these challenges by using ontology-based representation and unknown term reasoning approaches in the context of content scoring of speech, which is a less explored area compared to some common ones such as categorizing text corpora (e.g., 20 Newsgroups and Reuters).
From the …
Comparative Stylistic Fanfiction Analysis: Popular And Unpopular Fics Across Eleven Fandoms, Victoria L. Rubin, Vanessa Girouard
FIMS Publications
Abstract: This study analyses 545 sample fanfiction stories (fics) in their stylistic feature variation by popularity and across eleven ‘fandoms’ in creative writing forums. Lexical richness and average sentence and paragraph lengths are isolated as promising measures for a text classifier to use in predicting a fic’s likely popularity in its fandom. Résumé (translated from French): This study analyses a sample of 545 fanfiction chapters (fics) according to their stylistic variation and their popularity across eleven different ‘fandoms’. Lexical richness, average sentence length, and average paragraph length were chosen as stylistic features suited to differentiating popular fics from the …
Discerning Truth From Deception: Human Judgments And Automation Efforts, Victoria L. Rubin, Niall Conroy
FIMS Publications
Recent improvements in the effectiveness and accuracy of the emerging field of automated deception detection, and the associated potential of language technologies, have triggered increased interest in the mass media and the general public. Computational tools capable of alerting users to potentially deceptive content in computer–mediated messages are invaluable for supporting undisrupted computer–mediated communication and information practices, credibility assessment, and decision–making. The goal of this ongoing research is to inform the creation of such automated capabilities. In this study we elicit a sample of 90 computer–mediated personal stories with varying levels of deception. Each story has 10 associated human deception level judgments, confidence scores, …
Computer Assisted Language Learning, Mary G. Kosta
Mary G Kosta
This paper will review natural language processing (NLP) and artificial intelligence in CALL. We will begin by surveying how these technologies can be used to teach the different language skills: speaking, listening, lexis, syntax, semantics and literacy. Examples of various NLP and ICALL systems will be provided, and ideal systems will be discussed for evaluation purposes. We will look at how NLP and ICALL systems can be incorporated in different instructional methods, and address issues with respect to conflicting teaching approaches. Next, we will discuss the effectiveness of CALL in language teaching and learning, and developmental problems facing NLP and …
A Longitudinal Study Of Language And Ideology In Congress, Bei Yu, Daniel Diermeier
School of Information Studies - Faculty Scholarship
This paper presents an analysis of the legislative speech records from the 101st-108th U.S. Congresses using machine learning and natural language processing methods. We use word vectors to represent the speeches in both the Senate and the House, and then use text categorization methods to classify the speakers by their ideological positions. The classification accuracy indicates the level of distinction between the liberal and the conservative ideologies. Our experimental results demonstrate an increasing partisanship in the Congress between 1989 and 2006. Ideology classifiers trained on the House speeches can predict the Senators' ideological positions well (House-to-Senate prediction); however, the Senate-to-House …
Certainty Identification In Texts: Categorization Model And Manual Tagging Results, Elizabeth D. Liddy, Victoria L. Rubin, Noriko Kando
School of Information Studies - Faculty Scholarship
This chapter presents a theoretical framework and preliminary results for manual categorization of explicit certainty information in 32 English newspaper articles. Our contribution is in a proposed categorization model and analytical framework for certainty identification. Certainty is presented as a type of subjective information available in texts. Statements with explicit certainty markers were identified and categorized according to four hypothesized dimensions – level, perspective, focus, and time of certainty.
The preliminary results reveal an overall promising picture of the presence of certainty information in texts, and establish its susceptibility to manual identification within the proposed four-dimensional certainty categorization analytical framework. …
Hands-On NLP For An Interdisciplinary Audience, Elizabeth D. Liddy, Nancy McCracken
School of Information Studies - Faculty Scholarship
The need for a single NLP offering for a diverse mix of graduate students (including computer scientists, information scientists, and linguists) has motivated us to develop a course that provides students with a breadth of understanding of the scope of real world applications, as well as depth of knowledge of the computational techniques on which to build in later experiences. We describe the three hands-on tasks for the course that have proven successful, namely: 1) in-class group simulations of computational processes; 2) team posters and public presentations on state-of-the-art commercial NLP applications, and; 3) team projects implementing various levels of …
Discerning Emotions In Texts, Victoria L. Rubin, Jeffrey M. Stanton, Elizabeth D. Liddy
School of Information Studies - Faculty Scholarship
We present an empirically verified model of discernable emotions, Watson and Tellegen’s Circumplex Theory of Affect from social and personality psychology, and suggest its usefulness in NLP as a potential model for an automation of an eight-fold categorization of emotions in written English texts. We developed a data collection tool based on the model, collected 287 responses from 110 non-expert informants based on 50 emotional excerpts (min=12, max=348, average=86 words), and analyzed the inter-coder agreement per category and per strength of ratings per sub-category. The respondents achieved an average 70.7% agreement in the most commonly identified emotion categories per text. …
What Do You Mean? Finding Answers To Complex Questions, Anne R. Diekema, Ozgur Yilmazel, Jiangping Chen, Sarah Harwell, Elizabeth D. Liddy, Lan He
School of Information Studies - Faculty Scholarship
This paper illustrates ongoing research and issues faced when dealing with real-time questions in the domain of Reusable Launch Vehicles (aerospace engineering). The question- answering system described in this paper is used in a collaborative learning environment with real users and live questions. The paper describes an analysis of these more complex questions as well as research to include the user in the question-answering process by implementing a question negotiation module based on the traditional reference interview.
A Breadth Of NLP Applications, Elizabeth D. Liddy
School of Information Studies - Faculty Scholarship
The Center for Natural Language Processing (CNLP) was founded in September 1999 in the School of Information Studies, the “Original Information School”, at Syracuse University. CNLP’s mission is to advance the development of human-like, language understanding software capabilities for government, commercial, and consumer applications. The Center conducts both basic and applied research, building on its recognized capabilities in Natural Language Processing. The Center’s seventeen employees are a mix of doctoral students in information science or computer engineering, software engineers, linguistic analysts, and research engineers.
Natural Language Processing, Elizabeth D. Liddy
School of Information Studies - Faculty Scholarship
Natural Language Processing (NLP) is the computerized approach to analyzing text that is based on both a set of theories and a set of technologies. As a very active area of research and development, NLP has no single agreed-upon definition that would satisfy everyone, but there are some aspects that would be part of any knowledgeable person’s definition.
An NLP Approach For Improving Access To Statistical Information For The Masses, Elizabeth D. Liddy, Jennifer H. Liddy
School of Information Studies - Faculty Scholarship
Naïve users need to access statistical information, but frequently do not have the sophisticated levels of understanding required in order to translate their information needs into the structure and vocabulary of sites which currently provide access to statistical information. However, these users can articulate quite straightforwardly in their own terms what they are looking for. One approach to satisfying the masses of citizens with needs for statistical information is to automatically map their natural language expressions of their information needs into the metadata structure and terminology that defines and describes the content of statistical tables. To accomplish this goal, we …
Searching And Search Engines: When Is Current Research Going To Lead To Major Progress?, Elizabeth D. Liddy
School of Information Studies - Faculty Scholarship
For many years, users of commercial search engines have been hearing how the latest in information and computer science research is going to improve the quality of the engines they rely on for fulfilling their daily information needs. However, despite what is heard, these promises have not been fulfilled. While the Internet has dramatically increased the amount of information to which users now have access, the key issue appears to be unresolved – the results for substantive queries are not improving. However, the past need not predict the future because sophisticated advances in Natural Language Processing (NLP) have, in fact, …