Open Access. Powered by Scholars. Published by Universities.®

Social and Behavioral Sciences Commons

Articles 1 - 14 of 14

Full-Text Articles in Social and Behavioral Sciences

A Psychometric Analysis Of Natural Language Inference Using Transformer Language Models, Antonio Laverghetta Jr. Oct 2023

USF Tampa Graduate Theses and Dissertations

Large language models (LLMs) are poised to transform both academia and industry. But the excitement around these generative AIs has also been met with concern for the true extent of their capabilities. This dissertation helps to address these questions by examining the capabilities of LLMs using the tools of psychometrics. We focus on analyzing the capabilities of LLMs on the task of natural language inference (NLI), a foundational benchmark often used to evaluate new models. We demonstrate that LLMs can reliably predict the psychometric properties of NLI items were those items administered to humans. Through a series of experiments, we …


Towards More Task-Generalized And Explainable AI Through Psychometrics, Alec Braynen Nov 2022

USF Tampa Graduate Theses and Dissertations

In this work, we propose that adopting the methods, principles, and guidelines of the field of psychometrics can help the Artificial Intelligence (AI) community to build more task-generalizable and explainable AI. Three arguments are presented and explored. These arguments are that psychometrics can help by providing 1) a framework for formulating better datasets, 2) psychometric AI data that can lead to models of generalization in AI, and 3) explainable AI through more informative evaluations.

A review of psychometrics and psychological generalization is performed, along with an overview of evaluation, generalization, and explainability in AI. Various ideas are presented throughout for …


Prevalence And Predictors Of Careless Responding In Experience Sampling Research, Alexander J. Denison Jun 2022

USF Tampa Graduate Theses and Dissertations

In the current study we examine the prevalence and several predictors of careless responding in an experience sampling (ESM) study. While careless responding has been noted as a potential problem in ESM research, few studies have examined the prevalence of this behavior (Beal, 2015; Berkel et al., 2017; Eisele et al., 2020; Gabriel et al., 2019; Jaso et al., 2021). Using statistical methods of careless response classification, we derive cut scores from data simulation and graphical examination of item correlations, and flag 44.98% of response episodes as careless. A majority of these flagged episodes were the product of overly consistent …


Creating A Short, Public-Domain Version Of The CPAI-2: Using An Algorithmic Approach To Develop Public-Domain Measures Of Indigenous Personality Traits, Mukhunth Raghavan Mar 2022

USF Tampa Graduate Theses and Dissertations

In this study we aimed to create a short, public-domain analogue of the Cross-Cultural (Chinese) Personality Assessment Inventory (CPAI-2; F. M. Cheung et al., 1996). Emic (culture-specific) traits measured by the CPAI-2 are purportedly specific to the Chinese culture and argued to not be fully captured by the consensus Big Five personality trait taxonomy. Research suggests that CPAI-2 traits may have unique predictive power, especially in non-Western contexts. However, research has been hampered by several limitations of the measure. The inventory is proprietary and long, with 341 items forming 28 scales and four factors. Cross-cultural personality research would benefit from …


Taking Multiple Regression Analysis To Task: A Review Of Mindware: Tools For Smart Thinking, By Richard Nisbett (2015), Jason Makansi Jul 2019

Numeracy

Richard Nisbett. 2015. Mindware: Tools for Smart Thinking. (New York, NY: Farrar, Straus and Giroux). 336 pp. ISBN: 9780374536244

Nisbett, a psychologist, may not achieve his stated goal of teaching readers to “effortlessly” extend their common sense when it comes to quantitative analysis applied to everyday issues, but his critique of multiple regression analysis (MRA) in the middle chapters of Mindware is worth attention from, and contemplation by, the QL/QR and Numeracy community. While in at least one other source, Nisbett’s critique has been called a “crusade” against MRA, what he really advocates is that it not be used as …


Measuring Numeracy In A Community College Context: Assessing The Reliability Of The Subjective Numeracy Scale, Kate S. Wolfe, Sarah L. Hoiland Jul 2017

Numeracy

In this paper, our goals were to assess the suitability of the Subjective Numeracy Scale (SNS), developed for health-care use, in a new context with predominantly minority students at a South Bronx community college and to identify any race/ethnicity, gender, and ESL enrollment effects. The scale assesses perceptions of quantitative reasoning skills and preferences for data presentation. This scale was given to a convenience sample of students in behavioral sciences classes. Results show that the SNS was reliable with our sample using the full thirteen-question scale or the shorter eight-item version. Gender, race/ethnicity, and English as a …


Effect Of Price Reduction And Increased Service Frequency On Public Transport Travel, Inge Brechan Mar 2017

Journal of Public Transportation

A random-effects meta-analysis of the results from 15 projects involving price reduction and 9 projects involving increased service frequency showed that both price reduction and increased service frequency generated additional public transport trips. On average, the increased service frequency projects generated more public transport trips than the price reduction projects. In the increased service frequency projects, the proportion of trips generated by the increased frequency was strongly influenced by the size of the frequency increase. In the price reduction projects, we did not find a significant effect of the size of the price reduction on the proportion of trips …


The Psychometric Evaluation And Validation Of A Measure Assessing Pharmacological And Social Alcohol Expectancies In Adolescents, Megan Victoria McMurray Jun 2016

USF Tampa Graduate Theses and Dissertations

Extending prior alcohol expectancy measurement research, this researcher (McMurray, 2013) recently developed the Pharmacological and Social Alcohol Expectancy Scale (PSAES). The PSAES is the only alcohol expectancy measure to date that provides adequate coverage of both social expectancies and the anticipated positive pharmacological effects resulting from alcohol consumption, and was developed and validated in a sample of young adults (aged 18-23). Research has shown that adolescents at high risk for alcohol use disorder (AUD) hold higher expectations of reward from alcohol, suggesting that expectancy patterns may help distinguish at-risk youth. Building upon the previous PSAES validation study, the primary purpose …


Investigating Parameter Recovery And Item Information For Triplet Multidimensional Forced Choice Measure: An Application Of The GGUM-RANK Model, Philseok Lee Jun 2016

USF Tampa Graduate Theses and Dissertations

To control various response biases and rater errors in noncognitive assessment, multidimensional forced choice (MFC) measures have been proposed as an alternative to single-statement Likert-type scales. Historically, MFC measures have been criticized because conventional scoring methods can lead to ipsativity problems that render scores unsuitable for inter-individual comparisons. However, with the recent advent of classical test theory and item response theory scoring methods that yield normative information, MFC measures are surging in popularity and becoming important components of personnel and educational assessment systems. This dissertation presents developments concerning a GGUM-based MFC model henceforth referred to as the GGUM-RANK. Markov Chain …


An Analysis Of Factor Extraction Strategies: A Comparison Of The Relative Strengths Of Principal Axis, Ordinary Least Squares, And Maximum Likelihood In Research Contexts That Include Both Categorical And Continuous Variables, Kevin Barry Coughlin Jan 2013

USF Tampa Graduate Theses and Dissertations

This study is intended to provide researchers with empirically derived guidelines for conducting factor analytic studies in research contexts that include dichotomous and continuous levels of measurement. This study is based on the hypotheses that ordinary least squares (OLS) factor analysis will yield more accurate parameter estimates than maximum likelihood (ML) and principal axis factor analysis (PAF); the level of improvement in estimates will be related to the proportion of observed variables that are dichotomized and the strength of communalities within the data sets.

To achieve this study's objective, maximum likelihood, ordinary least squares, and principal axis factor extraction models …


Pharmacological Versus Social Alcohol Expectancies: Making An Important Distinction Between The Anticipated Rewarding Effects Of Alcohol, Megan Victoria McMurray Jan 2013

USF Tampa Graduate Theses and Dissertations

Despite over 30 years of research investigating alcohol expectancies, they have never been examined in terms of the anticipated pharmacological versus social rewards resulting from alcohol consumption, and both appear to play a central role in drinking motivation and behavior. The purpose of this study was to develop a two-dimensional instrument designed to assess both the pharmacological alcohol expectancies of pleasurable, internal states that result from alcohol consumption, as well as the social expectancies that drinking alcohol will result in higher social status and increased effectiveness in social situations. This measure, called the Pharmacological and Social Alcohol Expectancy Scale (PSAES), …


Assessing The Psychometric Properties Of A Self-Efficacy Measure Within A Patient Navigation Research Program, Mariana Arevalo Jun 2012

USF Tampa Graduate Theses and Dissertations

There is a dearth of validated self-efficacy (SE) measures in the field of preventive oncology. The objective of this study is to describe the development and validation of a measure to assess patients' perceived ability to obtain the recommended care following an abnormality suspicious for breast cancer. Guided by a social cognitive theory framework, a 51-item measure was developed to explore perceived capability to obtain follow-up care under a number of barriers. A multi-step process was utilized to assess the instrument's psychometric properties. First, cognitive validity assessments with experts were conducted, and these aided in the wording refinement of …


Drug Courts Work, But How? Preliminary Development Of A Measure To Assess Drug Court Structure And Processes, Blake Barrett Jan 2011

USF Tampa Graduate Theses and Dissertations

The high prevalence of substance use disorders is well-documented among criminal offenders. Drug courts are specialty judicial programs designed to: 1) improve public safety outcomes; 2) reduce criminal recidivism and substance abuse among offenders with substance use disorders; and 3) better utilize scarce criminal justice and treatment resources. Drug courts operate through partnerships between the criminal justice, behavioral health and public health systems. Offenders participate in an intensive regimen of substance abuse treatment and case management while under close judicial supervision. Drug courts' effectiveness in reducing criminal recidivism and drug use has been documented through numerous primary studies as well …


When Does Fidelity Matter? An Evaluation Of Two Medical Simulation Methods, Nneka Joseph Jan 2011

USF Tampa Graduate Theses and Dissertations

Job or task simulations are used in training when the use of the real task is dangerous or expensive, such as flying aircraft or surgery. This study focused on comparing two types of simulations used in assessments during a Clinical Performance Examination of third-year medical students: computer enhanced mannequins and standardized patients. Each type of simulation has advantages, but little empirical work exists to guide the use of different types of simulation for training and evaluating different aspects of performance. This study analyzed performance scores for different competencies as well as the reliability and validity of the different simulation types. …