Open Access. Powered by Scholars. Published by Universities.®

Education Commons

Theses/Dissertations

University of Iowa

Equating

Articles 1 - 13 of 13

Full-Text Articles in Education

Simple Structure MIRT Equating For Multidimensional Tests, Stella Yun Kim May 2018

Theses and Dissertations

Equating is a statistical process used to accomplish score comparability so that scores from different test forms can be used interchangeably. One of the most widely used equating procedures is unidimensional item response theory (UIRT) equating, which requires a set of assumptions about the data structure. In particular, the essence of UIRT rests on the unidimensionality assumption, which requires that a test measure only a single ability. However, this assumption is not likely to be fulfilled for many real data sets, such as those from mixed-format tests or tests composed of several content subdomains: failure to satisfy the assumption threatens the ...


IRT Linking Methods For The Bifactor Model: A Special Case Of The Two-Tier Item Factor Analysis Model, Kyung Yong Kim Aug 2017

Theses and Dissertations

For unidimensional item response theory (UIRT) models, three linking methods (separate, concurrent, and fixed parameter calibration) have been developed and widely used in applications such as vertical scaling, differential item functioning, computerized adaptive testing (CAT), and equating. By contrast, even though a few studies have compared the separate and concurrent calibration methods for full multidimensional IRT (MIRT) models or applied the concurrent calibration method to vertical scaling using the bifactor model, no study has yet provided technical descriptions of the concurrent and fixed parameter calibration methods for any MIRT models. Thus, the purpose of this dissertation ...


Subscore Equating With The Random Groups Design, Euijin Lim May 2016

Theses and Dissertations

There is an increasing demand for subscore reporting in the testing industry. Many testing programs already include subscores in their score reports or are considering plans to report them. However, relatively few studies have been conducted on subscore equating. The purpose of this dissertation is to address the necessity for subscore equating and to evaluate the performance of various equating methods for subscores.

Assuming the random groups design and number-correct scoring, this dissertation analyzed two sets of real data and simulated data with four study factors including test dimensionality, subtest length, form difference in difficulty, and sample size ...


A Comparison Of Smoothing Methods For The Common Item Nonequivalent Groups Design, Han Yi Kim Jul 2014

Theses and Dissertations

The purpose of this study was to compare the relative performance of various smoothing methods under the common item nonequivalent groups (CINEG) design. In light of the previous literature on smoothing under the CINEG design, this study aimed to provide general guidelines and practical insights on the selection of smoothing procedures under specific testing conditions.

To investigate the smoothing procedures, 100 replications were simulated under various testing conditions by using an item response theory (IRT) framework. A total of 192 conditions (3 sample size × 4 group ability difference × 2 common-item proportion × 2 form difficulty difference × 1 test length × 2 common-item ...


Multidimensional Item Response Theory Observed Score Equating Methods For Mixed-Format Tests, Jaime Leigh Peterson Jul 2014

Theses and Dissertations

The purpose of this study was to build upon the existing MIRT equating literature by introducing a full multidimensional item response theory (MIRT) observed score equating method for mixed-format exams because no such methods currently exist. At this time, the MIRT equating literature is limited to full MIRT observed score equating methods for multiple-choice only exams and Bifactor observed score equating methods for mixed-format exams. Given the high frequency with which mixed-format exams are used and the accumulating evidence that some tests are not purely unidimensional, it was important to present a full MIRT equating method for mixed-format tests.

The ...


Equating Multidimensional Tests Under A Random Groups Design: A Comparison Of Various Equating Procedures, Eunjung Lee Dec 2013

Theses and Dissertations

The purpose of this research was to compare the performance of various equating procedures for multidimensional tests. To examine the equating procedures, simulated data sets were generated based on a multidimensional item response theory (MIRT) framework. The procedures examined included both unidimensional and multidimensional equating procedures based on an IRT framework, in addition to traditional equating procedures. Specifically, the performance of the following six equating procedures under the random groups design was compared: (1) unidimensional IRT observed score equating, (2) unidimensional IRT true score equating, (3) full MIRT observed score equating ...
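
As an illustration of procedure (2), the following is a minimal sketch of unidimensional IRT true score equating. It assumes a three-parameter logistic model with scaling constant 1.7, item parameters already placed on a common scale, and a simple bisection search; the function names, parameter layout, and search bounds are hypothetical and are not taken from the dissertation.

```python
import numpy as np

D = 1.7  # assumed logistic scaling constant


def p3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model."""
    return c + (1.0 - c) / (1.0 + np.exp(-D * a * (theta - b)))


def true_score(theta, items):
    """Test characteristic curve: sum of item probabilities at theta."""
    a, b, c = (np.asarray(v, dtype=float) for v in items)
    return float(np.sum(p3pl(theta, a, b, c)))


def true_score_equate(x, items_x, items_y, lo=-8.0, hi=8.0, tol=1e-8):
    """Find theta whose Form X true score equals x, then return the
    Form Y true score at that theta. Defined only for scores strictly
    between the sum of the guessing parameters and the number of items."""
    a_x, _, c_x = items_x
    if not (float(np.sum(c_x)) < x < len(a_x)):
        raise ValueError("x is outside the range of Form X true scores")
    # The true score is increasing in theta, so bisection is sufficient.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if true_score(mid, items_x) < x:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return true_score(0.5 * (lo + hi), items_y)
```

With identical parameters on both forms the function returns the input score, and a Form Y whose items are uniformly more difficult yields a lower equated score.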


A Comparison Of Van Der Linden's Conditional Equipercentile Equating Method With Other Equating Methods Under The Random Groups Design, Seonho Shin Jul 2011

Theses and Dissertations

To ensure test security and fairness, alternative forms of the same test are administered in practice. However, alternative forms generally do not have the same difficulty level, even though they are designed to be as parallel as possible. Equating adjusts for differences in difficulty among forms of the test. Six traditional equating methods are considered in this study: equipercentile equating without smoothing, equipercentile equating with pre-smoothing and post-smoothing, IRT true-score and observed-score equating, and kernel equating. A common feature of all of the traditional procedures is that the end result of equating is ...
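
The first method in that list, equipercentile equating without smoothing, can be sketched as follows. This is an illustrative sketch only: it assumes integer number-correct scores starting at 0, the usual percentile-rank convention (proportion below a score plus half the proportion at it), and uniform continuization within each score point. The function name and the frequency-table input are hypothetical, not the study's implementation.

```python
import numpy as np


def equipercentile_equate(freq_x, freq_y):
    """Return the Form Y equivalent of each integer Form X score 0..K.

    freq_x and freq_y are frequency counts indexed by score.
    """
    fx = np.asarray(freq_x, dtype=float) / np.sum(freq_x)
    fy = np.asarray(freq_y, dtype=float) / np.sum(freq_y)
    cum_x = np.cumsum(fx)
    cum_y = np.cumsum(fy)

    # Percentile rank (as a proportion) at each Form X score.
    p = (cum_x - fx) + 0.5 * fx

    # Smallest Form Y score whose cumulative proportion exceeds p.
    idx = np.searchsorted(cum_y, p, side="right")
    idx = np.minimum(idx, len(fy) - 1)  # guard for p at or above the top

    # Interpolate linearly within the interval [idx - 0.5, idx + 0.5].
    below_y = cum_y[idx] - fy[idx]
    dens = fy[idx]
    frac = np.divide(p - below_y, dens,
                     out=np.full_like(p, 0.5), where=dens > 0)
    return idx - 0.5 + frac
```

When both forms have the same score distribution the mapping is the identity; when Form Y's distribution sits one point higher, each Form X score maps to a score one point higher.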


Assessing First- And Second-Order Equity For The Common-Item Nonequivalent Groups Design Using Multidimensional IRT, Benjamin James Andrews Jul 2011

Theses and Dissertations

The equity properties can be used to assess the quality of an equating. First-order equity is the degree to which expected scores conditional on ability are similar between test forms after equating. Second-order equity is the degree to which conditional standard errors of measurement are similar between test forms after equating. The purpose of this dissertation was to investigate the use of a multidimensional IRT framework for assessing first- and second-order equity of mixed-format tests.
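
Stated as formulas, the two properties hold exactly when the following equalities are satisfied at every ability level. The notation is an assumption introduced here for illustration: X is the new-form score, Y the old-form score, eq_Y(X) the Form X score equated to the Form Y scale, and θ the ability (a vector in the multidimensional case).

```latex
\begin{align}
  % First-order equity: equal conditional expected scores
  E\left[\operatorname{eq}_Y(X) \mid \theta\right]
    &= E\left[Y \mid \theta\right]
    && \text{for all } \theta, \\
  % Second-order equity: equal conditional standard errors of measurement
  \mathrm{SEM}\left[\operatorname{eq}_Y(X) \mid \theta\right]
    &= \mathrm{SEM}\left[Y \mid \theta\right]
    && \text{for all } \theta.
\end{align}
```

In practice the equalities are not expected to hold exactly, so the size of the difference between the two sides at each θ serves as the measure of how well each property is preserved.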

Both real and simulated data were used for assessing the equity properties for mixed-format tests. Using real data from three Advanced Placement (AP ...


Evaluating Equating Properties For Mixed-Format Tests, Yi He May 2011

Theses and Dissertations

Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are used in many testing programs. The use of multiple formats presents a number of measurement challenges, one of which is how to adequately equate mixed-format tests under the common-item nonequivalent groups (CINEG) design, especially when, due to practical constraints, the common-item set contains only MC items. The purpose of this dissertation was to evaluate how equating properties were preserved for mixed-format tests under the CINEG design.

Real data analyses were conducted on 22 equating linkages of 39 mixed-format tests from the Advanced Placement (AP) Examination program. Four equating ...


Impact Of Matched Samples Equating Methods On Equating Accuracy And The Adequacy Of Equating Assumptions, Sonya Jean Powers Dec 2010

Theses and Dissertations

This dissertation investigates the interaction of population invariance, equating assumptions, and equating accuracy with group differences. In addition, matched samples equating methods are considered as a possible way to improve equating accuracy with large group differences.

Data from one administration of four mixed-format Advanced Placement (AP) Exams were used to create pseudo old and new forms sharing common items. Population invariance analyses were conducted based on levels of examinee parental education using a single group equating design. Old and new form groups with common item effect sizes (ESs) ranging from 0 to 0.75 were created by sampling examinees based ...


The Impact Of Equating Method And Format Representation Of Common Items On The Adequacy Of Mixed-Format Test Equating Using Nonequivalent Groups, Sarah Lynn Hagge Jul 2010

Theses and Dissertations

Mixed-format tests containing both multiple-choice and constructed-response items are widely used in educational testing. Such tests combine the broad content coverage and efficient scoring of multiple-choice items with the assessment of higher-order thinking skills thought to be provided by constructed-response items. However, the combination of both item formats on a single test complicates the use of psychometric procedures. The purpose of this dissertation was to examine how characteristics of mixed-format tests and composition of the common-item set impact the accuracy of equating results in the common-item nonequivalent groups design.

Operational examinee item responses for two classes of data were considered ...


Observed Score And True Score Equating Procedures For Multidimensional Item Response Theory, Bradley Grant Brossman May 2010

Theses and Dissertations

The purpose of this research was to develop observed score and true score equating procedures to be used in conjunction with the Multidimensional Item Response Theory (MIRT) framework. Currently, MIRT scale linking procedures exist to place item parameter estimates and ability estimates on the same scale after separate calibrations are conducted. These procedures account for indeterminacies in (1) translation, (2) dilation, (3) rotation, and (4) correlation. However, no procedures currently exist to equate number correct scores after parameter estimates are placed on the same scale. This research sought to fill this void in the current psychometric literature.
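
The scale indeterminacy that linking must resolve can be seen in a short derivation. The sketch below assumes a compensatory logistic MIRT model with slope vector a_j and intercept d_j for item j; this model form and all symbols are assumptions made for illustration, not notation from the dissertation.

```latex
\begin{align}
  % Any nonsingular linear transformation of the ability space
  \boldsymbol{\theta}^{*} &= \mathbf{M}\boldsymbol{\theta} + \mathbf{m},
  \qquad
  \mathbf{a}_j^{*\top} = \mathbf{a}_j^{\top}\mathbf{M}^{-1},
  \qquad
  d_j^{*} = d_j - \mathbf{a}_j^{\top}\mathbf{M}^{-1}\mathbf{m}, \\
  % leaves the item logit, and hence the response probability, unchanged
  \mathbf{a}_j^{*\top}\boldsymbol{\theta}^{*} + d_j^{*}
  &= \mathbf{a}_j^{\top}\mathbf{M}^{-1}\left(\mathbf{M}\boldsymbol{\theta} + \mathbf{m}\right)
     + d_j - \mathbf{a}_j^{\top}\mathbf{M}^{-1}\mathbf{m}
   = \mathbf{a}_j^{\top}\boldsymbol{\theta} + d_j .
\end{align}
```

Under this reading, the vector m corresponds to translation, while the matrix M can carry dilation, rotation, and changes in the correlations among dimensions, which is why two separate calibrations can fit equally well yet sit on different scales.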

Three equating procedures--two ...


A Comparison Of Calibration Methods And Proficiency Estimators For Creating IRT Vertical Scales, Jungnam Kim Jan 2007

Theses and Dissertations

The main purpose of this study was to construct different vertical scales based on various combinations of calibration methods and proficiency estimators to investigate the impact different choices may have on the following properties of the resulting vertical scales: grade-to-grade growth, grade-to-grade variability, and the separation of grade distributions. Calibration methods investigated were concurrent calibration, separate calibration, and fixed a, b, and c item parameters for common items with simple prior updates (FSPU). Proficiency estimators investigated were Maximum Likelihood Estimator (MLE) with pattern scores, Expected A Posteriori (EAP) with pattern scores, pseudo-MLE with summed scores, pseudo-EAP with summed scores, and ...