Articles 1 - 9 of 9
Full-Text Articles in Education
Bagaimana Hasil Penyetaraan Paket Tes USBN Pada Mata Pelajaran Matematika Dengan Teori Respon Butir? [What Are the Results of Equating USBN Test Packages in Mathematics Using Item Response Theory?], Eri Yusron, Heri Retnawati, Ibnu Rafi
Jurnal Riset Pendidikan Matematika
To standardize education in Indonesia, the National Standard School Examination (Ujian Sekolah Berstandar Nasional, USBN) is administered. Because most of the items in the USBN test instruments are developed by teachers, the instruments administered to students in one region differ from those in another, even though all of them follow the test blueprint issued by the government. These instruments therefore need to be equated. This descriptive, exploratory study with a quantitative approach aims to describe the equivalence of the 2018/2019 USBN instruments for compulsory mathematics. Data were collected by documenting student responses to the 2018/2019 USBN in compulsory mathematics. The responses came from five packages …
Examining The Effects Of Changes In Automated Rater Bias And Variability On Test Equating Solutions, Michelle Boyer
Doctoral Dissertations
Many studies have examined the quality of automated raters, but none have focused on the potential effects of systematic rater error on the psychometric properties of test scores. This simulation study examines the comparability of test scores under multiple rater bias and variability conditions, and addresses questions of their effects on test equating solutions. Effects are characterized by a comparison of equated and observed raw scores and estimates of examinee ability across the bias and variability scenarios. Findings suggest that the presence of, and changes in, rater bias and variability affect the equivalence of total raw scores, particularly at higher …
Evaluating The Impact Of Construct Shift On Item Parameter Invariance, Test Equating And Proficiency Estimates, Xueming Li
Doctoral Dissertations
Common Core State Standards in English Language Arts and Mathematics at grades K to 12 were introduced in 2009 and at one time had been accepted by 45 of the states in the U.S. The new standards have created national curricula in these two subject areas. Along with this reform, new assessment systems have been developed too. Many of these new tests are showing signs of being more multidimensional than the tests they were replacing because of the use of new item formats, and the assessment of higher level thinking skills and various performance skills. In the short term at …
An Investigation Of Subtest Score Equating Methods Under Classical Test Theory And Item Response Theory Frameworks, Minjeong Shin
Doctoral Dissertations
Test scores are usually equated only at the total score level. If a test mainly measures a single trait, indicating that the test is essentially unidimensional, equating at the total score level could be the best choice. However, when a test is composed of subtests having negligible relationships among them, separate equating for each subtest offers the best choice. When the subtests are moderately correlated, equating each subtest individually may be misleading because it ignores the relationships among the subtests. This study applied and compared several possible subtest score equating methods based on classical …
Pengaruh Metode Dan Ukuran Sampel Terhadap Variansi Skor Hasil Penyetaraan [The Effect of Method and Sample Size on the Variance of Equated Scores], Tri Rijanto
Jurnal Penelitian dan Evaluasi Pendidikan
This study aims to obtain information on differences in the variance of equated scores between the linear and equipercentile equating methods for sample sizes of 200, 400, and 800 on the National Standard School Final Examination (Ujian Akhir Sekolah Berstandar Nasional, UASBN). The method used is simulation, with the equating method and the number of respondents as variables. The data are the responses of elementary school (SD/MI) UASBN examinees in the science (IPA) subject for the 2008/2009 school year in East Jakarta, selected by random sampling with replacement. The hypotheses were tested using a test of equality of variances. The results, at α = 0.05, show: (1) the variance of equated scores from the equipercentile method (σ²_ekp200) does not differ from the variance of equated scores from the linear method …
Investigating How Equating Guidelines For Screening And Selecting Common Items Apply When Creating Vertically Scaled Elementary Mathematics Tests, Maria Assunta Hardy
Theses and Dissertations
Guidelines to screen and select common items for vertical scaling have been adopted from equating. Differences between vertical scaling and equating suggest that these guidelines may not apply to vertical scaling in the same way that they apply to equating. For example, in equating the examinee groups are assumed to be randomly equivalent, but in vertical scaling the examinee groups are assumed to possess different levels of proficiency. Equating studies that examined the characteristics of the common-item set stress the importance of careful item selection, particularly when groups differ in ability level. Since in vertical scaling cross-level ability differences are …
Item Parameter Drift As An Indication Of Differential Opportunity To Learn: An Exploration Of Item Flagging Methods & Accurate Classification Of Examinees, Tia M. Sukin
Open Access Dissertations
The presence of outlying anchor items is an issue faced by many testing agencies. The decision to retain or remove an item is a difficult one, especially when the content representation of the anchor set becomes questionable by item removal decisions. Additionally, the reason for the aberrancy is not always clear, and if the performance of the item has changed due to improvements in instruction, then removing the anchor item may not be appropriate and might produce misleading conclusions about the proficiency of the examinees. This study is conducted in two parts consisting of both a simulation and empirical data …
Exploring The Efficacy Of Pre-Equating A Large Scale Criterion-Referenced Assessment With Respect To Measurement Equivalence, Christopher Stephen Domaleski
Educational Policy Studies Dissertations
This investigation examined the practice of relying on field test item calibrations in advance of the operational administration of a large scale assessment for purposes of equating and scaling. The effectiveness of this method, often termed “pre-equating,” is explored for a statewide, high-stakes assessment in grades three, five, and seven for the content areas of language arts, mathematics, and social studies. Pre-equated scaling was based on item calibrations using the Rasch model from an off-grade field test event in which students tested were one grade higher than the target population. These calibrations were compared to those obtained from post-equating, which …
Equating Multiple Forms Of A Competency Test: An Item Response Theory Approach, Christine E. Demars
Department of Graduate Psychology - Faculty Scholarship
A competency test was developed to assess students' skills in using electronic library resources. Because all students were required to pass the test, and had multiple opportunities to do so, multiple test forms were desired. Standards had been set on the original form, and minor differences in form difficulty needed to be taken into account. Students were randomly administered one of six new test forms; each form contained the original items and 12 pilot items which were different on each form. The pilot items were then calibrated to the metric of the original items and incorporated in two additional operational …