Open Access. Powered by Scholars. Published by Universities.®

Education Commons

Selected Works

Testing

Full-Text Articles in Education

Developing Tests And Questionnaires For A National Assessment Of Educational Achievement, Prue Anderson, George Morgan Dec 2007

Prue Anderson

The authors introduce readers to the activities involved in the development of achievement tests, including developing an assessment framework, writing multiple-choice and constructed-response items, pretesting, producing test booklets, and handscoring items. A section on questionnaire construction features designing questionnaires, writing questions, coding responses, and linking questionnaire and test score data. The final section covers the development of a test administration manual, selecting test administrators, and contacting sampled schools. A companion CD contains examples of released items from national and international tests, sample questionnaires, and administrative manuals. [Back cover, ed]


Ameliorating Culturally Based Extreme Response Tendencies To Attitude Items, Maurice Walker Dec 2006

Maurice Walker

No abstract provided.


The Influence Of Equating Methodology On Reported Trends In PISA, Eveline Gebhardt, Ray Adams Dec 2006

Prof Ray Adams

In 2005 PISA published trend indicators that compared the results of PISA 2000 and PISA 2003. This paper explores the extent to which the outcomes of these trend analyses are sensitive to the choice of test equating methodologies, the choice of regression models and the choice of linking items. To establish trends, PISA equated its 2000 and 2003 tests using a methodology based on Rasch Modelling that involved estimating linear transformations that mapped 2003 Rasch-scaled scores to the previously established PISA 2000 Rasch-scaled scores. This paper compares the outcomes of this approach with an alternative, which involves the joint Rasch …


The Impact Of Differential Investment Of Student Effort On The Outcomes Of International Studies, J Butler, Ray Adams Dec 2006

Prof Ray Adams

International comparative assessments of student achievement, such as the Trends in International Mathematics and Science Study (TIMSS) and the Programme for International Student Assessment (PISA), are becoming increasingly important in the development of evidence-based education policy. The potentially far-reaching influence of such studies underscores the need for these assessments to be valid and reliable. In education, increasing recognition is being given to motivational factors that affect student learning. This research considers a possible threat to the validity of such studies by investigating how the amount of effort invested by test-takers influences their outcomes. Reassuringly, it is found that the reported expenditure …


Modelling Mathematics Problem Solving Item Responses Using A Multidimensional IRT Model, Margaret Wu, Ray Adams Sep 2006

Prof Ray Adams

This research examined students' responses to mathematics problem-solving tasks and applied a general multidimensional IRT model at the response category level. In doing so, cognitive processes were identified and modelled through item response modelling to extract more information than would be provided using conventional practices in scoring items. More specifically, the study consisted of two parts. The first part involved the development of a mathematics problem-solving framework that was theoretically grounded, drawing upon research in mathematics education and cognitive psychology. The framework was then used as the basis for item development. The second part of the research involved the …


Mainstream First-Grade Teachers' Understanding Of Strategies For Accommodating The Needs Of English Language Learners, Clare Hite, Linda Evans Dec 2005

Linda S. Evans

In this time of high stakes testing, teachers' working with English Language Learners (ELLs) becomes a high-stakes teaching act. Nationally, mandated testing is increasing in the schools even as school demographics are changing. The growing numbers of language-minority students come with varying levels of English proficiency, from little or none to fluent bilingualism. Teachers find it difficult to bring all their native-English-speaking children along to an acceptable level of performance in literacy and content-area subjects; ELLs present an even greater challenge, particularly for the elementary mainstream classroom teachers who are the primary language teachers for most young ELLs, yet typically …


High-Stakes Testing: Can Rapid Assessment Reduce The Pressure?, Stuart S. Yeh Dec 2005

Stuart S Yeh

This article presents findings about the implementation of a system for rapidly assessing student progress in math and reading in grades K–12—a system that potentially could reduce pressure on teachers resulting from high-stakes testing and the implementation of the No Child Left Behind Act. Interviews with 49 teachers and administrators in one Texas school district suggest that the assessments allowed teachers to individualize and target instruction; provide more tutoring; reduce drill and practice; and improve student readiness for, and spend more time on, critical thinking activities, resulting in a more balanced curriculum. Teachers reported that the assessments provided a common …


All Is Happening, With Numeracy Included, Dave Tout Dec 2004

David (Dave) Tout

The Adult Literacy and Lifeskills (ALL) Survey (formerly known as the International Life Skills Survey (ILSS)) is a large-scale, comparative survey that goes beyond previous international literacy studies. In addition to the literacy skills measured in the previous International Adult Literacy Survey (IALS), ALL is designed to identify and measure a broader range of skills in the adult population in each participating country. The skills to be directly measured are prose and document literacy, numeracy, and problem solving/analytical reasoning. In addition, the assessment will be accompanied by a comprehensive Background Questionnaire, which will collect participant information and indirectly measure two other …


Assessing Second Language Writing: The Rater’s Perspective, Tom Lumley Dec 2004

Dr Tom Lumley

This study investigates the process of rating texts written by adult ESL learners. Four experienced raters provided think-aloud protocols describing the rating process for a set of 24 texts. The think-aloud data allowed analysis of the sequence of rating, raters' interpretations of the scoring categories, and difficulties raters faced. The study reveals the complexity of the rating process, whereby raters struggle to resolve a tension between the wordings (or rules) of the rating scale and their complex, initial, intuitive impression of the text. Rating requires training to provide reliable measurement. The study also demonstrates that caution is needed in interpreting …


Examining The Evidence: Science Achievement In Australian Schools In TIMSS 2002, Sue Thomson, Nicole Fleming Dec 2003

Nicole Wernert

In Australia, 10,030 students in 414 schools participated in the main sample of TIMSS 2002/03. ... Results are reported as average scores with the standard error, as distributions of scores, and as percentages of students who attain the international benchmarks, for countries and specific groups of students within Australia.


Summing It Up: Mathematics Achievement In Australian Schools In TIMSS 2002, Nicole Fleming, Sue Thomson Dec 2003

Nicole Wernert

This document analyses and interprets the Australian data collected as part of the TIMSS study for Year 4 and Year 8 students.


Assessment Criteria In A Large-Scale Writing Test: What Do They Really Mean To The Raters?, Tom Lumley Dec 2001

Dr Tom Lumley

The process of rating written language performance is still not well understood, despite a body of work investigating this issue over the last decade or so (e.g., Cumming, 1990; Huot, 1990; Vaughan, 1991; Weigle, 1994a; Milanovic et al., 1996). The purpose of this study is to investigate the process by which raters of texts written by ESL learners make their scoring decisions using an analytic rating scale designed for multiple test forms. The context is the Special Test of English Proficiency (step), which is used by the Australian government to assist in immigration decisions. Four trained, experienced and reliable step …


The Effect Of Interlocutor And Assessment Mode Variables In Overseas Assessments Of Speaking Skills In Occupational Settings, T McNamara, Tom Lumley Jun 1997

Dr Tom Lumley

The increasing demand for performance assessment of speaking skills in second languages has led to logistic complications, for example, the delivery of tests in overseas locations. One solution to the problem has been to train native interlocutors to carry out a series of oral interactions with the candidate, with assessment from audiorecordings of the test session postponed and conducted centrally by a small team of trained raters. But these procedures raise questions about the effect of such facets of the assessment situation as interlocutor variables and the quality of the audiotape recording. This article examines these issues in the context …


Anchor Tests, Score Equating And Sex Bias, Geoff Masters Dec 1987

Prof Geoff Masters AO

This paper discusses the use of anchor tests (scaling tests) to bring two or more sets of scores to a common scale. Particular attention is given to the rescaling of school-based assessments against an external test or examination and to potential sources of bias in this procedure. The need for routine validity checks is emphasised, and a latent trait approach to constructing a statistical framework for test and examination score equating is described and illustrated. Bias caused by rescaling school assessments against an inappropriate anchor test is illustrated using a 1984 attempt to rescale students' assessments in English against …


Item Discrimination: When More Is Worse, Geoff Masters Dec 1987

Prof Geoff Masters AO

High item discrimination can be a symptom of a special kind of measurement disturbance introduced by an item that gives persons of high ability a special advantage over and above their higher abilities. This type of disturbance, which can be interpreted as a form of item bias, can be encouraged by methods that routinely interpret highly discriminating items as the best items on a test and may be compounded by procedures that weight items by their discrimination. The type of measurement disturbance described and illustrated in this paper occurs when an item is sensitive to individual differences on a second, …


Banking Non-Dichotomously Scored Items, Geoff Masters, John Evans Dec 1985

Prof Geoff Masters AO

A method for constructing a bank of items scored in two or more ordered response categories is described and illustrated. This method enables multistep problems, rating scale items, question 'clusters', and other items using partial credit scoring to be calibrated and incorporated into an item bank, and it provides a mechanism for computer adaptive testing with items of this type. Procedures are described for calibrating an initial set of items, for testing the fit of items to the underlying measurement model, and for linking new items to an existing item bank. The method is illustrated using items from the Watson-Glaser …