Ethics, Fairness(es), and Developments in Language Testing

Liz Hamp-Lyons
Hong Kong Polytechnic University

I seem to have been struggling with questions to myself about the ethical aspects of language testing for a long time; in fact, this goes back at least to 1985, as the following extract from a draft of a chapter of my dissertation shows:

Currently language testing seems to be moving to ... (what may be) ... referred to as an ethical phase. ... an ethical phase will not replace the previous phases (in LT) but in many contexts will exist alongside them. Also, a concern for ethicality is not a new development, for it has provided the underlying motivation for developments in language testing from the beginning. Rather, it may be a shift into another dimension or domain. Three features mark the current period as deserving the epithet ethical. Firstly, an ethical imperative has meant that none of the groups concerned with language testing in earlier phases has been squeezed out: in fact, this shift has brought back classroom teachers in particular into this key area of their rightful concerns. Teachers' judgements and ratings are being accorded a place once again; teachers' responsibility for justifiable evaluations is being reasserted. In addition, testees themselves may be given a role, in self-evaluation, self-report, and peer evaluation. The scope may be widened still further, to include those concerned with the testee in language use, for instance, university supervisors, coursemates, flatmates, etc. The second feature is the increasing untenability of the position that ... the language tester is obliged to choose either reliability at the expense of (most kinds of) validity, or validity at the expense of reliability. Following from the period of initial enthusiasm for tests of communicative competence has come a concern to improve the reliability of these tests, while retaining the multidimensional validity that has been achieved.
Closely allied to this is the third feature of the ethical phase, an increased attention to and sophistication in test validation activities.

This draft continued by building up a rather gauche little diagram which purported to be a model of this ethical phase in language testing: it resembled a house, of which the roof, the all-encompassing structure, is ethicality. Alan Davies' handwritten response to this draft, as my doctoral supervisor, says:

I'm still unhappy about the term Ethicality unless you stress the professional aspects and the humanistic orientation. I suppose we can distinguish Validity from its use (i.e., a test may be V. but you might not use it for E. reasons), but the E. is not, I suggest, then properly a quality of the test but a general attitude towards learners, learning, etc.

I didn't at the time understand what Alan meant by professional aspects, but he has worked this through in recent papers, and his article in the 1997 Language Testing special issue brings this thinking together. Nor did I at the time understand why he referred to the humanistic orientation, since it seemed obvious to me that ethical concerns would be humanistic concerns as well as technical concerns. Only slowly over the years since then have I understood that not everyone accepted these fundamental humanistic underpinnings to work in language testing. Similarly, only slowly have I understood how complex these issues are, and that merely professing a humanistic or ethical concern does not make one's work ethical.

As knowing what is ethical has become more difficult in this age of cultural and moral relativism, ethics have become a more important issue in many fields. In the epistemological crisis engendered by postmodernism, it is much more difficult to assert that any decision--or measurement--is right or true.
While this makes our lives as practising language testers more difficult, in many ways it also liberates us to think seriously about what would, for us as individuals, or as conscious subscribers to a particular social compact, enable us to accept our own behaviours as contributing to the greater good of the greatest number of those whose lives we touch.

In his comments on a 1989 paper of mine in his recent Language Testing paper, Alan Davies questions what he sees as my characterisation of ethics as made up of a combination of validity and backwash, and suggests that I may have been appealing to consequential validity. With hindsight, that is probably true; but the terminology of consequential validity was not in common use, nor was it clearly defined at that time (indeed, we might question how well it is defined even now!). My 1985 draft turned out to be premature: once again with hindsight, the whole field of educational measurement was still rather naive in its views of the nature of ethical principles. It, and language testing within it, took too narrow a view of what the compass of our ethical responsibilities is. Few of us had thought about the meaning and responsibilities of ethics in sophisticated ways; only now is there any indication of a movement toward some agreement over professional ethics as related to social justice, and an understanding of how it is possible to hold a position such as that of philosopher Alasdair MacIntyre (1987), who sees ethics as being as much about politics and economics as about right and wrong. But perhaps we all needed to pass through those naive early stages before we could learn to challenge our own thinking and our own expectations of ourselves, and set our sights higher, or in different directions.
Certainly my own early exchange of views with Alan Davies, leaving me dissatisfied as it did, forced me to think harder about why I felt there were ethical issues that needed to be addressed in language testing, and that thinking was reflected in my 1989 paper, and in more recent work (Hamp-Lyons, 1996, 1997).

In what follows I explore some ideas generated by the recent considerable debate in educational measurement about fairness. I take fairness to be a member of a semantic set with morality and ethics, and I ask: What is fairness? What makes a test fair? How do we know when a test is unfair? In keeping with the uncertainties of these relativist times, I find these questions increasingly difficult, and am increasingly unwilling to claim an ability to answer them. But I do feel that they all imply, and assume, some ideal model of fairness that is somewhere out there, waiting for us, if we only knew where to look. The introduction to this chapter clearly suggests the unlikelihood of the existence of such a model or solution; therefore it will not surprise the reader that in this paper I don't propose to look for that ideal model; rather, I want to raise some of the complicating situations and questions that occurred to me while I was musing on the elusiveness of that ideal model. The questioning and reflective mode of this chapter, as well as its subject matter, is, I believe, appropriate as a contribution to this tribute to the work of Alan Davies over the years and his influence on the work of many other language testers, myself included.

1. If it is true that: Language teaching as a field has not agreed what's the right way to teach or learn, and has not established a single dominant model for language teaching, it follows that students should be free to discover and then follow their own learning styles and learning strategies.
Similarly, if it is true that: Language testing has not discovered a single dominant model of how to test a student's learning, ability or performance, it follows that students should be free to consider their own learning history, their learning styles and strategies, and choose test and item types that best match their own learning profile. Tests, then, would need to exist in multiple forms so that each student could select a unique, appropriate pathway to demonstrating mastery, one which would be uniquely fair to her or him.

2. If it is true that: Students' judgements of their own performances are heavily influenced by their teachers' degree of harshness or leniency toward error, and by the performance targets their teachers set for them and accept from them, it follows that teachers need to be benchmarked so that students will have better self-knowledge, so that they will not be misled by their teachers' encouragement to view themselves as more successful than they are, or by their teachers' criticisms to view themselves as less successful than they are. From this it follows that teachers would need to be tested to ensure that they comprehend and can consistently apply the appropriate criteria and standards to learners in their classes. Teachers entering new teaching situations--new school years, new kinds of learners, teaching new skills--would need to take a re-benchmarking course and would be required to pass the course before teaching this new kind of learner. This kind of fairness places the needs of the teacher below the needs of the learner, because it states that standards and criteria are not negotiable. It does not, however, contradict the previous kind of fairness, because standards and criteria are distinct from styles and strategies, which the teacher, when she or he is in turn a rater, can still choose freely.

3. If it is true that: Language testing has embraced post-modernism, and has accepted the fact that raters have personal philosophies and belief sets, and that it is a fiction to suppose that they can check these at the door, it follows that formal judgement systems should acknowledge this and figure out how to accommodate assessment systems to the rating styles and strategies of raters. Tests, then, would need to have multiple scoring alternatives so that each rater could select a unique, appropriate approach to scoring, one that would be uniquely fair to her or him.

4. If it is true that: Teachers are educated and trained in many different ways, and that every teacher, through education, experience, personality, interests and skills, is different, classes by different teachers will not be the same, even if the syllabus is. It follows that teachers should be free to teach according to their own personal style, and that they should be free to assess, and have their students assessed, by their personal style. When assessments match instruction, not only in content but in style, there will be least dissonance for the teacher, and therefore for the learners. Tests, then, would need to exist in multiple forms so that each teacher could select a unique, appropriate pathway for her or his students to demonstrate mastery, one which would allow students and teacher to be seen in their best light by assessing in areas and in ways where they have the most strength.

5. If it is true that: Parents know their children best of all, have a set of social values, and have expectations of what their children should be able to do and how they should be doing it; if it is true that they also want to understand what happens in the classroom and the school much better than they do now, it follows that most kinds of tests will seem alienating for parents.
Most tests are done in technical ways that exclude the parents, and they are reported in technical language, or simply with number scores which are not attached to actual examples of their child's performance. All this is clearly unfair to the parents. Tests would be fairer to parents if they were directly related to the content the children had been learning and that parents had been seeing in the homework assignments; they would be fairer to parents if they were scored in ways that parents could completely understand, and if parents were able to take part in the design of the test and its scoring method. Because parents understand their children's learning needs and problems so well, it would be fairer to parents if they could take part in test design and could be trained as raters of the tests. Tests will only be fair to parents if test results/reports make complete sense to them, either because the reports are transparently descriptive, or because parents have been trained in test report interpretation in their own children's context. There needs to be an appeal system parents can use to challenge their child's test score or the way the child was tested.

Each of the fairnesses I have portrayed above focuses on being fair to one group of stakeholders: learners, raters, teachers, parents. It has not escaped me that there are some mutually contradictory strategies implied by these attempts to consider fairness from the viewpoint of different stakeholder groups. There are other stakeholder groups too: taxpayers, national and state Education Department officials, big business, political parties, and governments. If some of the views I have created for students, parents, teachers, and test raters seem far-fetched, I only ask that you spend time just listening to these groups discussing how they learn, what they believe about good teaching, what they worry about in their child's education, etc.
I will agree with you that some of the suggestions I have voiced seem outrageous to us, as language testers, but only if you will agree with me that such suggestions are real, that you too have heard these and comments like them discussed in student focus groups, parents' meetings, teachers' common rooms, among practising teachers taking Master's courses in Language Testing, or chatting over tea during a rating session. What is outrageous, if anything, is the difficulty of making our tests fit these fairnesses, not the views themselves. It seems to me that none of them should be taken too lightly. Once language testers accept that there is no single right answer to issues in doing language testing, we also have to listen seriously to all views.

Fairness is such a difficult concept because there is no one standpoint from which a test can be viewed as fair or not fair. The language tester has no more inherent right to decide what is fair for other people than anyone else does. But the language tester does have the responsibility to use all means to make any language test she or he is involved in as fair as possible. As our technical skills expand, as our definition of a test is refined, as our political consciousness of the power of tests is heightened, we raise our expectations of ourselves. Ethics, then, for the language tester, involves decisions about whose voices are to be heard, whose needs are to be met; about how a society determines what is best for the largest number when fairnesses are in conflict. Language testing as a field is interestingly and challengingly about political and social needs and consequences, as much as it is about what is right and what is wrong.
The time has arrived when we are obliged to critique everything that we do, and to take that critique onward and look at the impact we have on test-takers, other stakeholder groups, and on society. We must not flinch from accepting some responsibility for the uses made of the tests we have been involved in: the fascinating and important question is, where and when do we decide to let our responsibility drop?

Davies, A. 1997. Demands of being professional in language testing. Language Testing 14, 3: 328-339.

Hamp-Lyons, L. 1989. Language testing and ethics. Prospect 5: 7-15.

Hamp-Lyons, L. 1996. Applying ethical standards to portfolio assessment of writing in English as a second language. In M. Milanovic and N. Saville (eds.), Performance Testing, Cognition and Assessment: Selected Papers from the 15th Language Testing Research Colloquium (pp. 151-164). Cambridge: Cambridge University Press.

Hamp-Lyons, L. 1997. Washback, impact and validity: ethical concerns. Language Testing 14, 3: 295-303.

MacIntyre, A. 1987. After Virtue. Oxford: Oxford University Press.
Posted on: Sat, 23 Nov 2013 12:40:18 +0000
