Wednesday 28 March 2012

Unit-9 Evaluation



Achievement test
A standardized test used to measure acquired learning, in a specific subject area such as reading or arithmetic, in contrast to an intelligence test, which measures potential ability or learning capacity.
An achievement test is an examination administered to determine how much a person has learned or how much knowledge a person has acquired.
An example of an achievement test is the Regents exam that students must take to prove they have learned math and other academic lessons.
                            Standardized test
         Diagnostic Tests for Preschoolers – After a screening, tests for diagnostic assessment are administered (if needed).
         Adaptive behavior measures assess possible learning, social or motor disabilities.
         Intelligence tests measure learning potential.
         Achievement Tests Determine Instructional Effectiveness
National tests:
         compare student achievement across states to address higher standards for education
         identify poor instructional areas, pinpoint weaknesses in a state’s instructional program and facilitate improvements
State-developed tests are used by school districts to:
         determine each student’s progress
         provide diagnostic information on a child’s needs for future instruction
         describe student progress between and within schools
Standardized Tests and Teaching
          The Nature of Standardized Tests
         Steps in Standardized Test Design
The following steps ensure that the test achieves its goals 
and purposes:
·        specify the purpose
·        determine the format
·        formulate objectives
·        test construction: write, try out, and analyze items
·        assemble the final form
·        administer the final test form
·        establish norms, determine the validity and reliability
·        develop a test manual
Steps in Standardized Test Design: Specifying the Purpose
        A clearly defined purpose is the framework for the construction of the test.
         It allows evaluation of the instrument when design and construction steps are completed.
         It helps explain what the test will measure, how the test results will be used, and who will take the test. 
         It describes the population for whom the test is intended.
         Steps in Standardized Test Design: Determining Test Format
Format decisions are based on the purpose of the test and the characteristics of the test takers:
         how test items will be presented and how the test taker will respond
          (e.g., tests designed for very young children are usually presented orally; paper and pencil tests for older students)
         given as a group test or as an individual test
         Steps in Standardized Test Design: Test Construction
The test’s purpose guides:
         defining test objectives
         writing test items for each objective
         assembling experimental test forms
         Steps in Standardized Test Design: Developing Experimental Forms
For a school achievement test:
         test content is delimited
         curriculum is analyzed to ensure that the test will reflect the instructional programs
         teachers and curriculum experts review content outlines and objectives for the test; and later they review test items
         writing, editing, trying out, and rewriting or revising test items
         a preliminary test with selected test items is assembled for trial with a sample of students
         Steps in Standardized Test Design: Developing Experimental Forms
         Experimental test forms resemble the final form
         Instructions are written for test administration
         The sample of people who take the preliminary test should be similar to the population that will take the final form of the test
         Steps in Standardized Test Design: Item Analysis in The Test Tryout Phase
Study each item’s:           
         Difficulty level: how many test takers in the trial group answered the question correctly
         Discrimination: the extent to which the question distinguishes between test takers who did well or poorly; test takers who did well on the test should be more successful on the item than those who did poorly
         Grade progression in difficulty: for tests taken in different grades, a greater percentage of students in each successively higher grade should answer the item correctly
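The first two item statistics above can be sketched for a single item. The response data and the common 27% upper/lower grouping convention are illustrative assumptions (not from the text); grade progression would simply repeat the difficulty computation per grade.

```python
# Item-analysis sketch: difficulty (proportion correct) and upper/lower
# discrimination index for one item, given (total_score, item_correct)
# pairs from a tryout sample.  Data and the 27% grouping are illustrative.

def item_difficulty(responses):
    """Proportion of test takers who answered the item correctly."""
    return sum(correct for _, correct in responses) / len(responses)

def item_discrimination(responses, top_fraction=0.27):
    """Upper group's proportion correct minus the lower group's."""
    ranked = sorted(responses, key=lambda pair: pair[0], reverse=True)
    k = max(1, int(len(ranked) * top_fraction))
    p_upper = sum(c for _, c in ranked[:k]) / k
    p_lower = sum(c for _, c in ranked[-k:]) / k
    return p_upper - p_lower

# (total test score, 1 if this item was answered correctly, else 0)
sample = [(95, 1), (88, 1), (80, 1), (72, 0), (65, 1), (60, 0), (51, 0), (40, 0)]
print(item_difficulty(sample))      # 0.5
print(item_discrimination(sample))  # 1.0 -- high scorers beat low scorers on this item
```

A discrimination index near zero (or negative) would flag the item for revision or elimination in the tryout phase.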
         Steps in Standardized Test Design: The Final Test Form Is Assembled
         In item analysis, test items were revised, or eliminated
         Test items that measure each test objective are selected for the test
         When alternative forms of one test exist, each form must be equivalent in content and difficulty
 Test directions are finalized, with instructions for test administrators about the testing environment and testing procedures, and instructions for test takers
         Steps in Standardized Test Design:
         Standardizing the Test
The final test form is administered to another, larger sample of test takers to acquire norm data.
Norms allow for comparisons of children’s test  performance with the performance of a reference or norming group.
         The norming group is chosen to reflect the makeup of the population for whom the test is designed.
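One common use of norm data is reporting a child’s raw score as a percentile rank within the norming group. A minimal sketch (the norm scores below are invented for illustration):

```python
# Percentile rank: percent of the norming group scoring at or below
# a given raw score.  The norming sample here is invented.
from bisect import bisect_right

def percentile_rank(raw_score, norm_scores):
    """Percent of the norming group scoring at or below raw_score."""
    ranked = sorted(norm_scores)
    return 100.0 * bisect_right(ranked, raw_score) / len(ranked)

norms = [42, 48, 51, 55, 55, 60, 63, 67, 71, 78]  # invented norming sample
print(percentile_rank(60, norms))  # 60.0
```

This is why the norming group must reflect the intended population: the same raw score yields a very different percentile against a stronger or weaker reference group.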
           Evaluating Standardized Tests
Reliability – Are test scores stable, dependable and relatively free from error?
Validity – Does the test measure what it is supposed to measure?
                       Correlation
Correlation Coefficient is a statistical measure of relationship between two variables.
         Pearson correlation coefficient
         r = the Pearson coefficient
         r measures the amount that the two variables (X and Y) vary together (i.e., covary) taking into account how much they vary apart
         Pearson’s r is the most common correlation coefficient; there are others.
         Computing the Pearson correlation coefficient
         Measuring X and Y individually (the denominator): compute the sum of squares for each variable:
       SS_X = Σ(X − M_X)²   and   SS_Y = Σ(Y − M_Y)²
         Measuring X and Y together: the Sum of Products of deviations (SP):
       Definitional formula: SP = Σ(X − M_X)(Y − M_Y)
       Computational formula: SP = ΣXY − (ΣX)(ΣY)/n
       n is the number of (X, Y) pairs
         Correlation Coefficient: the equation for Pearson’s r:
       r = SP / √(SS_X × SS_Y)
         Example: to find the correlation between study time and test score, calculate SS_X, SS_Y and SP from the paired data, then compute r.
         Correlation Coefficient Interpretation: r ranges from −1 to +1; the sign indicates the direction of the relationship and the magnitude its strength (values near ±1 are strong, values near 0 weak).
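A minimal Python sketch of the SS/SP computation for Pearson’s r; the study-hours and test-score data are invented for illustration:

```python
# Pearson's r via sums of squares (SS) and the sum of products (SP):
#   r = SP / sqrt(SS_X * SS_Y)
import math

def pearson_r(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_x = sum((x - mean_x) ** 2 for x in xs)            # SS for X
    ss_y = sum((y - mean_y) ** 2 for y in ys)            # SS for Y
    sp = sum((x - mean_x) * (y - mean_y)                  # sum of products
             for x, y in zip(xs, ys))
    return sp / math.sqrt(ss_x * ss_y)

study_hours = [1, 2, 3, 4, 5]       # invented data
test_scores = [55, 60, 70, 75, 90]
print(round(pearson_r(study_hours, test_scores), 3))  # 0.981
```

The strong positive r here reflects that more study time and higher scores covary in the invented data.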

                   Reliability
         Test-retest: The extent to which a test yields the same score when given to a student on two different occasions
         Alternate-forms: Two different forms of the same test on two different occasions to determine the consistency of the scores
         Split-half: Divide the test items into two halves; scores are compared to determine test score consistency
         Standard Error of Measurement
         an estimate of the amount of variation to be expected in test scores.
         If the reliability correlations are poor, the standard error of measurement will be large.
         The larger the standard error of measurement, the less reliable the test.
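The standard relationship SEM = SD × √(1 − r), where r is the reliability coefficient, makes the point above concrete; the SD of 15 and the reliability values are illustrative:

```python
# SEM = SD * sqrt(1 - reliability): poorer reliability -> larger SEM.
import math

def standard_error_of_measurement(sd, reliability):
    """Estimate of the variation to be expected in observed test scores."""
    return sd * math.sqrt(1 - reliability)

print(round(standard_error_of_measurement(15, 0.91), 2))  # 4.5
print(round(standard_error_of_measurement(15, 0.64), 2))  # 9.0 -- less reliable, larger SEM
```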
         Variables That Affect the Standard Error of Measurement
The following affect test reliability:
         Population sample size --the larger the population sample, the more reliable the test
         Test length --longer tests are usually more reliable because there are more test items, resulting in a better sample of behaviors
         Range of test scores of the norming group --the wider the spread of scores, the more reliably the test can distinguish between good and poor students
         Types of Validity…
         Content:  Test’s ability to sample the content that is being measured
         Criterion-related:
1.     Concurrent: The relationship between a test’s scores and other currently available criteria
2.     Predictive: The relationship between a test’s scores and future performance
         Construct: The extent to which there is evidence that a test measures a particular construct
         Considerations in Choosing and Evaluating Tests
To select the best test to meet the developmental characteristics of young children, the following need to be considered:
         the purpose of the testing
         the characteristics to be measured
         how the test results will be used
         the qualifications of those who will interpret the scores and use the results
         any practical constraints: cost, time, ease of scoring and use of test results

                  Reviewing the Test Manual
          The test manual should include information that is adequate for users to determine whether the test is practical and suitable for their purposes.

          The manual should address the following:
         Purpose of the test
         Test design
         Establishment of validity and reliability
         Test administration and scoring
                              DIAGNOSTIC TEACHING      
Diagnostic teaching is the “process of diagnosing student abilities, needs and objectives and prescribing requisite learning activities”.   Through diagnostic teaching, the teacher monitors the understanding and performance of students before teaching the lesson, while teaching, and after teaching the lesson. Diagnostic teaching can inform teachers of the effectiveness of their lessons with individuals, small groups of students, or whole classes, depending on the instruments used. Within a diagnostic teaching perspective, assessment and instruction are interacting and continuous processes, with assessment providing feedback to the teacher on the efficacy of prior instruction, and new instruction building on the learning that students demonstrate.
Teachers may evaluate student learning on the spot, or collect data at different points in time and compare progress over units of instruction. Moment-by-moment assessments allow teachers to tap into students’ developing understandings about reading and students’ use of strategic processing to understand and remember text, and enable teachers to correct misconceptions immediately. Observations recorded over time allow teachers to identify patterns of development and document learning gains.  Both “on-the-run” assessments and systematic records of teachers’ observations of students’ learning over time can supplement the more quantitative and summative assessments that the ministry or school mandates and are more likely than end-of-term assessments to develop teachers’ capacity to improve the quality and appropriateness of instruction.  
Diagnostic assessments are themselves educative for teachers.   By introducing the concept of diagnostic teaching and the monitoring techniques to support such instruction, teachers will be better able to recognize reading as a developmental process and target instruction to meet the needs of individuals and groups.  As students progress toward reading proficiency, they gain control over different components of the reading process.  Yet, not all students will be at the same level of proficiency or need the same instruction.  Students progress through overlapping stages in a developmental sequence that leads to proficiency in reading.  Starting with “visual-cue” word recognition, wherein students memorize the configuration of words, to increasing awareness of phonology and the way sounds map onto letters in an alphabetic language, students gradually consolidate their use of larger letter patterns to recognize words effortlessly and automatically.  At the “automatic word recognition” stage, students are able to orally read text fluently, with speed and prosody.  As students become fluent readers, they are likely to devote more attention to comprehension, routinely using background knowledge and strategic processing to understand and remember text.  By the time students reach “reading proficiency,” their reading comprehension equals or surpasses what they can glean from listening to lecture presentations. Nonetheless, not all students’ progress through these stages at the same rate, and in any given classroom, there will be students who need different kinds of support from their teachers. For example, some students may be able to decode words but only slowly and with great effort.  Others may be fluent word callers but lack vocabulary and the ability to read strategically for comprehension.  Thus, students’ vocabulary, background knowledge, fluency, interest and motivation, as well as the ability to accurately identify words, all influence their reading comprehension. 
Through professional development, teachers will be able to recognize the importance of the various components of the reading process and identify and use assessment and instruction to support the development of these components.  Teachers who participate in the diagnostic teaching workshops will have a more elaborated view of the reading process, beyond students’ ability to decode words and memorize text.  In particular, teachers will recognize the importance of automatic word recognition, that is, the ability to read high frequency words and phonetically regular words accurately and fluently, and the importance of strategic reading, that is, the ability to make inferences and monitor and repair understanding when reading different genres of text. 
The diagnostic teaching techniques presented in the workshops cover the entire continua of reading development. Teachers will be able to demonstrate how to assess and support students’ emerging reading behaviors, such as concepts about print and basic decoding, among the youngest or least experienced readers.   By using a fluency rating rubric, such as that developed for the National Assessment of Educational Progress, and determining students’ words correct per minute on an oral reading of a short passage, teachers also will be able to identify more experienced students’ fluency and put into practice activities to support automatic word recognition, such as dramatic readings or rereadings of text.  Teachers will demonstrate how to assess and instruct students in comprehension strategies before, during and after reading.  Before reading, teachers will assess whether students can preview, set a purpose for reading, and bring prior knowledge to bear on the topic of the reading.  During reading, teachers will assess students’ ability to develop predictions and questions, and monitor text understanding.  After reading, teachers will assess whether students can remember or summarize what they read.  Teachers will learn as well to analyze and interpret non-traditional ways of processing and responding to text, such as verbal protocols and drawings that make visible the thinking of students for teachers to evaluate.
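The words-correct-per-minute measure mentioned above is a simple computation; the passage length, error count, and reading time below are invented for illustration:

```python
# Oral reading fluency sketch: words correct per minute (WCPM).
def words_correct_per_minute(total_words, errors, seconds):
    """WCPM = (words read - errors) scaled to a one-minute rate."""
    return (total_words - errors) * 60.0 / seconds

# Invented example: a 120-word passage, 6 errors, read in 60 seconds.
print(round(words_correct_per_minute(120, 6, 60), 1))  # 114.0
```

A teacher would compare the WCPM result against grade-level benchmarks alongside a prosody rubric rating, since rate alone does not capture fluency.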
As teachers learn to administer and interpret these diagnostic assessments, they themselves will develop a more elaborated understanding of the reading process.  In order to assess comprehension of narrative text and the adequacy of students’ retellings, for example, teachers must themselves learn to identify the story elements of character, plot, setting, problem and resolution.  Likewise, in order to evaluate whether students are able to navigate nonfiction texts, teachers first must be able to recognize the organizational patterns in particular informational texts, for example, cause and effect, or comparison and contrast, and then identify and teach the comprehension strategies that students are not using.
Achievement tests are exams designed to determine the degree of knowledge and proficiency exhibited by an individual in a specific area or set of areas. An achievement test is sometimes administered as part of the acceptance process into an educational program.
In other applications, the achievement test serves as a tool to measure current knowledge levels for the purpose of placing students in an educational environment where they have the chance to advance at a pace that is suitable for their abilities.

Achievement testing is used to identify students who are prepared to move on to more advanced courses of study or who need some type of remedial instruction. Using an achievement test to measure the grade level of each student is not intended to reflect on the general intelligence of the individual. Rather, the purpose of the testing is to ensure each student is placed in a classroom situation that offers the best opportunity to learn and assimilate material in an organized fashion, preparing them for more advanced work.

For example, a student who does not do well with basic mathematics on an achievement test is likely to be placed in a remedial learning situation. Doing so provides the student with the opportunity to master the basics before attempting to learn more advanced mathematical concepts like algebra or geometry. At a later date, the student will have the chance to take a second test; should the results indicate that the student is sufficiently prepared to move on to something more complicated, he or she can be reassigned to a more challenging course of study.

Individuals and groups that oppose the use of an achievement test claim that the exams are not structured in a manner that accounts for the general aptitude of each student, resulting in an overall learning environment that pigeonholes each student rather than nurturing them and promoting productive learning.

It is not unusual for many school jurisdictions to make use of a sample achievement test several weeks before administering the live exam. The idea behind the achievement test practice run is to allow students to get an idea of the general format of the exam and what type of instructions apply to each section. While the specific questions used in the sample are different from those utilized in the live achievement test, they are usually close enough to provide the student with an idea of what to expect.



QUALITIES OF A GOOD TEST

A good test should possess the following qualities.

• Objectivity
• Objective Basedness
• Comprehensiveness
• Validity
• Reliability
• Practicability
• Comparability
• Utility

Objectivity 
• A test is said to be objective if it is free from personal biases in interpreting its scope as well as in scoring the responses.

• Objectivity of a test can be increased by using more objective-type test items and scoring the answers according to the model answers provided.
Objective Basedness
• The test should be based on pre-determined objectives.

• The test setter should have a definite idea about the objective behind each item.
Comprehensiveness
• The test should cover the whole syllabus.
• Due importance should be given to all the relevant learning materials.
• The test should cover all the anticipated objectives.
Validity 
• A test is said to be valid if it measures what it intends to measure.

• There are different types of validity:
– Operational validity
– Predictive validity
– Content validity
– Construct validity
Operational Validity
– A test will have operational validity if the tasks required by the test are sufficient to evaluate the definite activities or qualities.
 Predictive Validity
– A test has predictive validity if scores on it predict future performance
• Content Validity
– If the items in the test constitute a representative sample of the total course content to be tested, the test can be said to have content validity.
Construct Validity
– Construct validity involves explaining the test scores psychologically. A test is interpreted in terms of numerous research findings.
Reliability 
• Reliability of a test refers to the degree of consistency with which it measures what it is intended to measure.

• A test may be reliable but need not be valid. This is because it may yield consistent scores, but these scores may not represent what we actually want to measure.

• A test with high validity must also be reliable (the scores will be consistent in both cases).

• A valid test is also a reliable test, but a reliable test may not be a valid one.
Different Methods for Determining Reliability
• Test-retest method
– A test is administered to the same group after a short interval. The scores are tabulated and the correlation is calculated. The higher the correlation, the greater the reliability.
• Split-half method
– The scores on the odd and even items are taken and the correlation between the two sets of scores is determined.
• Parallel form method
– Reliability is determined using two equivalent forms of the same test content.
– These prepared tests are administered to the same group one after the other.
– The test forms should be identical with respect to the number of items, content, difficulty level, etc.
– The correlation between the two sets of scores obtained by the group on the two tests is determined.
– The higher the correlation, the greater the reliability.
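The split-half method can be sketched as follows. The item-score matrix is invented for illustration; the final line applies the standard Spearman–Brown correction, which steps the half-test correlation up to an estimate for the full-length test (the correction itself is not described in the text above but is the usual companion to the split-half method):

```python
# Split-half reliability: correlate odd-item vs even-item totals,
# then apply the Spearman-Brown correction for full test length.
import math

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ssx = sum((x - mx) ** 2 for x in xs)
    ssy = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ssx * ssy)

def split_half_reliability(item_matrix):
    """Rows are students, columns are item scores (invented data below)."""
    odd = [sum(row[0::2]) for row in item_matrix]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_matrix]  # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    return 2 * r_half / (1 + r_half)  # Spearman-Brown step-up

scores = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 1],
    [0, 1, 0, 1, 1, 0],
    [1, 1, 1, 0, 1, 1],
    [0, 0, 1, 0, 0, 0],
]
print(round(split_half_reliability(scores), 3))  # 0.717
```

The correction is needed because each half is only half as long as the real test, and longer tests are more reliable (as noted in the section on the standard error of measurement).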
Discriminating Power
• Discriminating power of the test is its power to discriminate between the upper and lower groups who took the test.
• The test should contain questions at different difficulty levels.
Practicability 
• Practicability of the test depends upon:
• Administrative ease
• Scoring ease
• Interpretative ease
• Economy
Comparability
• A test possesses comparability when scores resulting from its use can be interpreted in terms of a common base that has a natural or accepted meaning.

• There are two methods for establishing comparability:
– Availability of equivalent (parallel) form of test
– Availability of adequate norms
Utility 
• A test has utility if it provides test conditions that facilitate the realization of the purpose for which it is meant.


Evaluating science teaching and learning
         “True genius resides in the capacity for evaluation of uncertain, hazardous, and conflicting information.” ~ Winston Churchill (1874–1965)
WHAT IS EVALUATION?
Evaluation in education is a reflective judgement on teaching practices, students’ learning and the learning  environment. It is an on-going process that enables the teacher to reflect on how well students are learning.
Evaluation can be viewed as a critical analysis aimed at improving teaching practices. There are numerous aspects of teaching and learning that can be evaluated. Keep in mind that effective teachers constantly seek ways to enhance their teaching, and critically analysing teaching and learning provides a way forward. In teaching, evaluation can be synonymous with reflection. Evaluating teaching and learning can assist teachers to identify ways to raise teaching standards and, hopefully, learning.
WHAT DO I EVALUATE?
You can evaluate anything within the teaching and learning environment. Your planning, teaching, activities and assessments need to be evaluated. You can evaluate your teaching approaches and strategies. Students’ learning must be evaluated. You may want to evaluate the resources, learning spaces, guest speakers, worksheets, and anything else that you think may make a difference to either your teaching or the students’ learning. Basic evaluation will include what worked, what didn’t work, and what needs to be considered for improving future teaching practices.
 Evaluating Teaching Plans and Practices
1. Plans
• Was sound lesson planning evident?
• Was the lesson outcome(s) stated? 
• Were links to the syllabus outlined?
• Was the lesson appropriately timetabled and timed?
• Was the prior knowledge of the students considered?
• Were teaching strategies outlined?
• Was content knowledge evident?
• Were resources prepared?
• Were classroom management strategies apparent?
• Were student activities suitable and appropriate?
• Were methods of assessing students’ learning outlined?
2. Practices
• Was I confident in teaching this lesson?
• Was I enthusiastic about teaching science?
• Did I stimulate students’ interests in the topic?
• Had I presented a well-designed lesson?
• Were my explanations clear and succinct?
• Did I ask a range of lower and higher-order questions?
• Had I catered for the range of student abilities?
• Did I hold the students’ attention when teaching?
• Had I developed a good rapport with students?
• Did I use effective classroom management strategies?
• Were consequences and rewards appropriate?
• Did I monitor students’ work?
• Had I displayed adequate science content knowledge?
• Did I use appropriate terminology from the science syllabus?
• Had I used sufficient hands-on materials?
• Did I allow students to record and communicate their new knowledge?
• Had I concluded the lesson with key scientific concepts?
• Was there a real-world connection to the lesson?
Evaluation needs to occur frequently. Every lesson should be evaluated in terms of what worked, what didn’t  work, and what could be improved for future practices. Experienced teachers will evaluate as they proceed  through a lesson and make adjustments accordingly. A teaching plan is a guide only. It’s a vision of what could  occur. When a lesson is in action, with the varying personalities and abilities of students, a plan may need to be changed. You will need to adapt to particular circumstances. This adaptation can also be part of your evaluation.
This will assist you to understand a learning environment. It will help you to devise other strategies for teaching in a similar situation. The unexpected behaviour for a beginning teacher can very well be expected by a more experienced teacher. The experienced teacher knows what to expect because of continual evaluation of circumstances: critical self-reflection.
WHO CAN EVALUATE?
The two key players who need to evaluate the teaching and learning are the students and the teacher. Other  potential evaluators can include the principal, executive staff, parents and other interested groups. Evaluation  should be viewed as a positive action. The outcome of evaluation should be improved teaching and learning. So the immediate stakeholders (you and your students) must have a say in how students learn.
Teacher
The classroom teacher should always evaluate lessons and units. Teachers can focus their attention on themselves  in evaluations. A teacher needs to be honest in an evaluation of teaching and learning.  Pretending a problem  doesn’t exist won’t make it go away. Despite careful planning, not every lesson will be successful as there are many things that could alter the direction of the lesson. Maybe there weren’t enough resources or that student  interaction wasn’t as anticipated. Regardless, a teacher learns from experiences and at the heart of this learning is  critical and open reflection. Highlighting what else may be required to make the unit of work more successful and  pinpointing areas that were successful will assist you to teach future classes more effectively.  It is up to the  teacher to decide what to include in an evaluation.  
The example below provides evaluation criteria for a unit based around energy.
 The teacher has decided to evaluate the students’ learning of the key concepts and their  engagement in the unit. Although this evaluation is quite broad, responses to each criterion may provide insight towards enhancing the unit for future classes.
Student
It is useful to have students evaluate the teaching and learning environment, particularly as they are the focus of  the attention. Evaluation from students needs to be age appropriate. Those in the early primary grades may have  the evaluation read out to them, and they would record their responses by colouring in a face or a sign.  They may  be asked to circle pictures that represent lessons that they liked. They may be asked to draw a picture of a lesson  they enjoyed. Example 11.4 shows a range of evaluation forms that could be used at an early primary school level. 
Student written evaluation
1. Did you enjoy our unit of work about Fossils?  YES/NO.  Why or Why not?
2. What was your favourite part of the unit?
3. What was your least favourite part of the unit?
4. Did you find anything really difficult or really easy in the unit? Please explain here. 
5. If you could do the unit again, what else would you be interested in learning?
6. What did you learn during the unit?
7. Reflect on the unit: how did you feel at the start of the unit when you knew what you were going to be studying, and how do you feel now at the end, given what you have learnt?
Evaluative questions for teachers and students
1. Were the overall standards of the unit achieved?
2. Did students engage effectively with the unit topic?
3. Did students develop an understanding of the key concepts being taught?
4. Were students able to achieve the learning outcomes in all lessons?
5. Was the duration of the unit: too long/too short/just right?
6. Were the needs of students of varying learning abilities accounted for?
7. Did you feel that you had researched the topic sufficiently?
8. Did my content knowledge enable me to answer student questions appropriately?
9. What would you change in this unit for teaching in the future?
10. How well did the students achieve the outcomes outlined in the unit?
11. Did the scaffolding of the lessons achieve maximum learning from the students?
12. Did the lessons flow from one to the next, at a pace that students could follow?
13. Did the activities in each lesson assist students’ awareness of the topic?
14. Were the activities and experiments in each lesson appropriate for this year level?
15. Were the resources appropriate for the unit of work?
16. How well were other KLAs integrated within the unit?
17. Were the students receptive to combining the content from this unit with other KLAs such as mathematics or drama?
18. How effective were the teaching strategies?
19. What teaching strategies worked and why?
20. How did the students respond to the learning approaches?
21. Which learning approaches worked best? How and Why?
22. Which learning approaches worked least? Why?  How could they be developed in the future?
23. Were the discussions and conversations substantive and meaningful, and did they promote conceptual knowledge?
24. What was the level of student engagement in the lessons?
25. In what areas was the lesson problematic and why?
26. How has this lesson informed my approach to teaching science?  
27. What will I do differently in the future?
28. How did I feel when students were engaged and interested in the science topic?
29. Were the questioning techniques used in the lessons effective, and did they develop higher-order thinking?
30. Did the activities provide opportunities for the students to demonstrate core learning outcomes?
31. How was the overall flow of the unit?
32. Was the interactive approach to teaching effective, and did it promote more student engagement?
33. Were the assessment practices fair and varied, and did they link directly to the syllabus outcomes?
34. Were there some difficulties through the unit development? How were these problems and issues addressed?
35. Did the teaching strategies (e.g. group work, portfolios, Bybee’s 5Es) help students achieve at the highest level?
36. What teaching strategies were effective, what is the evidence?
37. What did I learn by teaching this unit?
38. What did I learn about myself as a teacher by teaching this unit?
39. How has teaching this unit changed my teaching pedagogy?
40. What teaching approaches and strategies were effective, what is the evidence?
41. Did the outcome statements reflect the targeted learning outcomes?
42. Were the indicators suitable and did they reflect the targeted learning outcomes and the key concepts?
43. What teaching approaches and strategies were not effective?  Why? How can I change this next time?
44. How did I manage student behaviour? 
45. In the future, how could I approach teaching this science topic differently?
Teacher self-evaluation
Strongly agree
Agree
Undecided
Disagree
Strongly disagree
PLANNING OF UNIT
The topic of the unit was relevant to the students.
Lessons were sequenced to allow students to learn most effectively.
Each lesson provided at least one key concept for students to learn.
The experiments and activities were relevant and engaging for students.
The worksheets were well structured and encouraged higher-order thinking.
Activities allowed for all students to participate and catered for all abilities.
Each lesson had an assessment component.
TEACHING OF UNIT
I included all students at all times.
I ensured all students were engaged with the topic.
I asked questions that encouraged higher-order thinking.
I facilitated meaningful discussions.
I encouraged hands-on learning wherever possible.
I appeared enthusiastic at all times.
I provided the students with the content knowledge to complete activities.
I assessed all students fairly with a range of different activities throughout the unit.



27.2  ATTRIBUTES OF A GOOD TEST
The three main attributes of a good test are validity, reliability and usability.
Validity
Validity: It is the most important characteristic of a good test. Validity of a test is the extent to which it measures what it attempts to measure. That is to say, a test should conform to the objectives of testing. For example, in an English Language Test where the purpose of testing is to measure the students' ability to manipulate language structures, the following test item will not be valid:
Name the part of speech of each underlined word and also state its kind: His old typewriter should be in a museum.
This item is invalid because it is testing the students' knowledge about the language and not their ability to manipulate structures.
Though there are many types of validity, for a classroom teacher it is enough to know about the following three types:
1.  Face Validity
Face validity means that a test, even on a simple inspection, should look valid. For example, in a language test the following item is totally invalid, as it does not test language but the computation skill of the students.
The train starts from New Delhi at 8:10 hours and reaches Kanpur at 14:30 hours. How much time does the train take to reach Kanpur from New Delhi?
To establish face validity of a test, the examiner or the teacher should go through the test and examine its content carefully.
2.  Content Validity
Content validity is very important in an achievement test, as an achievement test tries to measure some specific skills or abilities through some specific content. To obtain content validity it is necessary that all the important areas of the course content are represented in the test and also that the test covers all the instructional objectives. In other words, the test should contain questions on every important area of the content in appropriate proportion, and the questions should be framed in such a way that all the objectives of that course are tested properly. Content validity can also be ensured by analyzing the course content and the instructional objectives to be achieved through it, and by relating the test items to both of these.
3.  Empirical Validity
Empirical validity is also known as statistical validity or criterion-related validity. To ensure it, a criterion is taken (which may be a standardized test, another teacher's ratings on a class test, students' scores on a previous test, or the students' grades on a subsequent final examination, etc.) and the scores of students are correlated with their scores on the criterion test. If the scores correlate positively, the test may be said to have empirical validity. Empirical validity is important because it shows statistically that a test is valid, i.e., that it measures well what it intends to measure.
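The correlation at the heart of empirical validity can be sketched as follows. This is a minimal illustration: the score lists are hypothetical, and the Pearson product-moment coefficient is one common (though not the only) choice of correlation statistic.

```python
# Empirical validity sketch: correlate students' scores on a new test
# with their scores on a criterion (e.g. a previous standardized test).
# All scores below are hypothetical.
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

new_test  = [32, 41, 28, 45, 36, 39]   # scores on the test being validated
criterion = [35, 44, 30, 47, 33, 40]   # scores on the criterion measure

r = pearson_r(new_test, criterion)
print(round(r, 2))  # a high positive r supports empirical validity
```

A coefficient near +1 indicates that students rank similarly on both measures; a value near zero would suggest the new test is not measuring what the criterion measures.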
Reliability
It refers to the consistency with which a question paper measures the achievement of students. In other words, if the test is to be reliable, the chance errors must be zero.
Unreliability occurs at two stages:
1. Firstly, at the level of the examinee, when s/he is not able to understand and interpret the question properly. This may be due to vagueness in the language of the question or due to some other reason. This can be removed if the questions are pointed and free from ambiguity.
2. Secondly, at the level of examiners. In the absence of a standard marking scheme, examiners are free to interpret and mark the questions in their own way. This contributes greatly to unreliability. A detailed marking scheme improves the reliability of the question paper. Objective-type and very short answer type questions are more reliable than essay-type questions. Thus, by including these questions and also by increasing the total number of questions in a question paper, reliability can be increased.
Usability
Usability or practicability is the third characteristic of a good test. There are a number of practical factors to be considered while preparing or selecting a test for use.
The first thing to be kept in mind is that the test should be of such a length that it can be administered within the stipulated time. If the test is too long or too short, it may not be practical to use as a classroom test.
Secondly, it is to be seen that the test is easy to administer and that clear-cut directions are provided in the test so that the testees as well as the test administrators can perform their tasks with efficiency.
Moreover, the facilities available for administration should also be kept in view; in the case of oral tests, tape recorders may be required. If a teacher does not have the facility of a tape recorder, s/he should not take up a test requiring the use of one.
Thirdly, scorability is also to be considered while using a test. When a large number of students are involved, a test which can be scored quickly (preferably by machine) is to be selected, but when only one class is to be tested, perhaps a test consisting of subjective questions may also be used.
27.3  STEPS OF TEST CONSTRUCTION
Once the teacher or the test constructor is aware of the characteristics that a good test must possess, s/he can proceed to construct a test, which may be either a unit test or a full-fledged question paper covering all the aspects of the syllabus. Whether the test is a unit test for use in classroom testing or a question paper for use in final examinations, the steps of test construction are the same, as follows:
1. Prepare a Design
The first step in preparing a test is to construct a design. A test is not merely a collection of assorted questions. To be of any effective use, it has to be planned in advance, keeping in view the objectives and the content of the course and the forms of questions to be used for testing these. For this, weightage to different objectives, different areas of content and different forms of questions are to be decided, along with the scheme of options and sections; these dimensions are known as the design of a test.
a.  Weightage to Objectives
To make a test valid, it is necessary to analyze the objectives of the course and decide which objectives are to be tested and in what proportions. For this, marks are allotted to each objective to be tested according to its importance. In English language testing the three major objectives are knowledge of the elements of language, comprehension and expression. The weightages to all three objectives may be decided in percentages. For example, for a test of 50 marks the following weightages may be decided.
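The conversion from percentage weightages to marks can be sketched as below. The weightage table itself is not reproduced in the text, so the percentage split used here is an assumed example, not the original figures.

```python
# Converting percentage weightages to marks for a 50-mark test.
# The percentage split across the three objectives is hypothetical.
TOTAL_MARKS = 50
weightage = {"Knowledge": 20, "Comprehension": 40, "Expression": 40}  # percent

marks = {obj: TOTAL_MARKS * pct // 100 for obj, pct in weightage.items()}
print(marks)  # → {'Knowledge': 10, 'Comprehension': 20, 'Expression': 20}
assert sum(marks.values()) == TOTAL_MARKS  # the split must use all 50 marks
```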
b.  Weightage to different areas of Content
It is necessary to analyze the syllabus and allot weightages to different areas of content. This is again done to ensure the validity of the test. A hypothetical example is given below for an English language test, showing weightages to content units for a class XI test.
c.  Weightage to different Forms of Questions
After analyzing the objectives and the content, it is to be seen how they are to be tested. A particular objective and content can be tested more appropriately by a particular form of question. So, different forms of questions are to be included in the test for testing different objectives and contents. For this, the number of different types of questions to be included in the test and the marks carried by each of them are decided. This takes care of the reliability of the test.
As an illustration, hypothetical weightages to different forms of questions in our 50-mark question paper for class XI are given below:
d.   Scheme of Sections
The design of a question paper may also indicate the scheme of sections for the paper. For example, a question paper may consist of both multiple choice questions and supply type questions. Such a test may have two sections, one consisting of multiple choice questions and the other consisting of supply type questions like essay type, short answer and very short answer type questions. In case the examiner wants, the question paper can also be divided into sections area-wise, like one section for reading comprehension, another for writing tasks, a third for grammar and so on. If the multiple choice questions are not substantial in number, there is no need to keep a separate section.
e.  Scheme of Options
The design may indicate the pattern of options, i.e., the complete elimination of overall options or retention of internal options within limits. No options are to be provided in the case of multiple choice, short answer and very short answer questions; for essay type questions the teacher may like to provide internal options. While providing options, it may be kept in mind that the options are comparable in terms of objectives to be tested, the form of questions and the difficulty level of the questions. As far as possible, the major area of content should also be the same in the options.
While planning the paper, the difficulty level of the questions should vary so as to cater to all the students of the class and also to discriminate between high achievers and low achievers. The suggested percentage for easy and difficult questions is 20% each, whereas average questions can be 60%. The difficulty level of the test paper can be varied according to the level of the students. If the class has a large number of good students, then 25% to 30% difficult questions can be given.
2. Preparing a Blue Print
After deciding on the design of the test, the blue print is prepared. The blue print is a three-dimensional chart which shows the placement of each question in respect of the objective and the content area that it tests. It also indicates the marks carried by each question. It is useful to prepare a blue print so that the test maker knows which question will test which objective and which content unit, and how many marks it will carry. Without a blue print, only the weightages are decided for objectives, content areas and types of questions. The blue print concretizes the design in operational terms, and all the dimensions of a question (i.e. its objective, its form, the content area it would cover and the marks allotted to it) become clear to the test maker.
There is no set  procedure for  preparing  a  blue print.  However, the following sequential steps would help in preparing a good blue print.
Transfer the decisions regarding weightages to objectives (Knowledge, Comprehension and Expression) on the given proforma.
Transfer the weightages already decided for different content units. For this, list the content units under the content areas in the column at the left-hand side and the marks under the "Total" column at the right-hand side.
Place the essay type questions first in the blue print. Place them under the objectives which you want to test through these questions.  The marks of the questions may be shown in the column under the objectives and the number of questions may be given in brackets.
If in a question, marks are to be split between two objectives indicate it with asterisks and a dotted line as shown in the example.
After placing the essay type questions, place the short answer type questions under the objectives and beside the content unit that you want to test through them.
Place the very short answer type questions in a similar way.
Place the multiple choice questions in the same way: marks outside the bracket, number of questions inside the bracket.
Calculate the subtotals of all the questions under all the objectives.
Calculate the  totals.  Your total should tally  with  the weightages  of objectives and content units that you had already marked on the blue print. 
Fill in the summary of types of questions, Scheme of Sections and Scheme of Options.
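The tallying step described above can be sketched as a small data structure: marks are placed by (content unit, objective), and the row and column totals are checked against the decided weightages. All unit names and figures here are hypothetical.

```python
# A blue print sketch: marks placed by (content unit, objective).
# All names and figures are hypothetical.
blueprint = {
    "Prose":       {"Knowledge": 4, "Comprehension": 8, "Expression": 3},
    "Grammar":     {"Knowledge": 6, "Comprehension": 4, "Expression": 0},
    "Composition": {"Knowledge": 0, "Comprehension": 8, "Expression": 17},
}

# Column totals (per objective) must tally with the objective weightages.
objective_totals = {}
for row in blueprint.values():
    for obj, m in row.items():
        objective_totals[obj] = objective_totals.get(obj, 0) + m

# Row totals (per content unit) must tally with the content weightages.
content_totals = {unit: sum(row.values()) for unit, row in blueprint.items()}

# Both grand totals must equal the paper's total marks.
assert sum(objective_totals.values()) == sum(content_totals.values()) == 50
print(objective_totals)
print(content_totals)
```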
Prepare questions based on the blue print.
3. Guidelines for Setting a Good Question Paper
After the blue print is ready, questions are to be prepared according to the dimensions defined in the blue print. For example, if there are essay type questions to be prepared to test the writing skills, one letter and one report, and also a short answer question on writing a notice, the test constructor should prepare these three questions along with their options, which should be comparable in terms of objectives to be tested, content areas, forms of questions and the difficulty level.
While preparing questions it must be kept in mind that the question:
- is based on the specific objective of teaching as indicated in the blue print
- relates to the specific content area as per the blue print
- is written in the form required by the blue print and satisfies all the rules for framing that form of question
- is at the desired level of difficulty
- is written in clear, correct and precise language which is well within the comprehension of pupils
- clearly indicates the scope and length of the answer.
Another thing to be kept in view while writing questions is to prepare the answers simultaneously, because quite often the answers help in refining the questions.
4.  Assembling the Question Paper
After the questions are prepared, they are to be assembled in question paper form. For this, instructions are to be written. General instructions for the paper may be given on top, whereas instructions for specific questions may be given just before the questions.
The order of questions is also to be decided while assembling the question paper. Sometimes it is according to the forms of questions, i.e., objective type questions may be put first, then very short answer, short answer and essay type questions; or it may be according to the content, as in the case of a language question paper where we may have structure questions first, then questions on an unseen passage and then composition questions.
The assembling and editing of the question paper is important from the point of view of administration. For example, if the question paper is divided into two sections, one of which is to be collected within a specific time limit, clear instructions to do so should be mentioned, and the arrangement of questions should be such that both sections are easily demarcated.
5.   Preparing the Scoring Key and the Marking Scheme
A scoring key is to be prepared for objective-type questions and a marking scheme for the other questions.
The scoring key gives the letter of the correct answer and the marks carried by each question. The marking scheme gives the expected outline answer and the value points for each aspect of the answer.
Detailed instructions for marking are also worked out, e.g., in marking compositions. It is specified how many marks are to be deducted for spelling mistakes or structural mistakes, or, if the composition is to be graded, how it is to be done and on what basis.
The detailed marking scheme is necessary to ensure consistency and uniformity in scoring by different examiners. In other words, it ensures reliability of scoring.
6.  Preparing Question-wise Analysis
After the question paper and marking scheme are finished, it is desirable to prepare a question-wise analysis. This analysis helps in tallying the questions in the test with the blue print. It also enables us to know the strengths and weaknesses of the test better; e.g., through the analysis we can know how many topics of the syllabus have been covered, what the difficulty level of each question is, and what specifications are being tested by each question.
The analysis is done on the following points:
1. Number of the question.
2. Objective tested by the question.
3. Specification on which the question is based.
4. Topic covered.
5. Form of the question.
6. Marks allotted.
7. Approximate time required for answering.
8. Estimated difficulty level.
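The eight points above can be sketched as one record per question; tallies over the records can then be checked against the blue print. The entries themselves are hypothetical.

```python
# A question-wise analysis sketch: one record per question, covering the
# eight points listed above. All entries are hypothetical.
analysis = [
    {"number": 1, "objective": "Knowledge", "specification": "identifies parts of speech",
     "topic": "Grammar", "form": "Multiple choice", "marks": 1,
     "time_minutes": 2, "difficulty": "Easy"},
    {"number": 2, "objective": "Expression", "specification": "writes a formal letter",
     "topic": "Composition", "form": "Essay", "marks": 8,
     "time_minutes": 20, "difficulty": "Average"},
]

# Quick tallies to check against the blue print and the time allowed.
total_marks = sum(q["marks"] for q in analysis)
total_time = sum(q["time_minutes"] for q in analysis)
print(total_marks, total_time)
```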



                              DIAGNOSTIC TEACHING      
Diagnostic teaching is the “process of diagnosing student abilities, needs and objectives and prescribing requisite learning activities”.   Through diagnostic teaching, the teacher monitors the understanding and performance of students before teaching the lesson, while teaching, and after teaching the lesson. Diagnostic teaching can inform teachers of the effectiveness of their lessons with individuals, small groups of students, or whole classes, depending on the instruments used. Within a diagnostic teaching perspective, assessment and instruction are interacting and continuous processes, with assessment providing feedback to the teacher on the efficacy of prior instruction, and new instruction building on the learning that students demonstrate.
Teachers may evaluate student learning on the spot, or collect data at different points in time and compare progress over units of instruction. Moment-by-moment assessments allow teachers to tap into students’ developing understandings about reading and students’ use of strategic processing to understand and remember text, and enable teachers to correct misconceptions immediately. Observations recorded over time allow teachers to identify patterns of development and document learning gains.  Both “on-the-run” assessments and systematic records of teachers’ observations of students’ learning over time can supplement the more quantitative and summative assessments that the ministry or school mandates and are more likely than end-of-term assessments to develop teachers’ capacity to improve the quality and appropriateness of instruction.  
Diagnostic assessments are themselves educative for teachers. By introducing the concept of diagnostic teaching and the monitoring techniques to support such instruction, teachers will be better able to recognize reading as a developmental process and target instruction to meet the needs of individuals and groups. As students progress toward reading proficiency, they gain control over different components of the reading process. Yet not all students will be at the same level of proficiency or need the same instruction. Students progress through overlapping stages in a developmental sequence that leads to proficiency in reading. Starting with “visual-cue” word recognition, wherein students memorize the configuration of words, and moving to an increasing awareness of phonology and the way sounds map onto letters in an alphabetic language, students gradually consolidate their use of larger letter patterns to recognize words effortlessly and automatically. At the “automatic word recognition” stage, students are able to orally read text fluently, with speed and prosody. As students become fluent readers, they are likely to devote more attention to comprehension, routinely using background knowledge and strategic processing to understand and remember text. By the time students reach “reading proficiency,” their reading comprehension equals or surpasses what they can glean from listening to lecture presentations. Nonetheless, not all students progress through these stages at the same rate, and in any given classroom, there will be students who need different kinds of support from their teachers. For example, some students may be able to decode words but only slowly and with great effort. Others may be fluent word callers but lack vocabulary and the ability to read strategically for comprehension. Thus, students’ vocabulary, background knowledge, fluency, interest and motivation, as well as the ability to accurately identify words, all influence their reading comprehension.
Through professional development, teachers will be able to recognize the importance of the various components of the reading process and identify and use assessment and instruction to support the development of these components.  Teachers who participate in the diagnostic teaching workshops will have a more elaborated view of the reading process, beyond students’ ability to decode words and memorize text.  In particular, teachers will recognize the importance of automatic word recognition, that is, the ability to read high frequency words and phonetically regular words accurately and fluently, and the importance of strategic reading, that is, the ability to make inferences and monitor and repair understanding when reading different genres of text. 
The diagnostic teaching techniques presented in the workshops cover the entire continua of reading development. Teachers will be able to demonstrate how to assess and support students’ emerging reading behaviors, such as concepts about print and basic decoding, among the youngest or least experienced readers.   By using a fluency rating rubric, such as that developed for the National Assessment of Educational Progress, and determining students’ words correct per minute on an oral reading of a short passage, teachers also will be able to identify more experienced students’ fluency and put into practice activities to support automatic word recognition, such as dramatic readings or rereadings of text.  Teachers will demonstrate how to assess and instruct students in comprehension strategies before, during and after reading.  Before reading, teachers will assess whether students can preview, set a purpose for reading, and bring prior knowledge to bear on the topic of the reading.  During reading, teachers will assess students’ ability to develop predictions and questions, and monitor text understanding.  After reading, teachers will assess whether students can remember or summarize what they read.  Teachers will learn as well to analyze and interpret non-traditional ways of processing and responding to text, such as verbal protocols and drawings that make visible the thinking of students for teachers to evaluate.
As teachers learn to administer and interpret these diagnostic assessments, they themselves will develop a more elaborated understanding of the reading process.  In order to assess comprehension of narrative text and the adequacy of students’ retellings, for example, teachers must themselves learn to identify the story elements of character, plot, setting, problem and resolution.  Likewise, in order to evaluate whether students are able to navigate nonfiction texts, teachers first must be able to recognize the organizational patterns in particular informational texts, for example, cause and effect, or comparison and contrast, and then identify and teach the comprehension strategies that students are not using.


GRAPHICAL REPRESENTATION OF DATA
The statistical data may be presented in a more attractive form, appealing to the eye, with the help of some graphic aids, i.e. pictures and graphs. Such presentation carries a lot of communication power. A mere glimpse of the pictures and graphs may enable the viewer to have an immediate and meaningful grasp of a large amount of data.
Ungrouped data may be represented through a bar diagram, pie diagram, pictograph and line graph.
· Bar graph represents the data on the graph paper in the form of vertical or horizontal bars.
Graphical representation of data helps in faster and easier interpretation of data.
A bar graph uses bars or rectangles of the same width but different heights to represent different values of data.
In a bar graph:
  1. The bars have equal gaps between them.
  2. The width of the bars does not matter.
  3. The height of the bars represents the different values of the variable.
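The properties above can be sketched with a minimal text-based bar graph: bars of equal width, lengths proportional to the values, one gap between each. The data set is hypothetical.

```python
# A minimal text sketch of a bar graph: equal-width bars whose
# lengths are proportional to the data values. Data are hypothetical.
data = {"Mon": 3, "Tue": 7, "Wed": 5, "Thu": 2}

rows = [f"{label} | {'#' * value} {value}" for label, value in data.items()]
print("\n".join(rows))
```

In a real report the same data would normally be drawn with a plotting library, but the proportionality of bar length to value is the essential property either way.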
   In a pie diagram, the data is represented by a circle of 360 degrees divided into parts, each part representing an amount of data converted into an angle. The total frequency is equated to 360 degrees, and then the angles corresponding to the component parts are calculated.
Pie charts are useful to compare different parts of a whole amount.  They are often used to present financial information.  E.g. A company's expenditure can be shown to be the sum of its parts including different expense categories such as salaries, borrowing interest, taxation and general running costs (i.e. rent, electricity, heating etc).
A pie chart is a circular chart in which the circle is divided into sectors.  Each sector visually represents an item in a data set to match the amount of the item as a percentage or fraction of the total data set.
A family's weekly expenditure on its house mortgage, food and fuel is as follows:
Draw a pie chart to display the information.
Solution:
We can find what percentage of the total expenditure each item equals.
Percentage of weekly expenditure on:
To draw a pie chart, divide the circle into 100 percentage parts.  Then allocate the number of percentage parts required for each item.
Note:
  • It is simple to read a pie chart.  Just look at the required sector representing an item (or category) and read off the value.  For example, the weekly expenditure of the family on food is 37.5% of the total expenditure measured.
  • A pie chart is used to compare the different parts that make up a whole amount.
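The share-to-angle conversion can be sketched as follows. The expenditure table itself is not reproduced in the text, so the amounts below are assumed (chosen so that food works out to 37.5% of the total, matching the reading example above).

```python
# Converting a family's weekly expenditure into pie-chart sectors:
# each item's share of the total is mapped onto 360 degrees.
# The amounts are hypothetical.
expenditure = {"Mortgage": 300, "Food": 225, "Fuel": 75}
total = sum(expenditure.values())

percentages = {item: amount / total * 100 for item, amount in expenditure.items()}
angles = {item: amount / total * 360 for item, amount in expenditure.items()}

print(percentages)  # Food comes out at 37.5% of the total
print(angles)       # the sector angles sum to 360 degrees
assert sum(angles.values()) == 360
```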
 In pictograms, the data is represented by means of picture figures appropriately designed in proportion to the numerical data.
· Line graphs represent the data concerning one variable on the horizontal and other variable on the vertical axis of the graph paper.
A line graph is often used to represent a set of data values in which a quantity varies with time.  These graphs are useful for finding trends.  That is, finding a general pattern in data sets including temperature, sales, employment, company profit or cost over a period of time.
A cylinder of liquid was heated.  Its temperature was recorded at ten-minute intervals as shown in the following table.
a.  Draw a line graph to represent this information.
b.  Estimate the temperature of the cylinder after 25 minutes of heating.
Solution:
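Reading a value off the line graph between two recorded points amounts to linear interpolation. The temperature table is not reproduced in the text, so the readings below are hypothetical stand-ins.

```python
# Estimating a value from a line graph by linear interpolation.
# The recorded temperatures are hypothetical stand-ins for the elided table.
times = [0, 10, 20, 30, 40]    # minutes
temps = [20, 30, 40, 50, 60]   # degrees Celsius

def interpolate(t):
    """Linearly interpolate the temperature at time t between recorded points."""
    for i in range(len(times) - 1):
        if times[i] <= t <= times[i + 1]:
            frac = (t - times[i]) / (times[i + 1] - times[i])
            return temps[i] + frac * (temps[i + 1] - temps[i])
    raise ValueError("t outside recorded range")

print(interpolate(25))  # halfway between the 20- and 30-minute readings
```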
Grouped data may be represented graphically by histogram, frequency polygon, cumulative frequency graph and cumulative frequency percentage curve or ogive.
· A histogram is essentially a bar graph of a frequency distribution. The actual class limits plotted on the x-axis represent the widths of the various bars, and the respective frequencies of these class intervals represent the heights of the bars.
· A frequency polygon is a line graph for the graphical representation of frequency distribution.
· A cumulative frequency graph represents the cumulative frequency distribution by plotting actual upper limits of the class intervals on the x axis and the respective cumulative frequencies of these class intervals on the y axis.
· Cumulative frequency percentage curve or ogive represents cumulative percentage frequency distribution by plotting upper limits of the class intervals on the x axis and the respective cumulative percentage frequencies of these class intervals on the y axis.
METHOD FOR CONSTRUCTING
A HISTOGRAM
1. In the construction of a histogram, the scores are taken in the form of actual class limits, such as 19.5-24.5 and 24.5-29.5, rather than the written class limits 20-24 and 25-29.
2. It is customary to take two extra class intervals, one below and one above the grouped intervals.
3. Now we take the actual lower limits of all the class intervals and plot them on the x axis. The lower limit of the lowest class interval is taken at the intersection of the x axis and the y axis.
4. Frequencies of the distribution are plotted on the y axis.
5. Each class interval with its specific frequency is represented by a separate rectangle. The base of each rectangle is the width of the class interval, and the height represents the frequency of that class interval.
6. Care should be taken to select appropriate units of representation along the x and y axes. Neither the x axis nor the y axis should be too short or too long.
In a histogram:
  1. The bars do not have gaps between them.
  2. The width of the bars is proportional to the class intervals of data.
  3. The height of the bars represents the different values of the variable.
  4. The area of each rectangle is proportional to its corresponding frequency.
The area of a histogram is equal to the area enclosed by its corresponding frequency polygon.
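The set-up described above can be sketched in code: written class limits such as 20-24 are converted to actual class limits 19.5-24.5, and each bar's area (base × height) is proportional to its frequency. The frequencies are hypothetical.

```python
# Histogram set-up sketch: convert written class limits to actual class
# limits and check the area-frequency proportionality. Frequencies are
# hypothetical.
classes = [(20, 24), (25, 29), (30, 34), (35, 39)]   # written class limits
frequencies = [4, 9, 7, 2]

# Actual class limits extend half a unit on each side, so adjacent bars touch.
actual_limits = [(lo - 0.5, hi + 0.5) for lo, hi in classes]
print(actual_limits)  # → [(19.5, 24.5), (24.5, 29.5), (29.5, 34.5), (34.5, 39.5)]

# Each rectangle: base = interval width, height = frequency, so with equal
# widths (5 here) the area is proportional to the frequency.
areas = [(hi - lo) * f for (lo, hi), f in zip(actual_limits, frequencies)]
assert areas == [5.0 * f for f in frequencies]
```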

METHOD FOR CONSTRUCTING
A FREQUENCY POLYGON
1. As in the histogram, two extra class intervals are taken, one above and the other below the given class intervals.
2. The mid-points of the class intervals are calculated.
3. The mid-points are plotted along the x axis and the corresponding frequencies along the y axis.
4. The points so plotted are joined by straight lines to give the frequency polygon.
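The four steps above can be sketched as follows; joining the resulting points in order gives the polygon. The frequencies are hypothetical.

```python
# Frequency polygon points: class-interval mid-points on the x axis against
# frequencies on the y axis, with an extra zero-frequency interval on each
# end so the polygon closes onto the axis. Frequencies are hypothetical.
classes = [(20, 24), (25, 29), (30, 34)]
frequencies = [4, 9, 7]

# Step 1: add the extra intervals below and above, each with frequency zero.
classes = [(15, 19)] + classes + [(35, 39)]
frequencies = [0] + frequencies + [0]

# Steps 2-3: mid-point of each interval paired with its frequency.
points = [((lo + hi) / 2, f) for (lo, hi), f in zip(classes, frequencies)]
print(points)  # step 4: join these in order with straight lines
```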



The midpoints of the tops of the corresponding rectangles in a histogram are joined together by straight lines. This gives a polygon, i.e. a figure with many angles. It is used when two or more sets of data are to be illustrated on the same diagram, such as death rates in smokers and non-smokers, or the birth and death rates of a population.
One way to form a frequency polygon is to connect the midpoints at the top of the bars of a histogram with line segments (or a smooth curve). Of course the midpoints themselves could easily be plotted without the histogram and be joined by line segments. Sometimes it is beneficial to show the histogram and frequency polygon together.
Unlike histograms, frequency polygons can be superimposed so as to compare several frequency distributions.
This is a histogram with an overlaid frequency polygon.

DIFFERENCE BETWEEN HISTOGRAM AND FREQUENCY POLYGON
A histogram is a bar graph, while a frequency polygon is a line graph. The frequency polygon is more useful and practical: in a frequency polygon it is easy to see the trends of the distribution, which we are unable to do in a histogram. A histogram, however, gives a very clear and accurate picture of the relative proportion of frequency from interval to interval.

METHOD FOR CONSTRUCTING
A CUMULATIVE FREQUENCY GRAPH
1. First of all, we calculate the actual upper and lower limits of the class intervals; i.e., if the class interval is 20-24, then the actual upper limit is 24.5 and the actual lower limit is 19.5.
2. We must now select a suitable scale as per the range of the class intervals, and plot the actual upper limits on the x axis and the respective cumulative frequencies on the y axis.
3. All the plotted points are then joined by successive straight lines, resulting in a line graph.
4. To fix the origin on the x axis, an extra class interval with cumulative frequency zero is taken.
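The steps above can be sketched as follows: a running total of the frequencies is plotted against the actual upper limits, with an extra zero point fixing the origin. The frequencies are hypothetical.

```python
# Building the points for a cumulative frequency graph (ogive):
# actual upper limits on the x axis against cumulative frequencies on the
# y axis, plus an extra interval below with cumulative frequency zero.
# Frequencies are hypothetical.
classes = [(20, 24), (25, 29), (30, 34), (35, 39)]
frequencies = [4, 9, 7, 2]

points = [(19.5, 0)]   # extra interval below: cumulative frequency zero
running = 0
for (lo, hi), f in zip(classes, frequencies):
    running += f                      # cumulative frequency so far
    points.append((hi + 0.5, running))  # plotted at the actual upper limit

print(points)  # → [(19.5, 0), (24.5, 4), (29.5, 13), (34.5, 20), (39.5, 22)]
```

Joining these points with straight lines gives the ogive; the last y value equals the total frequency.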
