Achievement test
A standardized test used to measure acquired learning in a specific subject area, such as reading or arithmetic, in contrast to an intelligence test, which measures potential ability or learning capacity.
An achievement test (noun) is an examination administered to determine how much a person has learned or how much knowledge a person has acquired.
An example of an achievement test is the Regents exam that students must take to prove they have learned math and other academic lessons.
Standardized test
• Diagnostic tests for preschoolers: after a screening, tests for diagnostic assessment are administered (if needed).
• Adaptive behavior measures assess possible learning, social or motor disabilities.
• Intelligence tests measure learning potential.
• Achievement tests determine instructional effectiveness.
National tests:
• compare student achievement across states to
address higher standards for education
• identify poor instructional areas, pinpoint
weaknesses in a state’s instructional program and facilitate improvements
State-developed tests are used by school districts to:
• determine each student’s progress
• provide diagnostic information on a child’s needs for future instruction
• describe student progress between and within schools
Standardized Tests and Teaching
• The Nature of Standardized Tests
• Steps in Standardized Test Design
The following steps ensure that the test achieves
its goals
and purposes:
· specify the purpose
· determine the format
· formulate objectives
· test construction: write, try out, and analyze
items
· assemble the final form
· administer the final test form
· establish norms and determine validity and reliability
· develop a test manual
Steps in Standardized Test Design: Specifying the
Purpose
A
clearly defined purpose is the framework for the construction of the test.
• It allows evaluation of the instrument when
design and construction steps are completed.
• It helps explain what the test will measure,
how the test results will be used, and who will take the test.
• It describes the population for whom the test
is intended.
• Steps in Standardized Test Design: Determining
Test Format
Format decisions are based on the purpose of the
test and the characteristics of the test takers:
• how test items will be presented and how the
test taker will respond
(e.g., tests designed for very young children are usually presented orally; paper-and-pencil tests are used for older students)
• given as a group test or as an individual test
• Steps in Standardized Test Design: Test
Construction
The test’s purpose guides:
• defining test objectives
• writing test items for each objective
• assembling experimental test forms
• Steps in Standardized Test Design: Developing
Experimental Forms
For a school achievement test:
• test content is delimited
• curriculum is analyzed to ensure that the test
will reflect the instructional programs
• teachers and curriculum experts review content
outlines and objectives for the test; and later they review test items
• writing, editing, trying out, and rewriting or
revising test items
• a preliminary test with selected test items is
assembled for trial with a sample of students
• Steps in Standardized Test Design: Developing
Experimental Forms
• Experimental test forms resemble the final
form
• Instructions are written for test
administration
• The sample of people who take the preliminary test is similar to the population that will take the final form of the test
• Steps in Standardized Test Design: Item Analysis in the Test Tryout Phase
Study each item’s:
• Difficulty level: how many test takers in the
trial group answered the question correctly
• Discrimination: the extent to which the
question distinguishes between test takers who did well or poorly; test takers
who did well on the test should be more successful on the item than those who
did poorly
• Grade progression in difficulty: for tests that are taken in different grades, in each successively higher grade a greater percentage of students should answer the item correctly
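These item statistics are easy to compute from a matrix of scored responses. Below is a minimal Python sketch with hypothetical trial data; the discrimination index used is the simple difference in proportion correct between the upper and lower halves of the score distribution, one common choice among several:

```python
import numpy as np

# Rows = test takers, columns = items; 1 = correct, 0 = incorrect (hypothetical trial data).
responses = np.array([
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
])

# Difficulty: proportion of the trial group answering each item correctly.
difficulty = responses.mean(axis=0)

# Discrimination: rank test takers by total score, then compare each item's
# proportion correct in the top half versus the bottom half.
totals = responses.sum(axis=1)
order = np.argsort(totals)
half = len(order) // 2
lower, upper = responses[order[:half]], responses[order[-half:]]
discrimination = upper.mean(axis=0) - lower.mean(axis=0)

print("difficulty:", difficulty)          # higher values = easier items
print("discrimination:", discrimination)  # positive = stronger students do better on the item
```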
• Steps in Standardized Test Design: The Final
Test Form Is Assembled
• During item analysis, test items are revised or eliminated
• Test items that measure each test objective
are selected for the test
• Alternative forms of one test must be equivalent in content and difficulty
Test directions are finalized, with instructions for test administrators about the testing environment and testing procedures, and instructions for test takers
• Steps in Standardized Test Design:
Standardizing the Test
The final test form is administered to another,
larger sample of test takers to acquire norm data.
Norms allow for comparisons of children’s
test performance with the performance of a reference or norming group.
• The norming group is chosen to reflect the
makeup of the population for whom the test is designed.
Evaluating Standardized Tests
Reliability – Are test scores stable, dependable
and relatively free from error?
Validity – Does the test measure what it is
supposed to measure?
Correlation
A correlation coefficient is a statistical measure of the relationship between two variables.
• Pearson correlation coefficient
• r = the Pearson coefficient
• r measures the amount that the two variables (X and Y) vary together (i.e., covary), taking into account how much they vary apart
• Pearson’s r is the most common correlation coefficient; there are others.
Computing the Pearson correlation coefficient
• Measuring X and Y together: the Sum of Products of deviations (SP)
– Definitional formula: SP = Σ(X − X̄)(Y − Ȳ)
– Computational formula: SP = ΣXY − (ΣX)(ΣY)/n
– n is the number of (X, Y) pairs
• Measuring X and Y individually (the denominator): compute the sums of squares for each variable
– SS_X = Σ(X − X̄)² = ΣX² − (ΣX)²/n, and similarly for SS_Y
• The equation for Pearson’s r:
r = SP / √(SS_X × SS_Y)
• Expanded form:
r = [ΣXY − (ΣX)(ΣY)/n] / √( [ΣX² − (ΣX)²/n] [ΣY² − (ΣY)²/n] )
Example
• What is the correlation between study time and test score?
• [The data table and the step-by-step calculations of SS, SP, and r for this example are not reproduced here.]
• Correlation coefficient interpretation: r ranges from −1 to +1; the sign gives the direction of the relationship and the magnitude its strength, with values near ±1 indicating a strong linear relationship and values near 0 a weak one.
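To make the SS/SP formulas concrete, here is a minimal Python sketch with hypothetical study-time and test-score data (the worked example’s original numbers are not reproduced in these notes); numpy’s built-in corrcoef serves as a cross-check:

```python
import numpy as np

# Hypothetical data: hours of study time (X) and test scores (Y).
X = np.array([1, 2, 4, 5], dtype=float)
Y = np.array([50, 60, 70, 80], dtype=float)
n = len(X)

# Sums of squares (computational formula): SS = sum(X^2) - (sum(X))^2 / n
SS_X = (X**2).sum() - X.sum()**2 / n
SS_Y = (Y**2).sum() - Y.sum()**2 / n

# Sum of products (computational formula): SP = sum(XY) - sum(X)*sum(Y) / n
SP = (X * Y).sum() - X.sum() * Y.sum() / n

r = SP / np.sqrt(SS_X * SS_Y)
print(round(r, 3))                         # Pearson's r from SS and SP
print(round(np.corrcoef(X, Y)[0, 1], 3))   # same value from numpy, as a check
```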
Reliability
• Test-retest: The extent to which a test yields
the same score when given to a student on two different occasions
• Alternate-forms: Two different forms of the same test are given on two different occasions to determine the consistency of the scores
• Split-half: Divide the test items into two
halves; scores are compared to determine test score consistency
• Standard Error of Measurement: an estimate of the amount of variation to be expected in test scores.
• If the reliability correlations are poor, the
standard error of measurement will be large.
• The larger the standard error of measurement,
the less reliable the test.
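These two bullets can be made precise with the standard psychometric formula, where s is the standard deviation of the test scores and r_xx the reliability coefficient:

\[ \mathrm{SEM} = s\sqrt{1 - r_{xx}} \]

For example, a test with s = 10 and reliability r_xx = 0.91 has SEM = 10 × √0.09 = 3 score points; as reliability falls, the standard error of measurement grows, matching the bullets above.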
• Variables That Affect the Standard Error of
Measurement
The following affect test reliability:
• Population sample size --the larger the
population sample, the more reliable the test
• Test length --longer tests are usually more
reliable because there are more test items, resulting in a better sample of
behaviors
• Range of test scores of the norming group --the wider the spread of scores, the more reliably the test can distinguish between good and poor students
• Types of Validity…
• Content: Test’s ability to sample the
content that is being measured
• Criterion-related:
1. Concurrent: The relation between a test’s
score and other available criteria
2. Predictive: The relationship between test’s
score and future performance
• Construct: The extent to which there is
evidence that a test measures a particular construct
• Considerations in Choosing and Evaluating
Tests
To select the best test to meet the developmental
characteristics of young children, the following need to be considered:
• the purpose of the testing
• the characteristics to be measured
• how the test results will be used
• the qualifications of those who will interpret
the scores and use the results
• any practical constraints: cost, time, ease of
scoring and use of test results
Reviewing the Test Manual
The test manual should include information that is adequate for users to
determine whether the test is practical and suitable for their purposes.
The manual should address the following:
• Purpose of the test
• Test design
• Establishment of validity and reliability
• Test administration and scoring
DIAGNOSTIC TEACHING
Diagnostic teaching is the “process of
diagnosing student abilities, needs and objectives and prescribing requisite
learning activities”. Through diagnostic teaching, the teacher
monitors the understanding and performance of students before teaching the lesson,
while teaching, and after teaching the lesson. Diagnostic teaching can inform
teachers of the effectiveness of their lessons with individuals, small groups
of students, or whole classes, depending on the instruments used. Within a
diagnostic teaching perspective, assessment and instruction are interacting and
continuous processes, with assessment providing feedback to the teacher on the
efficacy of prior instruction, and new instruction building on the learning
that students demonstrate.
Teachers may evaluate
student learning on the spot, or collect data at different points in time and
compare progress over units of instruction. Moment-by-moment assessments allow
teachers to tap into students’ developing understandings about reading and
students’ use of strategic processing to understand and remember text, and
enable teachers to correct misconceptions immediately. Observations recorded
over time allow teachers to identify patterns of development and document
learning gains. Both “on-the-run” assessments and systematic records of
teachers’ observations of students’ learning over time can supplement the more
quantitative and summative assessments that the ministry or school mandates and
are more likely than end-of-term assessments to develop teachers’ capacity to
improve the quality and appropriateness of instruction.
Diagnostic assessments are
themselves educative for teachers. By introducing the concept of
diagnostic teaching and the monitoring techniques to support such instruction,
teachers will be better able to recognize reading as a developmental process
and target instruction to meet the needs of individuals and groups. As
students progress toward reading proficiency, they gain control over different
components of the reading process. Yet, not all students will be at the
same level of proficiency or need the same instruction. Students progress
through overlapping stages in a developmental sequence that leads to
proficiency in reading. Starting with “visual-cue” word recognition, wherein students memorize the configuration of words, and moving to an increasing awareness of phonology and the way sounds map onto letters in an alphabetic language, students gradually consolidate their use of larger letter patterns to recognize words effortlessly and automatically. At the “automatic word recognition”
stage, students are able to orally read text fluently, with speed and
prosody. As students become fluent readers, they are likely to devote
more attention to comprehension, routinely using background knowledge and strategic
processing to understand and remember text. By the time students reach
“reading proficiency,” their reading comprehension equals or surpasses what
they can glean from listening to lecture presentations. Nonetheless, not all students progress through these
stages at the same rate, and in any given classroom, there will be students who
need different kinds of support from their teachers. For example, some students
may be able to decode words but only slowly and with great effort. Others
may be fluent word callers but lack vocabulary and the ability to read
strategically for comprehension. Thus, students’ vocabulary, background
knowledge, fluency, interest and motivation, as well as the ability to
accurately identify words, all influence their reading comprehension.
Through professional
development, teachers will be able to recognize the importance of the various
components of the reading process and identify and use assessment and
instruction to support the development of these components. Teachers who
participate in the diagnostic teaching workshops will have a more elaborated
view of the reading process, beyond students’ ability to decode words and
memorize text. In particular, teachers will recognize the importance of automatic
word recognition, that is, the ability to read high frequency words and
phonetically regular words accurately and fluently, and the importance of
strategic reading, that is, the ability to make inferences and monitor and
repair understanding when reading different genres of text.
The diagnostic teaching
techniques presented in the workshops cover the entire continua of reading
development. Teachers will be able to demonstrate how to assess and support
students’ emerging reading behaviors, such as concepts about print and basic
decoding, among the youngest or least experienced readers. By using
a fluency rating rubric, such as that developed for the National Assessment of
Educational Progress, and determining students’ words correct per minute on an
oral reading of a short passage, teachers also will be able to identify more
experienced students’ fluency and put into practice activities to support
automatic word recognition, such as dramatic readings or rereadings of
text. Teachers will demonstrate how to assess and instruct students in
comprehension strategies before, during and after reading. Before
reading, teachers will assess whether students can preview, set a purpose for
reading, and bring prior knowledge to bear on the topic of the reading.
During reading, teachers will assess students’ ability to develop predictions
and questions, and monitor text understanding. After reading, teachers
will assess whether students can remember or summarize what they read.
Teachers will learn as well to analyze and interpret non-traditional ways of
processing and responding to text, such as verbal protocols and drawings that
make visible the thinking of students for teachers to evaluate.
As teachers learn to
administer and interpret these diagnostic assessments, they themselves will
develop a more elaborated understanding of the reading process. In order
to assess comprehension of narrative text and the adequacy of students’
retellings, for example, teachers must themselves learn to identify the story
elements of character, plot, setting, problem and resolution. Likewise,
in order to evaluate whether students are able to navigate nonfiction texts,
teachers first must be able to recognize the organizational patterns in
particular informational texts, for example, cause and effect, or comparison
and contrast, and then identify and teach the comprehension strategies that
students are not using.
Achievement tests are
exams that are designed to determine the degree of knowledge and
proficiency exhibited by an individual in a specific area or set of areas.
An achievement test is sometimes administered as part of the
acceptance process into an educational program.
In other applications,
the achievement test serves as a tool to measure current knowledge
levels for the purpose of placing students in an educational environment where
they have the chance to advance at a pace that is suitable for their abilities.
Achievement testing is used to identify students who are prepared to move on to more advanced courses of study or who need some type of remedial instruction. Using an achievement test format to measure the grade level of each student is not intended to reflect on the general intelligence of the individual. Rather, the purpose of the testing is to ensure each student is placed in a classroom situation that offers the best opportunity to learn and assimilate material in an organized fashion, preparing them to move on to more advanced material.
For example, a student who does not do well with basic mathematics on an achievement test is likely to be
placed in a remedial learning situation. Doing so provides the student with the
opportunity to master the basics before attempting to learn more advanced
mathematical concepts like algebra or geometry. At a later date, the student
will have the chance to take a second test; should the results indicate that
the student is sufficiently prepared to move on to something more complicated,
he or she can be reassigned to a more challenging course of study.
Individuals and groups that oppose the use of achievement tests claim that the exams are not structured in a manner that accounts for the general aptitude of each student, resulting in an overall learning environment that pigeonholes rather than nurtures each student and promotes productive learning.
It is not unusual for many school jurisdictions to make use of a
sample achievement test several weeks before administering the live
exam. The idea behind the achievement test practice run is to allow
students to get an idea of the general format of the exam and what type of
instructions apply to each section. While the specific questions used in the sample are different from those utilized in the live achievement test, they are usually close enough to provide the student with an idea of what to expect.
QUALITIES OF A GOOD TEST
A
good test should possess the following qualities.
• Objectivity
• Objective Basedness
• Comprehensiveness
• Validity
• Reliability
• Practicability
• Comparability
• Utility
Objectivity
• A test is said to be objective if it is free from personal biases in interpreting its scope as well as in scoring the responses.
• Objectivity of a test can be increased by using more objective-type test items and scoring the answers according to the model answers provided.
Objective Basedness
• The test should be based on pre-determined objectives.
• The test setter should have a definite idea about the objective behind each item.
Comprehensiveness
• The test should cover the whole syllabus.
• Due importance should be given to all the relevant learning material.
• The test should cover all the anticipated objectives.
Validity
• A test is said to be valid if it measures what it intends to measure.
• There are different types of validity:
– Operational validity
– Predictive validity
– Content validity
– Construct validity
• Operational Validity
– A test will have operational validity if the tasks required by the test are sufficient to evaluate the definite activities or qualities.
• Predictive Validity
– A test has predictive validity if scores on it predict future performance
• Content Validity
– If the items in the test constitute a representative sample of the total course content to be tested, the test can be said to have content validity.
• Construct Validity
– Construct validity involves explaining the test scores psychologically. A test is interpreted in terms of numerous research findings.
Reliability
• Reliability of a test refers to the degree of consistency with which it measures what it is intended to measure.
• A test may be reliable but need not be valid. This is because it may yield consistent scores, but these scores need not be representing what exactly we want to measure.
• A test with high validity has to be reliable also. (the scores will be consistent in both cases)
• Valid test is also a reliable test, but a reliable test may not be a valid one
Different methods for determining reliability
• Test-retest method
– A test is administered to the same group after a short interval. The scores are tabulated and the correlation is calculated. The higher the correlation, the more reliable the test.
• Split-half method
– The scores of the odd and even items are taken and the correlation between the two sets of scores determined.
• Parallel form method
– Reliability is determined using two equivalent forms of the same test content.
– The prepared tests are administered to the same group one after the other.
– The test forms should be identical with respect to the number of items, content, difficulty level, etc.
– The correlation between the two sets of scores obtained by the group in the two tests is determined.
– The higher the correlation, the more reliable the test.
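As an illustration of the split-half method described above, here is a minimal Python sketch with hypothetical item scores. The correlation between the two half-tests is conventionally stepped up to full test length with the Spearman-Brown formula, a standard correction that these notes do not name explicitly:

```python
import numpy as np

# Rows = students, columns = items; 1 = correct, 0 = incorrect (hypothetical data).
items = np.array([
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 1],
])

# Split-half: score the odd-numbered and even-numbered items separately.
odd_scores = items[:, 0::2].sum(axis=1)
even_scores = items[:, 1::2].sum(axis=1)

# Correlation between the two half-test scores.
r_half = np.corrcoef(odd_scores, even_scores)[0, 1]

# Spearman-Brown correction: estimated reliability of the full-length test.
r_full = 2 * r_half / (1 + r_half)
print(round(r_half, 3), round(r_full, 3))
```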
Discriminating Power
• Discriminating power of the test is its power to discriminate between the upper and lower groups who took the test.
• The test should contain questions of different difficulty levels.
Practicability
• Practicability of the test depends upon:
• Administrative ease
• Scoring ease
• Interpretative ease
• Economy
Comparability
• A test possesses comparability when scores resulting from its use can be interpreted in terms of a common base that has a natural or accepted meaning.
• There are two methods for establishing comparability:
– availability of equivalent (parallel) forms of the test
– availability of adequate norms
Utility
• A test has utility if it provides the test conditions that facilitate realization of the purpose for which it is meant.
Evaluating science
teaching and learning
“True genius resides in the capacity for evaluation of uncertain, hazardous, and conflicting information.” ~ Winston Churchill (1874-1965)
WHAT IS EVALUATION?
Evaluation in education is a reflective
judgement on teaching practices, students’ learning and the learning
environment. It is an on-going process that enables the teacher to reflect on
how well students are learning.
Evaluation can be viewed as a critical analysis
towards the improvement of teaching practices. There are numerous aspects of
teaching and learning that can be evaluated. Keep in mind that effective
teachers constantly seek ways to enhance their teaching. Critically analysing
teaching and learning provides a way forward. In teaching, evaluation can be
synonymous with reflection. Evaluating teaching and learning can assist teachers to identify ways to raise teaching standards and, hopefully, learning.
WHAT DO I EVALUATE?
You can evaluate anything within the teaching
and learning environment. Your planning, teaching, activities and assessments
need to be evaluated. You can evaluate your teaching approaches and strategies.
Students’ learning must be evaluated. You may want to evaluate the resources,
learning spaces, guest speakers, worksheets, and anything else that you think
may make a difference to either your teaching or the students’ learning. Basic
evaluation will include what worked, what didn’t work, and what needs to be
considered for improving future teaching practices.
Evaluating Teaching Plans and Practices
1. Plans
• Was sound lesson planning evident?
• Was the lesson outcome(s) stated?
• Were links to the syllabus outlined?
• Was the lesson appropriately timetabled and
timed?
• Was the prior knowledge of the students
considered?
• Were teaching strategies outlined?
• Was content knowledge evident?
• Were resources prepared?
• Were classroom management strategies apparent?
• Were student activities suitable and
appropriate?
• Were methods of assessing students’ learning
outlined?
2. Practices
• Was I confident in teaching this lesson?
• Was I enthusiastic about teaching science?
• Did I stimulate students’ interests in the
topic?
• Had I presented a well-designed lesson?
• Were my explanations clear and succinct?
• Did I ask a range of lower and higher-order
questions?
• Had I catered for the range of student
abilities?
• Did I hold the students’ attention when
teaching?
• Had I developed a good rapport with students?
• Did I use effective classroom management
strategies?
• Were consequences and rewards appropriate?
• Did I monitor students’ work?
• Had I displayed adequate science content
knowledge?
• Did I use appropriate terminology from the
science syllabus?
• Had I used sufficient hands-on materials?
• Did I allow students to record and communicate
their new knowledge?
• Had I concluded the lesson with key scientific
concepts?
• Was there a real-world connection to the
lesson?
Evaluation needs to occur frequently. Every
lesson should be evaluated in terms of what worked, what didn’t work, and
what could be improved for future practices. Experienced teachers will evaluate
as they proceed through a lesson and make adjustments accordingly. A
teaching plan is a guide only. It’s a vision of what could occur. When a
lesson is in action, with the varying personalities and abilities of students,
a plan may need to be changed. You will need to adapt to particular
circumstances. This adaptation can also be part of your evaluation.
This will assist you to understand a learning
environment. It will help you to devise other strategies for teaching in a
similar situation. The unexpected behaviour for a beginning teacher can very
well be expected by a more experienced teacher. The experienced teacher knows what to expect because of continual evaluation of circumstances: critical self-reflection.
WHO CAN EVALUATE?
The two key players who need to evaluate the
teaching and learning are the students and the teacher. Other potential
evaluators can include the principal, executive staff, parents and other
interested groups. Evaluation should be viewed as a positive action. The
outcome of evaluation should be improved teaching and learning. So the
immediate stakeholders (you and your students) must have a say in how students
learn.
Teacher
The classroom teacher should always evaluate
lessons and units. Teachers can focus their attention on themselves in
evaluations. A teacher needs to be honest in an evaluation of teaching and
learning. Pretending a problem doesn’t exist won’t make it go away.
Despite careful planning, not every lesson will be successful as there are many
things that could alter the direction of the lesson. Maybe there weren’t enough resources, or student interaction wasn’t as anticipated. Regardless,
a teacher learns from experiences and at the heart of this learning is
critical and open reflection. Highlighting what else may be required to
make the unit of work more successful and pinpointing areas that were
successful will assist you to teach future classes more effectively. It
is up to the teacher to decide what to include in an evaluation.
The example below provides evaluation criteria for a unit based around energy.
The teacher has decided to evaluate the
students’ learning of the key concepts and their engagement in the unit.
Although this evaluation is quite broad, responses to each criterion may
provide insight towards enhancing the unit for future classes.
Student
It is useful to have students evaluate the
teaching and learning environment, particularly as they are the focus of
the attention. Evaluation from students needs to be age appropriate.
Those in the early primary grades may have the evaluation read out to
them, and they would record their responses by colouring in a face or a
sign. They may be asked to circle pictures that represent lessons
that they liked. They may be asked to draw a picture of a lesson they
enjoyed. Example 11.4 shows a range of evaluation forms that could be used at
an early primary school level.
Student written evaluation
1. Did you enjoy our unit of work about
Fossils? YES/NO. Why or Why not?
2. What was your favourite part of the unit?
3. What was your least favourite part of the
unit?
4. Did you find anything really difficult or
really easy in the unit? Please explain here.
5. If you could do the unit again, what else
would you be interested in learning?
6. What did you learn during the unit?
7. Reflect on the unit, how did you feel at the
start of the unit when you knew what you were going to be studying, and how do
you now feel at the end given what you have learnt?
Evaluative questions
for teachers and students
1. Were the overall standards of the unit
achieved?
2. Did students engage effectively with the unit
topic?
3. Did students develop an understanding of the
key concepts being taught?
4. Were students able to achieve the learning
outcomes in all lessons?
5. Was the duration of the unit: too long/too
short/just right?
6. Were the needs of students of varying
learning abilities accounted for?
7. Did you feel that you had researched the
topic sufficiently?
8. Did my content knowledge enable me to answer
student questions appropriately?
9. What would you change in this unit for
teaching in the future?
10. How well did the students achieve the
outcomes outlined in the unit?
11. Did the scaffolding of the lessons achieve
maximum learning from the students?
12. Did the lessons flow from one to the next,
at a pace that students could follow?
13. Did the activities in each lesson assist
students’ awareness of the topic?
14. Were the activities and experiments in each
lesson appropriate for this year level?
15. Were the resources appropriate for the unit
of work?
16. How well was the integration of other KLAs
used within the unit?
17. Were the students receptive to combining the
content from this unit, with other KLA’s such as mathematics or drama?
18. How effective were the teaching strategies?
19. What teaching strategies worked and why?
20. How did the students respond to the learning
approaches?
21. Which learning approaches worked best? How
and Why?
22. Which learning approaches worked least?
Why? How could they be developed in the future?
23. Were the discussions and conversations substantive and meaningful, and did they promote conceptual knowledge?
24. What was the level of student engagement in
the lessons?
25. In what areas was the lesson problematic and
why?
26. How has this lesson informed my approach to
teaching science?
27. What will I do differently in the future?
28. How did I feel when students were engaged
and interested in the science topic?
29. Were the questioning techniques used in the lessons effective, and did they develop higher-order thinking?
30. Did the activities provide opportunities for
the students to demonstrate core learning outcomes?
31. How was the overall flow of the unit?
32. Was the interactive approach to teaching effective, and did it promote more student engagement?
33. Were the assessment practices fair and varied, and did they link directly to the syllabus outcomes?
34. Were there some difficulties through the
unit development? How were these problems and issues addressed?
35. Did the teaching strategies (e.g. group work, portfolios, Bybee’s 5Es) help students achieve at the highest level?
36. What teaching strategies were effective,
what is the evidence?
37. What did I learn by teaching this unit?
38. What did I learn about myself as a teacher
by teaching this unit?
39. How has teaching this unit changed my
teaching pedagogy?
40. What teaching approaches and strategies were
effective, what is the evidence?
41. Did the outcome statements reflect the
targeted learning outcomes?
42. Were the indicators suitable and did they
reflect the targeted learning outcomes and the key concepts?
43. What teaching approaches and strategies were
not effective? Why? How can I change this next time?
44. How did I manage student behaviour?
45. In the future, how could I approach teaching
this science topic differently?
Teacher self evaluation
Rate each statement: Strongly agree | Agree | Undecided | Disagree | Strongly disagree
PLANNING OF UNIT
The topic of the unit was relevant to the
students
Lessons were sequenced to allow students to
learn most effectively.
Each lesson provided at least one key concept
for students to learn.
The experiments and activities were relevant and
engaging for students.
The worksheets were well structured and
encouraged higher-order thinking.
Activities allowed for all students to
participate and catered for all abilities.
Each lesson had an assessment component.
TEACHING OF UNIT
I included all students at all times
I ensured all students were engaged with the
topic.
I asked questions that encouraged higher-order
thinking.
I facilitated meaningful discussions.
I encouraged hands-on learning wherever
possible.
I appeared enthusiastic at all times.
I provided the students with the content
knowledge to complete activities.
I assessed all students fairly with a range of different activities throughout the unit.
27.2 ATTRIBUTES OF A GOOD TEST
The three main attributes of a good test are
validity, reliability and usability.
Validity
Validity: It is the most important characteristic of a good test. The validity of a test is the extent to which it measures what it attempts to measure. That is to say, a test should conform to the objectives of testing. For example, in an English language test where the purpose of testing is to measure the students' ability to manipulate language structures, the following test item will not be valid:
Name the part of speech of each underlined word and also state its kind: His old typewriter should be in a museum.
This item is invalid because it is testing the students' knowledge about the language and not his/her ability to manipulate structures.
Though there are many types of validity, for a classroom teacher it is enough to know about the following three types:
1. Face Validity
Face validity means that a test, even on a simple inspection, should look valid. For example, in a language test the following item is totally invalid, as it does not test language but the computation skill of the students.
The train starts from New
Delhi at 8:10 hours and reaches Kanpur at
14:30 hours.
How much time does the train take to reach
Kanpur from New Delhi?
To establish face validity of a test, the
examiner or the teacher should go through the test and examine its content
carefully.
2. Content Validity
Content validity is very important in an achievement test, as an achievement test tries to measure some specific skills or abilities through some specific content. To obtain content validity it is necessary that all the important areas of the course content are represented in the test and also that the test covers all the instructional objectives. In other words, the test should contain questions on every important area of the content in appropriate proportion, and the questions should be framed in such a way that all the objectives of that course are tested properly. Content validity can also be ensured by analyzing the course content and the instructional objectives to be achieved through it, and by relating the test items to both.
3. Empirical Validity
Empirical validity is also known as statistical validity or criterion-related validity. To ensure it, a criterion is taken (which may be a standardized test, another teacher's ratings on a class test, students' scores on a previous test, or the students' grades on a subsequent final examination, etc.) and the scores of students are correlated with their scores on the criterion test. If the scores correlate positively, the test may be said to have empirical validity. Empirical validity is important because it shows statistically that a test is valid, i.e., it measures well what it intends to measure.
Reliability
It refers to the consistency with which a question paper measures the achievement of students. In other words, if the test is to be reliable, chance errors must be zero.
Unreliability occurs at two stages:
1. Firstly, at the level of the examinee, when s/he is not able to understand and interpret the question properly. This may be due to vagueness in the language of the question or some other reason. It can be removed if the questions are pointed and free from ambiguity.
2. Secondly, at the level of the examiners. In the absence of a standard marking scheme, examiners are free to interpret and mark the questions in their own way. This contributes greatly to unreliability. A detailed marking scheme improves the reliability of the question paper. Objective type and very short answer type questions are more reliable than essay-type questions.
Thus, by including these questions and also by increasing the total number of questions in a question paper, reliability can be increased.
Usability
Usability or practicability is the third characteristic of a good test. There are a number of practical factors that are to be considered while preparing or selecting a test for use.
The first thing to be kept in mind is that the test should be of such a length that it can be administered within the stipulated time. If the test is too long or too short, it may not be practical to use as a classroom test.
Secondly, it is to be seen that the test is easy to administer and that clear-cut directions are provided in the test, so that the testees as well as the test administrators can perform their tasks with efficiency.
Moreover, the facilities available for administration should also be kept in view; in the case of oral tests, for example, tape recorders may be required. If a teacher doesn't have the facility of a tape recorder, s/he should not take up a test requiring the use of one.
Thirdly, scorability is also to be considered while using a test. When a large number of students is involved, a test which can be scored quickly (preferably by machine) is to be selected; but when only one class is to be tested, perhaps a test consisting of subjective questions may also be used.
27.3 STEPS OF TEST CONSTRUCTION
Once the teacher or the test constructor is aware of the characteristics that a good test must possess, s/he can proceed to construct a test, which may be either a unit test or a full-fledged question paper covering all the aspects of the syllabus. Whether the test is a unit test for use in classroom testing or a question paper for use in final examinations, the steps of test construction are the same, which are as follows:
1. Prepare a Design
The first step in preparing a test is to construct a design. A test is not merely a collection of assorted questions. To be of any effective use, it has to be planned in advance, keeping in view the objectives and the content of the course and the forms of questions to be used for testing these. For this, weightages to different objectives, different areas of content, and different forms of questions are to be decided, along with the scheme of options and sections; these dimensions are what is known as the design of a test.
a. Weightage to Objectives
To make a test valid, it is necessary to analyze the objectives of the course and decide which objectives are to be tested and in what proportions. For this, marks are allotted to each objective to be tested according to its importance. In English language testing the three major objectives are knowledge of the elements of language, comprehension and expression. The weightages to all these three objectives may be decided in percentages. For example, for a test of 50 marks the following weightages may be decided.
b. Weightage to Different Areas of Content
It is necessary to analyze the syllabus and allot weightages to the different areas of the total content. This is again done to ensure the validity of the test. A hypothetical example is given below for an English language test showing weightages to content units for a class XI test.
c. Weightage to Different Forms of Questions
After analyzing the objectives and the content, it is to be seen how they are to be tested. A particular objective and content can be tested more appropriately by a particular form of questions. So, different forms of questions are to be included in the test for testing different objectives and contents. For this, the number of different types of questions to be included in the test and the marks carried by each of them are decided. This takes care of the reliability of the test. As an illustration, hypothetical weightage to different forms of questions in our 50-marks question paper for class XI is given below:
d. Scheme of Sections
The design of a question paper may also indicate the scheme of sections for the paper. For example, a question paper may consist of both multiple choice questions and supply type questions. Such a test may have two sections, one consisting of multiple choice questions and the other consisting of supply type questions like essay type, short answer and very short answer type questions. In case the examiner wants, the question paper can also be divided into sections area-wise, like one section for reading comprehension, another for writing tasks, a third for grammar and so on. If the multiple choice questions are not substantial in number, there is no need to keep a separate section.
e. Scheme of Options
The design may indicate the pattern of options, i.e., the complete elimination of overall options or the retention of internal options within limits. No options are to be provided in the case of multiple choice, short answer and very short answer questions; for essay type questions the teacher may like to provide internal options. While providing options, it may be kept in mind that the options are comparable in terms of the objectives to be tested, the form of questions and the difficulty level of the questions. As far as possible, the major area of content should also be the same in the options.
While planning the paper, it should be ensured that the difficulty level of the questions varies, so as to cater to all the students of the class and also to discriminate between high achievers and low achievers. The suggested percentage for easy and difficult questions is 20% each, whereas average questions can be 60%. The difficulty level of the test paper can be varied according to the level of the students. If the class has a large number of good students, then 25% to 30% difficult questions can be given.
2. Preparing a Blue Print
After deciding on the design of the test, the blue print is prepared. The blue print is a three-dimensional chart which shows the placement of each question in respect of the objective and the content area that it tests. It also indicates the marks carried by each question. It is useful to prepare a blue print so that the test maker knows which question will test which objective and which content unit, and how many marks it would carry. Without a blue print, only the weightages are decided for objectives, content areas and types of questions. The blue print concretizes the design in operational terms, and all the dimensions of a question (i.e., its objective, its form, the content area it would cover and the marks allotted to it) become clear to the test maker.
There is no set procedure for preparing a blue print. However, the following sequential steps would help in preparing a good blue print.
Transfer the decisions regarding weightages to objectives (Knowledge, Comprehension and Expression) onto the given proforma.
Transfer the weightages already decided for different content units. For this, list the content units under the content areas in the column given at the left hand side and the marks under the column of totals given at the right hand side.
Place the essay type questions first in the blue
print. Place them under the objectives which you want to test through these
questions. The marks of the questions may be shown in the column under
the objectives and the number of questions may be given in brackets.
If in a question, marks are to be split between
two objectives indicate it with asterisks and a dotted line as shown in the
example.
After placing the essay type questions, place
the short answer type questions under the objectives and beside the content
unit that you want to test through them.
Place the very short answer type questions in a
similar way.
Place the multiple choice questions in the same way - marks outside the bracket, number of questions inside the bracket.
Calculate the subtotals of all the questions
under all the objectives.
Calculate the totals. Your total
should tally with the weightages of objectives and content
units that you had already marked on the blue print.
Fill in the summary of types of questions,
Scheme of Sections and Scheme of Options.
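Because the blue print is essentially a two-way table of marks (content units by objectives), the tallying step above can be checked mechanically. The following minimal Python sketch uses hypothetical weightages for a 50-mark paper; all names and numbers are illustrative only:

```python
# Hypothetical blue print: marks per (content unit, objective) cell.
blueprint = {
    "Reading": {"Knowledge": 4, "Comprehension": 8, "Expression": 3},
    "Writing": {"Knowledge": 2, "Comprehension": 4, "Expression": 9},
    "Grammar": {"Knowledge": 9, "Comprehension": 8, "Expression": 3},
}
# Weightages already decided in the design (hypothetical, for a 50-mark paper).
objective_weightage = {"Knowledge": 15, "Comprehension": 20, "Expression": 15}
content_weightage = {"Reading": 15, "Writing": 15, "Grammar": 20}

# Column totals (per objective) and row totals (per content unit) must tally.
for obj, expected in objective_weightage.items():
    total = sum(row[obj] for row in blueprint.values())
    assert total == expected, f"{obj}: {total} != {expected}"
for unit, expected in content_weightage.items():
    total = sum(blueprint[unit].values())
    assert total == expected, f"{unit}: {total} != {expected}"
print("Blue print tallies with the design:", sum(content_weightage.values()), "marks")
```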
3. Preparing Questions Based on the Blue Print
Guidelines for Setting a Good Question Paper
After the blue print is ready, questions are to be prepared according to the dimensions defined in the blue print. For example, if there are essay type questions to be prepared to test the writing skills, say one letter and one report, and also a short answer question on writing a notice, the test constructor should prepare these three questions along with their options, which should be comparable in terms of the objectives to be tested, content areas, forms of questions and the difficulty level.
While preparing questions it must be kept in mind that each question:
- is based on the specific objective of teaching as indicated in the blue print
- relates to the specific content area as per the blue print
- is written in the form required by the blue print and satisfies all the rules for framing that form of question
- is at the desired level of difficulty
- is written in clear, correct and precise language which is well within the comprehension of pupils
- clearly indicates the scope and length of the answer.
Another thing to be kept in view
while writing questions is to prepare the answers simultaneously because
quite often the answers help in refining the questions.
4. Assembling the Question Paper
After the questions are prepared, they are to be assembled in question paper form. For this, instructions are to be written. General instructions for the paper may be given at the top, whereas instructions for specific questions may be given just before the questions.
The order of questions is also to be decided while assembling the question paper. Sometimes it is according to the forms of questions, i.e., objective type questions may be put first, then very short answer, short answer and essay type questions; or it may be according to the content, as in the case of a language question paper where we may have structure questions first, then questions on an unseen passage and then composition questions.
The assembling and editing of the question paper is important from the point of view of administration. For example, if the question paper is divided into two sections, one of which is to be collected within a specific time limit, clear instructions to do so should be mentioned, and the arrangement of questions should be such that both sections are easily demarcated.
5. Preparing the Scoring Key and the Marking Scheme
A scoring key is to be prepared for objective type questions and a marking scheme for other questions. The scoring key gives the letter of the correct answer and the marks carried by each question. The marking scheme gives the expected outline answer and the value points for each aspect of the answer.
Detailed instructions for marking are also worked out; e.g., in marking compositions it is specified how many marks are to be deducted for spelling mistakes or structural mistakes, or, if the composition is to be graded, how the grading is to be done and on what basis.
The detailed marking scheme is
necessary to ensure consistency and uniformity in scoring by different
examiners. In other words it ensures reliability of scoring.
6. Preparing Question-wise Analysis
After the question paper and marking scheme are finished, it is desirable to prepare a question-wise analysis. This analysis helps in tallying the questions in the test with the blue print. It also enables us to know the strengths and weaknesses of the test better; e.g., through the analysis we can know how many topics of the syllabus have been covered, what the difficulty level of each question is and what specifications are being tested by each question.
The analysis is done on the following points:
1. Number of the question.
2. Objective tested by the question.
3. Specification on which the question is based.
4. Topic covered.
5. Form of the question.
6. Marks allotted.
7. Approximate time required for answering.
8. Estimated difficulty level.
GRAPHICAL REPRESENTATION OF DATA
Statistical data may be presented in a more attractive form, appealing to the eye, with the help of graphic aids, i.e., pictures and graphs. Such presentation carries a lot of communication power. A mere glimpse of these pictures and graphs may enable the viewer to have an immediate and meaningful grasp of a large amount of data.
Ungrouped data may be represented
through a bar diagram, pie diagram, pictograph and line graph.
· Bar graph represents the data on the graph paper in the
form of vertical or horizontal bars.
Graphical representation of data helps
in faster and easier interpretation
of data.
A bar graph uses bars or rectangles of the same width but different heights to represent different values of data.
In a bar graph:
- The bars have equal gaps between
them.
- The width of the bars does not
matter.
- The height of the bars represents
the different values of the variable.
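A minimal matplotlib sketch of these rules, using hypothetical data: the bars have equal widths and equal gaps, and only their heights carry the values:

```python
import matplotlib.pyplot as plt

# Hypothetical data: number of students choosing each favourite subject.
subjects = ["Maths", "Science", "English", "Art"]
counts = [12, 9, 15, 6]

plt.bar(subjects, counts, width=0.6, color="steelblue")  # equal widths, equal gaps
plt.xlabel("Subject")
plt.ylabel("Number of students")  # bar height carries the value
plt.title("Favourite subjects (bar graph)")
plt.show()
```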
In a pie diagram, the data is represented by a circle of 360 degrees divided into parts, each representing an amount of data converted into an angle. The total frequency value is equated to 360 degrees and then the angles corresponding to the component parts are calculated.
Pie charts are useful to compare different parts of a whole amount. They are often used to present financial information. E.g. a company's expenditure can be shown to be the sum of its parts, including different expense categories such as salaries, borrowing interest, taxation and general running costs (i.e. rent, electricity, heating etc.).
A pie chart is a circular chart in which the circle is
divided into sectors.
Each sector visually represents an item in a data set to match the amount of
the item as a percentage or fraction of
the total data set.
Example: A family's weekly expenditure on its house mortgage, food and fuel is as follows. [The expenditure table is not reproduced here.]
Draw a pie chart to display the information.
Solution:
We can find what percentage of the total expenditure each item equals. To draw a pie chart, divide the circle into 100 percentage parts, then allocate the number of percentage parts required for each item.
Note:
- It is simple to
read a pie chart. Just look at the required sector representing an
item (or category) and read off the value. For example, the weekly
expenditure of the family on food is 37.5% of the total expenditure
measured.
- A pie chart is
used to compare the different parts that make up a whole amount.
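The following minimal matplotlib sketch draws such a pie chart. The family's actual expenditure table is not reproduced in these notes, so the amounts are hypothetical, chosen only so that food comes to the 37.5% quoted in the note above; plt.pie converts each amount into its share of the 360-degree circle:

```python
import matplotlib.pyplot as plt

# Hypothetical weekly expenditure, chosen so that food works out to 37.5%
# of the total (matching the figure quoted in the note above); the
# original table is not reproduced in these notes.
items = ["Mortgage", "Food", "Fuel"]
amounts = [300, 225, 75]  # dollars per week

# Each amount becomes a sector whose angle is its share of 360 degrees.
plt.pie(amounts, labels=items, autopct="%1.1f%%", startangle=90)
plt.title("Family weekly expenditure (pie chart)")
plt.axis("equal")  # keep the pie circular
plt.show()
```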
In pictograms, the data is represented by means of picture figures appropriately
designed in proportion to the numerical data.
· Line graphs represent the data concerning one variable on the horizontal axis and the other variable on the vertical axis of the graph paper.
A line graph is often used to represent a set of data values in which a quantity varies with time. These graphs are useful for finding trends, that is, a general pattern in data sets such as temperature, sales, employment, company profit or cost over a period of time.
Example: A cylinder of liquid was heated. Its temperature was recorded at ten-minute intervals as shown in the following table. [The data table and solution are not reproduced here.]
a. Draw a line graph to represent this information.
b. Estimate the temperature of the cylinder after 25 minutes of heating.
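A minimal sketch of this exercise with hypothetical readings, since the original table is not reproduced; part (b) is answered by linear interpolation between the 20- and 30-minute points:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical readings: temperature recorded at ten-minute intervals.
time_min = np.array([0, 10, 20, 30, 40])
temp_c = np.array([20, 35, 48, 58, 65])

plt.plot(time_min, temp_c, marker="o")
plt.xlabel("Time (minutes)")
plt.ylabel("Temperature (°C)")
plt.title("Temperature of heated liquid (line graph)")
plt.show()

# Part (b): estimate the temperature at 25 minutes by linear interpolation
# between the 20- and 30-minute readings.
print(np.interp(25, time_min, temp_c))  # 53.0 with these hypothetical values
```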
Grouped data may be represented
graphically by histogram, frequency polygon, cumulative frequency graph and
cumulative frequency percentage curve or ogive.
· A histogram is essentially a bar graph of a frequency distribution. The actual class limits plotted on the x-axis represent the widths of the various bars, and the respective frequencies of the class intervals represent the heights of the bars.
· A frequency polygon is a line graph for the graphical
representation of frequency distribution.
· A cumulative frequency graph represents the cumulative frequency
distribution by plotting actual upper limits of the class intervals on the x
axis and the respective cumulative frequencies of these class intervals on the
y axis.
· Cumulative
frequency percentage curve or ogive represents cumulative percentage frequency distribution by plotting upper
limits of the class intervals on the x axis and the respective cumulative
percentage frequencies of these class intervals on the y axis.
METHOD FOR CONSTRUCTING A HISTOGRAM
1. The scores in the form of actual class limits, such as 19.5-24.5, 24.5-29.5 and so on, are used in the construction of a histogram rather than the written class limits such as 20-24, 25-29.
2. It is customary to take two extra class intervals, one below and one above the grouped intervals.
3. Now we take the actual lower limits of all the class intervals and plot them on the x axis. The lower limit of the lowest class interval is taken at the intersection of the x axis and the y axis.
4. Frequencies of the distribution are plotted on the y axis.
5. Each class interval with its specific frequency is represented by a separate rectangle. The base of each rectangle is the width of the class interval, and the height represents the frequency of that class interval.
6. Care should be taken to select appropriate units of representation along the x and y axes. Both the x axis and the y axis must not be too short or too long.
In a histogram:
- The bars do not have gaps
between them.
- The width of the bars is proportional to the class intervals of data.
- The height of the bars represents the different
values of the variable.
- The area of each
rectangle is proportional to its corresponding frequency.
The area of a histogram is equal to the area enclosed by its corresponding frequency polygon.
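A minimal matplotlib sketch of this construction, using hypothetical scores and the actual class limits 19.5, 24.5, 29.5, ... as bin edges so that adjacent bars touch:

```python
import matplotlib.pyplot as plt

# Hypothetical test scores grouped into the intervals 20-24, 25-29, ...
scores = [21, 23, 24, 26, 27, 27, 28, 31, 32, 33, 35, 36, 38, 41, 43]

# Actual class limits serve as the bin edges, so the bars have no gaps.
edges = [19.5, 24.5, 29.5, 34.5, 39.5, 44.5]

plt.hist(scores, bins=edges, edgecolor="black")
plt.xlabel("Score (actual class limits)")
plt.ylabel("Frequency")  # bar height = frequency of the class interval
plt.title("Histogram of scores")
plt.show()
```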
METHOD FOR CONSTRUCTING A FREQUENCY POLYGON
1. As in a histogram, two extra class intervals are taken, one above and one below the given class intervals.
2. The mid-points of the class intervals are calculated.
3. The mid-points are plotted along the x axis and the corresponding frequencies are plotted along the y axis.
4. The various points given by the plotting are joined by lines to give the frequency polygon.
The midpoints of the intervals of the corresponding rectangles in a histogram are joined together by straight lines. This gives a polygon, i.e. a figure with many angles. It is used when two or more sets of data are to be illustrated on the same diagram, such as death rates in smokers and non-smokers, or birth and death rates of a population, etc.
One way to form a frequency polygon is to connect the midpoints at
the top of the bars of a histogram with line segments (or a smooth curve). Of
course the midpoints themselves could easily be plotted without the histogram
and be joined by line segments. Sometimes it is beneficial to show the
histogram and frequency polygon together.
Unlike histograms, frequency polygons can be superimposed so as to
compare several frequency distributions.
[Figure: a histogram with an overlaid frequency polygon, formed by joining the midpoints of the tops of the histogram's rectangles.]
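A minimal sketch of a frequency polygon overlaid on its histogram, with hypothetical grouped data; the polygon joins the class-interval midpoints and is anchored at an extra zero-frequency interval on each side, as described above:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical grouped data: actual class limits and frequencies.
edges = np.array([19.5, 24.5, 29.5, 34.5, 39.5, 44.5])
freqs = np.array([3, 4, 5, 2, 1])

# Histogram drawn from the grouped frequencies.
plt.bar(edges[:-1], freqs, width=np.diff(edges), align="edge",
        edgecolor="black", alpha=0.5)

# Frequency polygon: midpoints of each interval, plus one extra empty
# interval below and above so the polygon closes on the x axis.
mids = (edges[:-1] + edges[1:]) / 2
mids = np.concatenate(([mids[0] - 5], mids, [mids[-1] + 5]))
poly = np.concatenate(([0], freqs, [0]))
plt.plot(mids, poly, marker="o", color="darkred")
plt.xlabel("Score")
plt.ylabel("Frequency")
plt.title("Histogram with overlaid frequency polygon")
plt.show()
```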
DIFFERENCE BETWEEN HISTOGRAM AND FREQUENCY POLYGON
A histogram is a bar graph while a frequency polygon is a line graph. The frequency polygon is more useful and practical: in a frequency polygon it is easy to see the trends of the distribution, which we are unable to do in a histogram. A histogram, however, gives a very clear and accurate picture of the relative proportion of frequency from interval to interval.
METHOD FOR CONSTRUCTING A CUMULATIVE FREQUENCY GRAPH
1. First of all, we calculate the actual upper and lower limits of the class intervals; i.e., if the class interval is 20-24, then the upper limit is 24.5 and the lower limit is 19.5.
2. We must now select a suitable scale as per the range of the class intervals, and plot the actual upper limits on the x axis and the respective cumulative frequencies on the y axis.
3. All the plotted points are then joined by successive straight lines, resulting in a line graph.
4. To fix the origin of the x axis, an extra class interval with cumulative frequency zero is taken.
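A minimal Python sketch of these four steps with hypothetical grouped data; the extra zero-frequency interval starts the curve at the actual lower limit of the first class:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical grouped data: written intervals 20-24, 25-29, ... with frequencies.
upper_limits = np.array([24.5, 29.5, 34.5, 39.5, 44.5])  # actual upper limits
freqs = np.array([3, 4, 5, 2, 1])

# Extra interval below the data with cumulative frequency zero (step 4),
# so the curve starts from the actual lower limit of the first interval.
x = np.concatenate(([19.5], upper_limits))
cum_freq = np.concatenate(([0], np.cumsum(freqs)))

plt.plot(x, cum_freq, marker="o")
plt.xlabel("Actual upper class limit")
plt.ylabel("Cumulative frequency")
plt.title("Cumulative frequency graph (ogive)")
plt.show()
```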