Achievement test
A standardized test used to measure acquired learning in a specific subject area, such as reading or arithmetic, in contrast to an intelligence test, which measures potential ability or learning capacity.
An achievement test (noun) is an examination administered to determine how much a person has learned or how much knowledge a person has acquired.
An example of an achievement test is the Regents exam that students must take to prove they have learned math and other academic lessons.
Standardized test
• Diagnostic tests for preschoolers: after a screening, tests for diagnostic assessment are administered (if needed).
• Adaptive behavior measures assess possible learning, social or motor disabilities.
• Intelligence tests measure learning potential.
• Achievement tests determine instructional effectiveness.
National tests:
• compare student achievement across states to
address higher standards for education
• identify poor instructional areas, pinpoint
weaknesses in a state’s instructional program and facilitate improvements
State-developed tests are used by school districts to:
• determine each student’s progress
• provide diagnostic information on a child’s needs for future instruction
• describe student progress between and within schools
Standardized Tests and Teaching
• The Nature of Standardized Tests
• Steps in Standardized Test Design
The following steps ensure that the test achieves
its goals
and purposes:
· specify the purpose
· determine the format
· formulate objectives
· test construction: write, try out, and analyze
items
· assemble the final form
· administer the final test form
· establish norms and determine validity and reliability
· develop a test manual
Steps in Standardized Test Design: Specifying the
Purpose
A
clearly defined purpose is the framework for the construction of the test.
• It allows evaluation of the instrument when
design and construction steps are completed.
• It helps explain what the test will measure,
how the test results will be used, and who will take the test.
• It describes the population for whom the test
is intended.
• Steps in Standardized Test Design: Determining
Test Format
Format decisions are based on the purpose of the
test and the characteristics of the test takers:
• how test items will be presented and how the
test taker will respond
(e.g., tests designed for very young children are usually presented orally; paper-and-pencil tests are used for older students)
• given as a group test or as an individual test
• Steps in Standardized Test Design: Test
Construction
The test’s purpose guides:
• defining test objectives
• writing test items for each objective
• assembling experimental test forms
• Steps in Standardized Test Design: Developing
Experimental Forms
For a school achievement test:
• test content is delimited
• curriculum is analyzed to ensure that the test
will reflect the instructional programs
• teachers and curriculum experts review content
outlines and objectives for the test; and later they review test items
• writing, editing, trying out, and rewriting or
revising test items
• a preliminary test with selected test items is
assembled for trial with a sample of students
• Steps in Standardized Test Design: Developing
Experimental Forms
• Experimental test forms resemble the final
form
• Instructions are written for test
administration
• The sample of people who take the preliminary test is similar to the population that will take the final form of the test
• Steps in Standardized Test Design: Item Analysis in the Test Tryout Phase
Study each item’s:
• Difficulty level: how many test takers in the
trial group answered the question correctly
• Discrimination: the extent to which the
question distinguishes between test takers who did well or poorly; test takers
who did well on the test should be more successful on the item than those who
did poorly
• Grade progression in difficulty: for tests that are taken in different grades, in each successively higher grade a greater percentage of students should answer the item correctly
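These item statistics are easy to compute from a matrix of scored responses. Below is a minimal Python sketch with hypothetical trial data; the discrimination index used is the simple difference in proportion correct between the upper and lower halves of the score distribution, one common choice among several:

```python
import numpy as np

# Rows = test takers, columns = items; 1 = correct, 0 = incorrect (hypothetical trial data).
responses = np.array([
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
])

# Difficulty: proportion of the trial group answering each item correctly.
difficulty = responses.mean(axis=0)

# Discrimination: rank test takers by total score, then compare each item's
# proportion correct in the top half versus the bottom half.
totals = responses.sum(axis=1)
order = np.argsort(totals)
half = len(order) // 2
lower, upper = responses[order[:half]], responses[order[-half:]]
discrimination = upper.mean(axis=0) - lower.mean(axis=0)

print("difficulty:", difficulty)          # higher values = easier items
print("discrimination:", discrimination)  # positive = stronger students do better on the item
```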
• Steps in Standardized Test Design: The Final
Test Form Is Assembled
• During item analysis, test items are revised or eliminated
• Test items that measure each test objective
are selected for the test
• Alternative forms of one test must be equivalent in content and difficulty
Test directions are finalized, with instructions for test administrators about the testing environment and testing procedures, and instructions for test takers
• Steps in Standardized Test Design:
Standardizing the Test
The final test form is administered to another,
larger sample of test takers to acquire norm data.
Norms allow for comparisons of children’s
test performance with the performance of a reference or norming group.
• The norming group is chosen to reflect the
makeup of the population for whom the test is designed.
Evaluating Standardized Tests
Reliability – Are test scores stable, dependable
and relatively free from error?
Validity – Does the test measure what it is
supposed to measure?
Correlation
A correlation coefficient is a statistical measure of the relationship between two variables.
• Pearson correlation coefficient
• r = the Pearson coefficient
• r measures the amount that the two variables (X and Y) vary together (i.e., covary), taking into account how much they vary apart
• Pearson’s r is the most common correlation coefficient; there are others.
Computing the Pearson correlation coefficient
• Measuring X and Y together: the Sum of Products of deviations (SP)
– Definitional formula: SP = Σ(X − X̄)(Y − Ȳ)
– Computational formula: SP = ΣXY − (ΣX)(ΣY)/n
– n is the number of (X, Y) pairs
• Measuring X and Y individually (the denominator): compute the sums of squares for each variable
– SS_X = Σ(X − X̄)² = ΣX² − (ΣX)²/n, and similarly for SS_Y
• The equation for Pearson’s r:
r = SP / √(SS_X × SS_Y)
• Expanded form:
r = [ΣXY − (ΣX)(ΣY)/n] / √( [ΣX² − (ΣX)²/n] [ΣY² − (ΣY)²/n] )
Example
• What is the correlation between study time and test score?
• [The data table and the step-by-step calculations of SS, SP, and r for this example are not reproduced here.]
• Correlation coefficient interpretation: r ranges from −1 to +1; the sign gives the direction of the relationship and the magnitude its strength, with values near ±1 indicating a strong linear relationship and values near 0 a weak one.
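To make the SS/SP formulas concrete, here is a minimal Python sketch with hypothetical study-time and test-score data (the worked example’s original numbers are not reproduced in these notes); numpy’s built-in corrcoef serves as a cross-check:

```python
import numpy as np

# Hypothetical data: hours of study time (X) and test scores (Y).
X = np.array([1, 2, 4, 5], dtype=float)
Y = np.array([50, 60, 70, 80], dtype=float)
n = len(X)

# Sums of squares (computational formula): SS = sum(X^2) - (sum(X))^2 / n
SS_X = (X**2).sum() - X.sum()**2 / n
SS_Y = (Y**2).sum() - Y.sum()**2 / n

# Sum of products (computational formula): SP = sum(XY) - sum(X)*sum(Y) / n
SP = (X * Y).sum() - X.sum() * Y.sum() / n

r = SP / np.sqrt(SS_X * SS_Y)
print(round(r, 3))                         # Pearson's r from SS and SP
print(round(np.corrcoef(X, Y)[0, 1], 3))   # same value from numpy, as a check
```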
Reliability
• Test-retest: The extent to which a test yields
the same score when given to a student on two different occasions
• Alternate-forms: Two different forms of the same test are given on two different occasions to determine the consistency of the scores
• Split-half: Divide the test items into two
halves; scores are compared to determine test score consistency
• Standard Error of Measurement: an estimate of the amount of variation to be expected in test scores.
• If the reliability correlations are poor, the
standard error of measurement will be large.
• The larger the standard error of measurement,
the less reliable the test.
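These two bullets can be made precise with the standard psychometric formula, where s is the standard deviation of the test scores and r_xx the reliability coefficient:

\[ \mathrm{SEM} = s\sqrt{1 - r_{xx}} \]

For example, a test with s = 10 and reliability r_xx = 0.91 has SEM = 10 × √0.09 = 3 score points; as reliability falls, the standard error of measurement grows, matching the bullets above.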
• Variables That Affect the Standard Error of
Measurement
The following affect test reliability:
• Population sample size --the larger the
population sample, the more reliable the test
• Test length --longer tests are usually more
reliable because there are more test items, resulting in a better sample of
behaviors
• Range of test scores of the norming group --the wider the spread of scores, the more reliably the test can distinguish between good and poor students
• Types of Validity…
• Content: Test’s ability to sample the
content that is being measured
• Criterion-related:
1. Concurrent: The relation between a test’s
score and other available criteria
2. Predictive: The relationship between test’s
score and future performance
• Construct: The extent to which there is
evidence that a test measures a particular construct
• Considerations in Choosing and Evaluating
Tests
To select the best test to meet the developmental
characteristics of young children, the following need to be considered:
• the purpose of the testing
• the characteristics to be measured
• how the test results will be used
• the qualifications of those who will interpret
the scores and use the results
• any practical constraints: cost, time, ease of
scoring and use of test results
Reviewing the Test Manual
The test manual should include information that is adequate for users to
determine whether the test is practical and suitable for their purposes.
The manual should address the following:
• Purpose of the test
• Test design
• Establishment of validity and reliability
• Test administration and scoring
DIAGNOSTIC TEACHING
Diagnostic teaching is the “process of
diagnosing student abilities, needs and objectives and prescribing requisite
learning activities”. Through diagnostic teaching, the teacher
monitors the understanding and performance of students before teaching the lesson,
while teaching, and after teaching the lesson. Diagnostic teaching can inform
teachers of the effectiveness of their lessons with individuals, small groups
of students, or whole classes, depending on the instruments used. Within a
diagnostic teaching perspective, assessment and instruction are interacting and
continuous processes, with assessment providing feedback to the teacher on the
efficacy of prior instruction, and new instruction building on the learning
that students demonstrate.
Teachers may evaluate
student learning on the spot, or collect data at different points in time and
compare progress over units of instruction. Moment-by-moment assessments allow
teachers to tap into students’ developing understandings about reading and
students’ use of strategic processing to understand and remember text, and
enable teachers to correct misconceptions immediately. Observations recorded
over time allow teachers to identify patterns of development and document
learning gains. Both “on-the-run” assessments and systematic records of
teachers’ observations of students’ learning over time can supplement the more
quantitative and summative assessments that the ministry or school mandates and
are more likely than end-of-term assessments to develop teachers’ capacity to
improve the quality and appropriateness of instruction.
Diagnostic assessments are
themselves educative for teachers. By introducing the concept of
diagnostic teaching and the monitoring techniques to support such instruction,
teachers will be better able to recognize reading as a developmental process
and target instruction to meet the needs of individuals and groups. As
students progress toward reading proficiency, they gain control over different
components of the reading process. Yet, not all students will be at the
same level of proficiency or need the same instruction. Students progress
through overlapping stages in a developmental sequence that leads to
proficiency in reading. Starting with “visual-cue” word recognition, wherein students memorize the configuration of words, and moving to an increasing awareness of phonology and the way sounds map onto letters in an alphabetic language, students gradually consolidate their use of larger letter patterns to recognize words effortlessly and automatically. At the “automatic word recognition”
stage, students are able to orally read text fluently, with speed and
prosody. As students become fluent readers, they are likely to devote
more attention to comprehension, routinely using background knowledge and strategic
processing to understand and remember text. By the time students reach
“reading proficiency,” their reading comprehension equals or surpasses what
they can glean from listening to lecture presentations. Nonetheless, not all students progress through these
stages at the same rate, and in any given classroom, there will be students who
need different kinds of support from their teachers. For example, some students
may be able to decode words but only slowly and with great effort. Others
may be fluent word callers but lack vocabulary and the ability to read
strategically for comprehension. Thus, students’ vocabulary, background
knowledge, fluency, interest and motivation, as well as the ability to
accurately identify words, all influence their reading comprehension.
Through professional
development, teachers will be able to recognize the importance of the various
components of the reading process and identify and use assessment and
instruction to support the development of these components. Teachers who
participate in the diagnostic teaching workshops will have a more elaborated
view of the reading process, beyond students’ ability to decode words and
memorize text. In particular, teachers will recognize the importance of automatic
word recognition, that is, the ability to read high frequency words and
phonetically regular words accurately and fluently, and the importance of
strategic reading, that is, the ability to make inferences and monitor and
repair understanding when reading different genres of text.
The diagnostic teaching
techniques presented in the workshops cover the entire continua of reading
development. Teachers will be able to demonstrate how to assess and support
students’ emerging reading behaviors, such as concepts about print and basic
decoding, among the youngest or least experienced readers. By using
a fluency rating rubric, such as that developed for the National Assessment of
Educational Progress, and determining students’ words correct per minute on an
oral reading of a short passage, teachers also will be able to identify more
experienced students’ fluency and put into practice activities to support
automatic word recognition, such as dramatic readings or rereadings of
text. Teachers will demonstrate how to assess and instruct students in
comprehension strategies before, during and after reading. Before
reading, teachers will assess whether students can preview, set a purpose for
reading, and bring prior knowledge to bear on the topic of the reading.
During reading, teachers will assess students’ ability to develop predictions
and questions, and monitor text understanding. After reading, teachers
will assess whether students can remember or summarize what they read.
Teachers will learn as well to analyze and interpret non-traditional ways of
processing and responding to text, such as verbal protocols and drawings that
make visible the thinking of students for teachers to evaluate.
As teachers learn to
administer and interpret these diagnostic assessments, they themselves will
develop a more elaborated understanding of the reading process. In order
to assess comprehension of narrative text and the adequacy of students’
retellings, for example, teachers must themselves learn to identify the story
elements of character, plot, setting, problem and resolution. Likewise,
in order to evaluate whether students are able to navigate nonfiction texts,
teachers first must be able to recognize the organizational patterns in
particular informational texts, for example, cause and effect, or comparison
and contrast, and then identify and teach the comprehension strategies that
students are not using.
Achievement tests are
exams that are designed to determine the degree of knowledge and
proficiency exhibited by an individual in a specific area or set of areas.
An achievement test is sometimes administered as part of the
acceptance process into an educational program.
In other applications,
the achievement test serves as a tool to measure current knowledge
levels for the purpose of placing students in an educational environment where
they have the chance to advance at a pace that is suitable for their abilities.
Achievement testing is used to identify students who are prepared to move on to more advanced courses of study or who need some type of remedial instruction. Using an achievement test format to measure the grade level of each student is not intended to reflect on the general intelligence of the individual. Rather, the purpose of the testing is to ensure each student is placed in a classroom situation that offers the best opportunity to learn and assimilate material in an organized fashion, preparing them to move on to more advanced material.
For example, a student who does not do well with basic mathematics on an achievement test is likely to be
placed in a remedial learning situation. Doing so provides the student with the
opportunity to master the basics before attempting to learn more advanced
mathematical concepts like algebra or geometry. At a later date, the student
will have the chance to take a second test; should the results indicate that
the student is sufficiently prepared to move on to something more complicated,
he or she can be reassigned to a more challenging course of study.
Individuals and groups that oppose the use of achievement tests claim that the exams are not structured in a manner that accounts for the general aptitude of each student, resulting in an overall learning environment that pigeonholes rather than nurtures each student and promotes productive learning.
It is not unusual for many school jurisdictions to make use of a
sample achievement test several weeks before administering the live
exam. The idea behind the achievement test practice run is to allow
students to get an idea of the general format of the exam and what type of
instructions apply to each section. While the specific questions used in the sample are different from those utilized in the live achievement test, they are usually close enough to provide the student with an idea of what to expect.
QUALITIES OF A GOOD TEST
A
good test should possess the following qualities.
• Objectivity
• Objective Basedness
• Comprehensiveness
• Validity
• Reliability
• Practicability
• Comparability
• Utility
Objectivity
• A test is said to be objective if it is free from personal biases in interpreting its scope as well as in scoring the responses.
• Objectivity of a test can be increased by using more objective-type test items and scoring the answers according to the model answers provided.
Objective Basedness
• The test should be based on pre-determined objectives.
• The test setter should have a definite idea about the objective behind each item.
Comprehensiveness
• The test should cover the whole syllabus.
• Due importance should be given to all the relevant learning material.
• The test should cover all the anticipated objectives.
Validity
• A test is said to be valid if it measures what it intends to measure.
• There are different types of validity:
– Operational validity
– Predictive validity
– Content validity
– Construct validity
• Operational Validity
– A test will have operational validity if the tasks required by the test are sufficient to evaluate the definite activities or qualities.
• Predictive Validity
– A test has predictive validity if scores on it predict future performance
• Content Validity
– If the items in the test constitute a representative sample of the total course content to be tested, the test can be said to have content validity.
• Construct Validity
– Construct validity involves explaining the test scores psychologically. A test is interpreted in terms of numerous research findings.
Reliability
• Reliability of a test refers to the degree of consistency with which it measures what it is intended to measure.
• A test may be reliable but need not be valid. This is because it may yield consistent scores, but these scores need not be representing what exactly we want to measure.
• A test with high validity has to be reliable also. (the scores will be consistent in both cases)
• Valid test is also a reliable test, but a reliable test may not be a valid one
Different methods for determining reliability
• Test-retest method
– A test is administered to the same group after a short interval. The scores are tabulated and the correlation is calculated. The higher the correlation, the more reliable the test.
• Split-half method
– The scores of the odd and even items are taken and the correlation between the two sets of scores determined.
• Parallel form method
– Reliability is determined using two equivalent forms of the same test content.
– The prepared tests are administered to the same group one after the other.
– The test forms should be identical with respect to the number of items, content, difficulty level, etc.
– The correlation between the two sets of scores obtained by the group in the two tests is determined.
– The higher the correlation, the more reliable the test.
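As an illustration of the split-half method described above, here is a minimal Python sketch with hypothetical item scores. The correlation between the two half-tests is conventionally stepped up to full test length with the Spearman-Brown formula, a standard correction that these notes do not name explicitly:

```python
import numpy as np

# Rows = students, columns = items; 1 = correct, 0 = incorrect (hypothetical data).
items = np.array([
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 1],
])

# Split-half: score the odd-numbered and even-numbered items separately.
odd_scores = items[:, 0::2].sum(axis=1)
even_scores = items[:, 1::2].sum(axis=1)

# Correlation between the two half-test scores.
r_half = np.corrcoef(odd_scores, even_scores)[0, 1]

# Spearman-Brown correction: estimated reliability of the full-length test.
r_full = 2 * r_half / (1 + r_half)
print(round(r_half, 3), round(r_full, 3))
```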
Discriminating Power
• Discriminating power of the test is its power to discriminate between the upper and lower groups who took the test.
• The test should contain questions of different difficulty levels.
Practicability
• Practicability of the test depends upon:
• Administrative ease
• Scoring ease
• Interpretative ease
• Economy
Comparability
• A test possesses comparability when scores resulting from its use can be interpreted in terms of a common base that has a natural or accepted meaning.
• There are two methods for establishing comparability:
– availability of equivalent (parallel) forms of the test
– availability of adequate norms
Utility
• A test has utility if it provides the test conditions that facilitate realization of the purpose for which it is meant.
Evaluating science
teaching and learning
“True genius resides in the capacity for evaluation of uncertain, hazardous, and conflicting information.” ~ Winston Churchill (1874-1965)
WHAT IS EVALUATION?
Evaluation in education is a reflective
judgement on teaching practices, students’ learning and the learning
environment. It is an on-going process that enables the teacher to reflect on
how well students are learning.
Evaluation can be viewed as a critical analysis
towards the improvement of teaching practices. There are numerous aspects of
teaching and learning that can be evaluated. Keep in mind that effective
teachers constantly seek ways to enhance their teaching. Critically analysing
teaching and learning provides a way forward. In teaching, evaluation can be
synonymous with reflection. Evaluating teaching and learning can assist teachers to identify ways to raise teaching standards and, hopefully, learning.
WHAT DO I EVALUATE?
You can evaluate anything within the teaching
and learning environment. Your planning, teaching, activities and assessments
need to be evaluated. You can evaluate your teaching approaches and strategies.
Students’ learning must be evaluated. You may want to evaluate the resources,
learning spaces, guest speakers, worksheets, and anything else that you think
may make a difference to either your teaching or the students’ learning. Basic
evaluation will include what worked, what didn’t work, and what needs to be
considered for improving future teaching practices.
Evaluating Teaching Plans and Practices
1. Plans
• Was sound lesson planning evident?
• Was the lesson outcome(s) stated?
• Were links to the syllabus outlined?
• Was the lesson appropriately timetabled and
timed?
• Was the prior knowledge of the students
considered?
• Were teaching strategies outlined?
• Was content knowledge evident?
• Were resources prepared?
• Were classroom management strategies apparent?
• Were student activities suitable and
appropriate?
• Were methods of assessing students’ learning
outlined?
2. Practices
• Was I confident in teaching this lesson?
• Was I enthusiastic about teaching science?
• Did I stimulate students’ interests in the
topic?
• Had I presented a well-designed lesson?
• Were my explanations clear and succinct?
• Did I ask a range of lower and higher-order
questions?
• Had I catered for the range of student
abilities?
• Did I hold the students’ attention when
teaching?
• Had I developed a good rapport with students?
• Did I use effective classroom management
strategies?
• Were consequences and rewards appropriate?
• Did I monitor students’ work?
• Had I displayed adequate science content
knowledge?
• Did I use appropriate terminology from the
science syllabus?
• Had I used sufficient hands-on materials?
• Did I allow students to record and communicate
their new knowledge?
• Had I concluded the lesson with key scientific
concepts?
• Was there a real-world connection to the
lesson?
Evaluation needs to occur frequently. Every
lesson should be evaluated in terms of what worked, what didn’t work, and
what could be improved for future practices. Experienced teachers will evaluate
as they proceed through a lesson and make adjustments accordingly. A
teaching plan is a guide only. It’s a vision of what could occur. When a
lesson is in action, with the varying personalities and abilities of students,
a plan may need to be changed. You will need to adapt to particular
circumstances. This adaptation can also be part of your evaluation.
This will assist you to understand a learning
environment. It will help you to devise other strategies for teaching in a
similar situation. The unexpected behaviour for a beginning teacher can very
well be expected by a more experienced teacher. The experienced teacher knows what to expect because of continual evaluation of circumstances: critical self-reflection.
WHO CAN EVALUATE?
The two key players who need to evaluate the
teaching and learning are the students and the teacher. Other potential
evaluators can include the principal, executive staff, parents and other
interested groups. Evaluation should be viewed as a positive action. The
outcome of evaluation should be improved teaching and learning. So the
immediate stakeholders (you and your students) must have a say in how students
learn.
Teacher
The classroom teacher should always evaluate
lessons and units. Teachers can focus their attention on themselves in
evaluations. A teacher needs to be honest in an evaluation of teaching and
learning. Pretending a problem doesn’t exist won’t make it go away.
Despite careful planning, not every lesson will be successful as there are many
things that could alter the direction of the lesson. Maybe there weren’t enough resources, or student interaction wasn’t as anticipated. Regardless,
a teacher learns from experiences and at the heart of this learning is
critical and open reflection. Highlighting what else may be required to
make the unit of work more successful and pinpointing areas that were
successful will assist you to teach future classes more effectively. It
is up to the teacher to decide what to include in an evaluation.
The example below provides evaluation criteria for a unit based around energy.
The teacher has decided to evaluate the
students’ learning of the key concepts and their engagement in the unit.
Although this evaluation is quite broad, responses to each criterion may
provide insight towards enhancing the unit for future classes.
Student
It is useful to have students evaluate the
teaching and learning environment, particularly as they are the focus of
the attention. Evaluation from students needs to be age appropriate.
Those in the early primary grades may have the evaluation read out to
them, and they would record their responses by colouring in a face or a
sign. They may be asked to circle pictures that represent lessons
that they liked. They may be asked to draw a picture of a lesson they
enjoyed. Example 11.4 shows a range of evaluation forms that could be used at
an early primary school level.
Student written evaluation
1. Did you enjoy our unit of work about
Fossils? YES/NO. Why or Why not?
2. What was your favourite part of the unit?
3. What was your least favourite part of the
unit?
4. Did you find anything really difficult or
really easy in the unit? Please explain here.
5. If you could do the unit again, what else
would you be interested in learning?
6. What did you learn during the unit?
7. Reflect on the unit, how did you feel at the
start of the unit when you knew what you were going to be studying, and how do
you now feel at the end given what you have learnt?
Evaluative questions
for teachers and students
1. Were the overall standards of the unit
achieved?
2. Did students engage effectively with the unit
topic?
3. Did students develop an understanding of the
key concepts being taught?
4. Were students able to achieve the learning
outcomes in all lessons?
5. Was the duration of the unit: too long/too
short/just right?
6. Were the needs of students of varying
learning abilities accounted for?
7. Did you feel that you had researched the
topic sufficiently?
8. Did my content knowledge enable me to answer
student questions appropriately?
9. What would you change in this unit for
teaching in the future?
10. How well did the students achieve the
outcomes outlined in the unit?
11. Did the scaffolding of the lessons achieve
maximum learning from the students?
12. Did the lessons flow from one to the next,
at a pace that students could follow?
13. Did the activities in each lesson assist
students’ awareness of the topic?
14. Were the activities and experiments in each
lesson appropriate for this year level?
15. Were the resources appropriate for the unit
of work?
16. How well was the integration of other KLAs
used within the unit?
17. Were the students receptive to combining the
content from this unit, with other KLA’s such as mathematics or drama?
18. How effective were the teaching strategies?
19. What teaching strategies worked and why?
20. How did the students respond to the learning
approaches?
21. Which learning approaches worked best? How
and Why?
22. Which learning approaches worked least?
Why? How could they be developed in the future?
23. Were the discussions and conversations substantive and meaningful, and did they promote conceptual knowledge?
24. What was the level of student engagement in
the lessons?
25. In what areas was the lesson problematic and
why?
26. How has this lesson informed my approach to
teaching science?
27. What will I do differently in the future?
28. How did I feel when students were engaged
and interested in the science topic?
29. Were the questioning techniques used in the lessons effective, and did they develop higher-order thinking?
30. Did the activities provide opportunities for
the students to demonstrate core learning outcomes?
31. How was the overall flow of the unit?
32. Was the interactive approach to teaching effective, and did it promote more student engagement?
33. Were the assessment practices fair and varied, and did they link directly to the syllabus outcomes?
34. Were there some difficulties through the
unit development? How were these problems and issues addressed?
35. Did the teaching strategies (e.g. group work, portfolios, Bybee’s 5Es) help students achieve at the highest level?
36. What teaching strategies were effective,
what is the evidence?
37. What did I learn by teaching this unit?
38. What did I learn about myself as a teacher
by teaching this unit?
39. How has teaching this unit changed my
teaching pedagogy?
40. What teaching approaches and strategies were
effective, what is the evidence?
41. Did the outcome statements reflect the
targeted learning outcomes?
42. Were the indicators suitable and did they
reflect the targeted learning outcomes and the key concepts?
43. What teaching approaches and strategies were
not effective? Why? How can I change this next time?
44. How did I manage student behaviour?
45. In the future, how could I approach teaching
this science topic differently?
Teacher self evaluation
Rate each statement: Strongly agree | Agree | Undecided | Disagree | Strongly disagree
PLANNING OF UNIT
The topic of the unit was relevant to the
students
Lessons were sequenced to allow students to
learn most effectively.
Each lesson provided at least one key concept
for students to learn.
The experiments and activities were relevant and
engaging for students.
The worksheets were well structured and
encouraged higher-order thinking.
Activities allowed for all students to
participate and catered for all abilities.
Each lesson had an assessment component.
TEACHING OF UNIT
I included all students at all times
I ensured all students were engaged with the
topic.
I asked questions that encouraged higher-order
thinking.
I facilitated meaningful discussions.
I encouraged hands-on learning wherever
possible.
I appeared enthusiastic at all times.
I provided the students with the content
knowledge to complete activities.
I assessed all students fairly with a range of different activities throughout the unit.
27.2 ATTRIBUTES OF A GOOD TEST
The three main attributes of a good test are
validity, reliability and usability.
Validity
Validity: It is the most important characteristic of a good test. The validity of a test is the extent to which it measures what it attempts to measure. That is to say, a test should conform to the objectives of testing. For example, in an English language test where the purpose of testing is to measure the students' ability to manipulate language structures, the following test item will not be valid:
Name the part of speech of each underlined word and also state its kind: His old typewriter should be in a museum.
This item is invalid because it is testing the students' knowledge about the language and not his/her ability to manipulate structures.
Though there are many types of validity, for a classroom teacher it is enough to know about the following three types:
1. Face Validity
Face validity means that a test, even on a simple inspection, should look valid. For example, in a language test the following item is totally invalid, as it does not test language but the computation skill of the students.
The train starts from New
Delhi at 8:10 hours and reaches Kanpur at
14:30 hours.
How much time does the train take to reach
Kanpur from New Delhi?
To establish face validity of a test, the
examiner or the teacher should go through the test and examine its content
carefully.
2. Content Validity
Content validity is very important in an achievement test, as an achievement test tries to measure some specific skills or abilities through some specific content. To obtain content validity it is necessary that all the important areas of the course content are represented in the test and also that the test covers all the instructional objectives. In other words, the test should contain questions on every important area of the content in appropriate proportion, and the questions should be framed in such a way that all the objectives of that course are tested properly. Content validity can also be ensured by analyzing the course content and the instructional objectives to be achieved through it, and by relating the test items to both.
3. Empirical Validity
Empirical validity is also known as statistical validity or criterion-related validity. To ensure it, a criterion is taken (which may be a standardized test, another teacher's ratings on a class test, students' scores on a previous test, or the students' grades on a subsequent final examination, etc.) and the scores of students are correlated with their scores on the criterion test. If the scores correlate positively, the test may be said to have empirical validity. Empirical validity is important because it shows statistically that a test is valid, i.e., it measures well what it intends to measure.
Reliability
It refers to the consistency with which a question paper measures the achievement of students. In other words, if the test is to be reliable, chance errors must be zero.
Unreliability occurs at two stages:
1. Firstly, at the level of the examinee, when s/he is not able to understand and interpret the question properly. This may be due to vagueness in the language of the question or some other reason. It can be removed if the questions are pointed and free from ambiguity.
2. Secondly, at the level of the examiners. In the absence of a standard marking scheme, examiners are free to interpret and mark the questions in their own way. This contributes greatly to unreliability. A detailed marking scheme improves the reliability of the question paper. Objective type and very short answer type questions are more reliable than essay-type questions.
Thus, by including these questions and also by increasing the total number of questions in a question paper, reliability can be increased.
Usability
Usability or practicability is the third characteristic of a good test. There are a number of practical factors that are to be considered while preparing or selecting a test for use.
The first thing to be kept in mind is that the test should be of such a length that it can be administered within the stipulated time. If the test is too long or too short, it may not be practical to use as a classroom test.
Secondly, it is to be seen that the test is easy to administer and that clear-cut directions are provided in the test, so that the testees as well as the test administrators can perform their tasks with efficiency.
Moreover, the facilities available for administration should also be kept in view; in the case of oral tests, for example, tape recorders may be required. If a teacher doesn't have the facility of a tape recorder, s/he should not take up a test requiring the use of one.
Thirdly, scorability is also to be considered while using a test. When a large number of students is involved, a test which can be scored quickly (preferably by machine) is to be selected; but when only one class is to be tested, perhaps a test consisting of subjective questions may also be used.
27.3 STEPS OF TEST CONSTRUCTION
Once the teacher or the test constructor is aware of the characteristics that a good test must possess, s/he can proceed to construct a test, which may be either a unit test or a full-fledged question paper covering all the aspects of the syllabus. Whether the test is a unit test for use in classroom testing or a question paper for use in final examinations, the steps of test construction are the same, which are as follows:
1. Prepare a Design
The first step in preparing a test is to construct a design. A test is not merely a collection of assorted questions. To be of any effective use, it has to be planned in advance, keeping in view the objectives and the content of the course and the forms of questions to be used for testing these. For this, weightages to different objectives, different areas of content, and different forms of questions are to be decided, along with the scheme of options and sections; these dimensions are what is known as the design of a test.
a. Weightage to Objectives
To make a test valid, it is necessary to analyze the objectives of the course and decide which objectives are to be tested and in what proportions. For this, marks are allotted to each objective to be tested according to its importance. In English language testing the three major objectives are knowledge of the elements of language, comprehension and expression. The weightages to all these three objectives may be decided in percentages. For example, for a test of 50 marks the following weightages may be decided.
b. Weightage to Different Areas of Content
It is necessary to analyze the syllabus and allot weightages to the different areas of the total content. This is again done to ensure the validity of the test. A hypothetical example is given below for an English language test showing weightages to content units for a class XI test.
c. Weightage to Different Forms of Questions
After analyzing the objectives and the content, it is to be seen how they are to be tested. A particular objective and content can be tested more appropriately by a particular form of questions. So, different forms of questions are to be included in the test for testing different objectives and contents. For this, the number of different types of questions to be included in the test and the marks carried by each of them are decided. This takes care of the reliability of the test. As an illustration, hypothetical weightage to different forms of questions in our 50-marks question paper for class XI is given below:
d. Scheme of Sections
The design of a question paper may also indicate the scheme of sections for the paper. For example, a question paper may consist of both multiple choice questions and supply type questions. Such a test may have two sections, one consisting of multiple choice questions and the other consisting of supply type questions like essay type, short answer and very short answer type questions. In case the examiner wants, the question paper can also be divided into sections area-wise, like one section for reading comprehension, another for writing tasks, a third for grammar and so on. If the multiple choice questions are not substantial in number, there is no need to keep a separate section.
e. Scheme of Options
The design may indicate the pattern of options, i.e., the complete elimination of overall options or the retention of internal options within limits. No options are to be provided in the case of multiple choice, short answer and very short answer questions; for essay type questions the teacher may like to provide internal options. While providing options, it may be kept in mind that the options are comparable in terms of the objectives to be tested, the form of questions and the difficulty level of the questions. As far as possible, the major area of content should also be the same in the options.
While planning the paper, it should be ensured that the difficulty level of the questions varies, so as to cater to all the students of the class and also to discriminate between high achievers and low achievers. The suggested percentage for easy and difficult questions is 20% each, whereas average questions can be 60%. The difficulty level of the test paper can be varied according to the level of the students. If the class has a large number of good students, then 25% to 30% difficult questions can be given.
2. Preparing a Blue Print
After deciding on the design of the test, the blue print is prepared. The blue print is a three-dimensional chart which shows the placement of each question in respect of the objective and the content area that it tests. It also indicates the marks carried by each question. It is useful to prepare a blue print so that the test maker knows which question will test which objective and which content unit, and how many marks it would carry. Without a blue print, only the weightages are decided for objectives, content areas and types of questions. The blue print concretizes the design in operational terms, and all the dimensions of a question (i.e., its objective, its form, the content area it would cover and the marks allotted to it) become clear to the test maker.
There is no set procedure for preparing a blue print. However, the following sequential steps would help in preparing a good blue print.
Transfer the decisions regarding weightages to objectives (Knowledge, Comprehension and Expression) onto the given proforma.
Transfer the weightages already decided for different content units. For this, list the content units under the content areas in the column given at the left hand side and the marks under the column of totals given at the right hand side.
Place the essay type questions first in the blue
print. Place them under the objectives which you want to test through these
questions. The marks of the questions may be shown in the column under
the objectives and the number of questions may be given in brackets.
If in a question, marks are to be split between
two objectives indicate it with asterisks and a dotted line as shown in the
example.
After placing the essay type questions, place
the short answer type questions under the objectives and beside the content
unit that you want to test through them.
Place the very short answer type questions in a
similar way.
Place the multiple choice questions in the same way - marks outside the bracket, number of questions inside the bracket.
Calculate the subtotals of all the questions
under all the objectives.
Calculate the totals. Your total
should tally with the weightages of objectives and content
units that you had already marked on the blue print.
Fill in the summary of types of questions,
Scheme of Sections and Scheme of Options.
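Because the blue print is essentially a two-way table of marks (content units by objectives), the tallying step above can be checked mechanically. The following minimal Python sketch uses hypothetical weightages for a 50-mark paper; all names and numbers are illustrative only:

```python
# Hypothetical blue print: marks per (content unit, objective) cell.
blueprint = {
    "Reading": {"Knowledge": 4, "Comprehension": 8, "Expression": 3},
    "Writing": {"Knowledge": 2, "Comprehension": 4, "Expression": 9},
    "Grammar": {"Knowledge": 9, "Comprehension": 8, "Expression": 3},
}
# Weightages already decided in the design (hypothetical, for a 50-mark paper).
objective_weightage = {"Knowledge": 15, "Comprehension": 20, "Expression": 15}
content_weightage = {"Reading": 15, "Writing": 15, "Grammar": 20}

# Column totals (per objective) and row totals (per content unit) must tally.
for obj, expected in objective_weightage.items():
    total = sum(row[obj] for row in blueprint.values())
    assert total == expected, f"{obj}: {total} != {expected}"
for unit, expected in content_weightage.items():
    total = sum(blueprint[unit].values())
    assert total == expected, f"{unit}: {total} != {expected}"
print("Blue print tallies with the design:", sum(content_weightage.values()), "marks")
```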
3. Preparing Questions Based on the Blue Print
Guidelines for Setting a Good Question Paper
After the blue print is ready, questions are to be prepared according to the dimensions defined in the blue print. For example, if there are essay type questions to be prepared to test the writing skills, say one letter and one report, and also a short answer question on writing a notice, the test constructor should prepare these three questions along with their options, which should be comparable in terms of the objectives to be tested, content areas, forms of questions and the difficulty level.
While preparing questions it must be kept in mind that each question:
- is based on the specific objective of teaching as indicated in the blue print
- relates to the specific content area as per the blue print
- is written in the form required by the blue print and satisfies all the rules for framing that form of question
- is at the desired level of difficulty
- is written in clear, correct and precise language which is well within the comprehension of pupils
- clearly indicates the scope and length of the answer.
Another thing to be kept in view
while writing questions is to prepare the answers simultaneously because
quite often the answers help in refining the questions.
4. Assembling the Question Paper
After the questions are prepared, they are to be assembled in question paper form. For this, instructions are to be written. General instructions for the paper may be given at the top, whereas instructions for specific questions may be given just before the questions.
The order of questions is also to be decided while assembling the question paper. Sometimes it is according to the forms of questions, i.e., objective type questions may be put first, then very short answer, short answer and essay type questions; or it may be according to the content, as in the case of a language question paper where we may have structure questions first, then questions on an unseen passage and then composition questions.
The assembling and editing of the question paper is important from the point of view of administration. For example, if the question paper is divided into two sections, one of which is to be collected within a specific time limit, clear instructions to do so should be mentioned, and the arrangement of questions should be such that both sections are easily demarcated.
5. Preparing the Scoring Key and the Marking Scheme
A scoring key is to be prepared for objective type questions and a marking scheme for other questions. The scoring key gives the letter of the correct answer and the marks carried by each question. The marking scheme gives the expected outline answer and the value points for each aspect of the answer.
Detailed instructions for marking are also worked out; e.g., in marking compositions it is specified how many marks are to be deducted for spelling mistakes or structural mistakes, or, if the composition is to be graded, how the grading is to be done and on what basis.
The detailed marking scheme is
necessary to ensure consistency and uniformity in scoring by different
examiners. In other words it ensures reliability of scoring.
6. Preparing Question-wise Analysis
After the question paper and marking scheme are finished, it is desirable to prepare a question-wise analysis. This analysis helps in tallying the questions in the test with the blue print. It also enables us to know the strengths and weaknesses of the test better; e.g., through the analysis we can know how many topics of the syllabus have been covered, what the difficulty level of each question is and what specifications are being tested by each question.
The analysis is done on the following points:
1. Number of the question.
2. Objective tested by the question.
3. Specification on which the question is based.
4. Topic covered.
5. Form of the question.
6. Marks allotted.
7. Approximate time required for answering.
8. Estimated difficulty level.
GRAPHICAL REPRESENTATION OF DATA
Statistical data may be presented in a more attractive form, appealing to the eye, with the help of graphic aids, i.e., pictures and graphs. Such presentation carries a lot of communication power. A mere glimpse of these pictures and graphs may enable the viewer to have an immediate and meaningful grasp of a large amount of data.
Ungrouped data may be represented
through a bar diagram, pie diagram, pictograph and line graph.
· Bar graph represents the data on the graph paper in the
form of vertical or horizontal bars.
Graphical representation of data helps
in faster and easier interpretation
of data.
A bar graph uses bars or rectangles of the same width but different heights to represent different values of data.
In a bar graph:
- The bars have equal gaps between
them.
- The width of the bars does not
matter.
- The height of the bars represents
the different values of the variable.
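A minimal matplotlib sketch of these rules, using hypothetical data: the bars have equal widths and equal gaps, and only their heights carry the values:

```python
import matplotlib.pyplot as plt

# Hypothetical data: number of students choosing each favourite subject.
subjects = ["Maths", "Science", "English", "Art"]
counts = [12, 9, 15, 6]

plt.bar(subjects, counts, width=0.6, color="steelblue")  # equal widths, equal gaps
plt.xlabel("Subject")
plt.ylabel("Number of students")  # bar height carries the value
plt.title("Favourite subjects (bar graph)")
plt.show()
```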
In a pie diagram, the data is represented by a circle of 360 degrees divided into parts, each representing an amount of data converted into an angle. The total frequency value is equated to 360 degrees and then the angles corresponding to the component parts are calculated.
Pie charts are useful to compare different parts of a whole amount. They are often used to present financial information. E.g. a company's expenditure can be shown to be the sum of its parts, including different expense categories such as salaries, borrowing interest, taxation and general running costs (i.e. rent, electricity, heating etc.).
A pie chart is a circular chart in which the circle is
divided into sectors.
Each sector visually represents an item in a data set to match the amount of
the item as a percentage or fraction of
the total data set.
Example: A family's weekly expenditure on its house mortgage, food and fuel is as follows. [The expenditure table is not reproduced here.]
Draw a pie chart to display the information.
Solution:
We can find what percentage of the total expenditure each item equals. To draw a pie chart, divide the circle into 100 percentage parts, then allocate the number of percentage parts required for each item.
Note:
- It is simple to
read a pie chart. Just look at the required sector representing an
item (or category) and read off the value. For example, the weekly
expenditure of the family on food is 37.5% of the total expenditure
measured.
- A pie chart is
used to compare the different parts that make up a whole amount.
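The following minimal matplotlib sketch draws such a pie chart. The family's actual expenditure table is not reproduced in these notes, so the amounts are hypothetical, chosen only so that food comes to the 37.5% quoted in the note above; plt.pie converts each amount into its share of the 360-degree circle:

```python
import matplotlib.pyplot as plt

# Hypothetical weekly expenditure, chosen so that food works out to 37.5%
# of the total (matching the figure quoted in the note above); the
# original table is not reproduced in these notes.
items = ["Mortgage", "Food", "Fuel"]
amounts = [300, 225, 75]  # dollars per week

# Each amount becomes a sector whose angle is its share of 360 degrees.
plt.pie(amounts, labels=items, autopct="%1.1f%%", startangle=90)
plt.title("Family weekly expenditure (pie chart)")
plt.axis("equal")  # keep the pie circular
plt.show()
```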
In pictograms, the data is represented by means of picture figures appropriately
designed in proportion to the numerical data.
· Line graphs represent the data concerning one variable on the horizontal axis and the other variable on the vertical axis of the graph paper.
A line graph is often used to represent a set of data values in which a quantity varies with time. These graphs are useful for finding trends, that is, a general pattern in data sets such as temperature, sales, employment, company profit or cost over a period of time.
Example: A cylinder of liquid was heated. Its temperature was recorded at ten-minute intervals as shown in the following table. [The data table and solution are not reproduced here.]
a. Draw a line graph to represent this information.
b. Estimate the temperature of the cylinder after 25 minutes of heating.
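A minimal sketch of this exercise with hypothetical readings, since the original table is not reproduced; part (b) is answered by linear interpolation between the 20- and 30-minute points:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical readings: temperature recorded at ten-minute intervals.
time_min = np.array([0, 10, 20, 30, 40])
temp_c = np.array([20, 35, 48, 58, 65])

plt.plot(time_min, temp_c, marker="o")
plt.xlabel("Time (minutes)")
plt.ylabel("Temperature (°C)")
plt.title("Temperature of heated liquid (line graph)")
plt.show()

# Part (b): estimate the temperature at 25 minutes by linear interpolation
# between the 20- and 30-minute readings.
print(np.interp(25, time_min, temp_c))  # 53.0 with these hypothetical values
```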
Grouped data may be represented
graphically by histogram, frequency polygon, cumulative frequency graph and
cumulative frequency percentage curve or ogive.
· A histogram is essentially a bar graph of a frequency distribution. The actual class limits plotted on the x-axis represent the widths of the various bars, and the respective frequencies of the class intervals represent the heights of the bars.
· A frequency polygon is a line graph for the graphical
representation of frequency distribution.
· A cumulative frequency graph represents the cumulative frequency
distribution by plotting actual upper limits of the class intervals on the x
axis and the respective cumulative frequencies of these class intervals on the
y axis.
· Cumulative
frequency percentage curve or ogive represents cumulative percentage frequency distribution by plotting upper
limits of the class intervals on the x axis and the respective cumulative
percentage frequencies of these class intervals on the y axis.
METHOD FOR CONSTRUCTING A HISTOGRAM
1. The scores in the form of actual class limits, such as 19.5-24.5, 24.5-29.5 and so on, are used in the construction of a histogram rather than the written class limits such as 20-24, 25-29.
2. It is customary to take two extra class intervals, one below and one above the grouped intervals.
3. Now we take the actual lower limits of all the class intervals and plot them on the x axis. The lower limit of the lowest class interval is taken at the intersection of the x axis and the y axis.
4. Frequencies of the distribution are plotted on the y axis.
5. Each class interval with its specific frequency is represented by a separate rectangle. The base of each rectangle is the width of the class interval, and the height represents the frequency of that class interval.
6. Care should be taken to select appropriate units of representation along the x and y axes. Both the x axis and the y axis must not be too short or too long.
In a histogram:
- The bars do not have gaps
between them.
- The width of the bars is proportional to the class intervals of data.
- The height of the bars represents the different
values of the variable.
- The area of each
rectangle is proportional to its corresponding frequency.
The area of a histogram is equal to the area enclosed by its corresponding frequency polygon.
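A minimal matplotlib sketch of this construction, using hypothetical scores and the actual class limits 19.5, 24.5, 29.5, ... as bin edges so that adjacent bars touch:

```python
import matplotlib.pyplot as plt

# Hypothetical test scores grouped into the intervals 20-24, 25-29, ...
scores = [21, 23, 24, 26, 27, 27, 28, 31, 32, 33, 35, 36, 38, 41, 43]

# Actual class limits serve as the bin edges, so the bars have no gaps.
edges = [19.5, 24.5, 29.5, 34.5, 39.5, 44.5]

plt.hist(scores, bins=edges, edgecolor="black")
plt.xlabel("Score (actual class limits)")
plt.ylabel("Frequency")  # bar height = frequency of the class interval
plt.title("Histogram of scores")
plt.show()
```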
METHOD FOR CONSTRUCTING A FREQUENCY POLYGON
1. As in a histogram, two extra class intervals are taken, one above and one below the given class intervals.
2. The mid-points of the class intervals are calculated.
3. The mid-points are plotted along the x axis and the corresponding frequencies are plotted along the y axis.
4. The various points given by the plotting are joined by lines to give the frequency polygon.
The midpoints of the intervals of the corresponding rectangles in a histogram are joined together by straight lines. This gives a polygon, i.e. a figure with many angles. It is used when two or more sets of data are to be illustrated on the same diagram, such as death rates in smokers and non-smokers, or birth and death rates of a population, etc.
One way to form a frequency polygon is to connect the midpoints at
the top of the bars of a histogram with line segments (or a smooth curve). Of
course the midpoints themselves could easily be plotted without the histogram
and be joined by line segments. Sometimes it is beneficial to show the
histogram and frequency polygon together.
Unlike histograms, frequency polygons can be superimposed so as to
compare several frequency distributions.
[Figure: a histogram with an overlaid frequency polygon, formed by joining the midpoints of the tops of the histogram's rectangles.]
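A minimal sketch of a frequency polygon overlaid on its histogram, with hypothetical grouped data; the polygon joins the class-interval midpoints and is anchored at an extra zero-frequency interval on each side, as described above:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical grouped data: actual class limits and frequencies.
edges = np.array([19.5, 24.5, 29.5, 34.5, 39.5, 44.5])
freqs = np.array([3, 4, 5, 2, 1])

# Histogram drawn from the grouped frequencies.
plt.bar(edges[:-1], freqs, width=np.diff(edges), align="edge",
        edgecolor="black", alpha=0.5)

# Frequency polygon: midpoints of each interval, plus one extra empty
# interval below and above so the polygon closes on the x axis.
mids = (edges[:-1] + edges[1:]) / 2
mids = np.concatenate(([mids[0] - 5], mids, [mids[-1] + 5]))
poly = np.concatenate(([0], freqs, [0]))
plt.plot(mids, poly, marker="o", color="darkred")
plt.xlabel("Score")
plt.ylabel("Frequency")
plt.title("Histogram with overlaid frequency polygon")
plt.show()
```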
DIFFERENCE BETWEEN HISTOGRAM AND FREQUENCY POLYGON
A histogram is a bar graph while a frequency polygon is a line graph. The frequency polygon is more useful and practical: in a frequency polygon it is easy to see the trends of the distribution, which we are unable to do in a histogram. A histogram, however, gives a very clear and accurate picture of the relative proportion of frequency from interval to interval.
METHOD FOR CONSTRUCTING A CUMULATIVE FREQUENCY GRAPH
1. First of all, we calculate the actual upper and lower limits of the class intervals; i.e., if the class interval is 20-24, then the upper limit is 24.5 and the lower limit is 19.5.
2. We must now select a suitable scale as per the range of the class intervals, and plot the actual upper limits on the x axis and the respective cumulative frequencies on the y axis.
3. All the plotted points are then joined by successive straight lines, resulting in a line graph.
4. To fix the origin of the x axis, an extra class interval with cumulative frequency zero is taken.
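A minimal Python sketch of these four steps with hypothetical grouped data; the extra zero-frequency interval starts the curve at the actual lower limit of the first class:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical grouped data: written intervals 20-24, 25-29, ... with frequencies.
upper_limits = np.array([24.5, 29.5, 34.5, 39.5, 44.5])  # actual upper limits
freqs = np.array([3, 4, 5, 2, 1])

# Extra interval below the data with cumulative frequency zero (step 4),
# so the curve starts from the actual lower limit of the first interval.
x = np.concatenate(([19.5], upper_limits))
cum_freq = np.concatenate(([0], np.cumsum(freqs)))

plt.plot(x, cum_freq, marker="o")
plt.xlabel("Actual upper class limit")
plt.ylabel("Cumulative frequency")
plt.title("Cumulative frequency graph (ogive)")
plt.show()
```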