Penn State University Libraries

Focus on Assessment - Jan 11, 2010

Understanding reliability and validity

By Greg Crawford, Ph.D.

When doing any assessment or research, especially if you are using a survey or some other
research instrument, two concepts must be kept in mind: reliability and validity. Reliability is the
easier concept to understand. For an instrument or survey to be reliable, it must give consistent
results. For example, if you use the same thermometer to measure your temperature several times
when you are feeling well, you should get similar readings. Likewise, if you have the same
group of people complete the same survey multiple times, their results should be nearly the same
each time. But beware! Even if a survey or research instrument yields consistent results, it
could be recording incorrect ones, just as a broken thermometer could still give you
a consistent reading. Thus, an instrument can be reliable without being valid.
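
The thermometer example can be made concrete with a small sketch. The readings below are hypothetical, and the "true" temperature of 98.6 degrees Fahrenheit is an assumption made only for illustration. The spread of repeated readings stands in for reliability, and the average distance from the true value stands in for one aspect of validity (accuracy).

```python
from statistics import mean, pstdev

def summarize(readings, true_value):
    """Return (spread, bias) for repeated readings of a known true value."""
    spread = pstdev(readings)            # small spread: consistent (reliable)
    bias = mean(readings) - true_value   # large bias: systematically wrong
    return spread, bias

# Hypothetical readings; the true temperature is assumed to be 98.6 F.
good_thermometer = [98.5, 98.6, 98.7, 98.6]
broken_thermometer = [101.5, 101.6, 101.7, 101.6]  # consistent, but too high

good_spread, good_bias = summarize(good_thermometer, 98.6)
broken_spread, broken_bias = summarize(broken_thermometer, 98.6)

print(f"good:   spread={good_spread:.2f}, bias={good_bias:.2f}")
print(f"broken: spread={broken_spread:.2f}, bias={broken_bias:.2f}")
```

Both thermometers show the same small spread, so both are equally consistent, but only the broken one is off by about three degrees. Consistency alone says nothing about correctness.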


Validity is a more difficult concept that addresses two questions. First, are you measuring what
you think you are measuring? Second, how accurately are you measuring what you think you are
measuring? For example, in libraries, one measure of use, the number of items that are checked
out, is a highly valid and reliable measure, since it does measure one specific type of use, i.e.,
taking the material out of the library. Measuring other types of use, however, can be fraught with
errors. Libraries often count books as being used if they are placed on a book truck for
reshelving. Sometimes the books have been used, but sometimes they have not been. It all
depends on your definition of use. For example, did a student actually read a chapter before
putting the book on the truck? Did they use the index and not find what they were looking for?
Or did the book simply fall off the shelf and get placed on the book truck for reshelving?


To help clarify their research, scholars generally assess several types of validity. Content validity
examines the content of the instrument to determine if that content is appropriate and
representative of what you are seeking to measure. Predictive validity seeks to determine the
degree to which one measure can predict a second measure. Concurrent validity examines the
correlation of what you are measuring with another known variable for which you have a valid
measurement. Construct validity seeks to understand the relationship of your measure to the
theory with which you are working. Internal validity examines the relationship between
your study’s program or intervention (for example, an instruction class) and the outcome
observed, to determine whether the intervention caused that outcome. External validity refers to the ability to
generalize the results of a study’s intervention or program to other settings or to other groups.
Each of these types of validity can be difficult to measure, but considering each one in the design
of your study and your research instrument, especially a survey, is crucial.
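
As one illustration, concurrent validity is often examined by correlating scores from a new instrument with scores from an established measure of the same thing. The sketch below computes a Pearson correlation by hand. The function name `pearson_r` and both sets of scores are hypothetical and were made up for this example. A high correlation is supporting evidence, not proof, that the new instrument is valid.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the summed squared deviations
    sx = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sy = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical scores for the same six respondents on two instruments.
new_survey_scores = [12, 15, 9, 20, 17, 11]
established_scores = [14, 16, 10, 22, 18, 12]

r = pearson_r(new_survey_scores, established_scores)
print(f"correlation with established measure: r = {r:.2f}")
```

The same calculation applies to test-retest reliability (correlating two administrations of the same survey) and to predictive validity (correlating a measure with a later outcome). Only the choice of the second variable changes.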