Central Issues in Scientific Methodology
Robert D. Boyd
This chapter is not intended to cover all the major issues of scientific methodology but rather to discuss those that I have observed to be most frequently misunderstood. The topics considered here unavoidably overlap subjects discussed elsewhere in the book, because these issues are obviously interrelated. Nevertheless, I hope my particular perspective on them will prove useful to adult education researchers. My positions on these matters may not entirely agree with those taken by other contributors to the book, and thus I urge the reader to carefully examine the complementary and contrasting positions on logical grounds compatible with scientific investigation.
These issues are taken up in a number of scholarly and informative books, some of which are cited in the References. In particular I recommend two encyclopedic sources for the adult educator who is interested in pursuing a personal reading program in the nature, practice, and problems of research: Buros (1972) and
The two major interrelated problems in adult education research examined in this chapter are theoretical frameworks and methodological difficulties.
In discussing theoretical frameworks, writers usually give scant attention to the psychological and social motives that prompt research. And although such motives are legitimate subjects of inquiry in themselves, the focus of this chapter is not on why individuals engage in empirical studies but on how they do so and what methods they use. These two aspects of research must be clearly differentiated.
Conjecture. The formal phase of scientific research begins with a conjecture, a universal statement that sets forth a speculative proposition. It is a universal statement, in contrast to a singular statement, in that it does not deal with a specific event, situation, or relationship. An example of a conjecture is this: Severe inflation in a democratic state during an election year brings the opposition party into power. Conjectures are not whims or flights of fancy; any conjecture proposed as a serious scientific explanation is grounded in some theoretical framework, which is constructed of logical assumptions and propositions and the careful specification of central terms. Frequently, but not necessarily, the framework is supported by empirically accepted data. As an illustration, consider the following hypothetical case.
An adult educator who has made use of small groups has begun to wonder whether he is making the best use of the technique. (Here is the motive, but it has no part in the formal considerations
involved in doing scientific research.) Thinking about his problem, he identifies his major concern: how to compose the best instructional groups. Since he has no sound hunches (no conjectures) to guide him, he decides to go to the literature. His reading reveals many insights, not the least of which is that existing research does not provide a clear answer. Now challenged to seek answers by conducting his own research, he phrases his concern as a question: What membership composition will produce the greatest amount of learning, satisfaction, and self-esteem? The question defines the general area of the problem, but it does not provide a workable guide for doing research. What is needed is a conjecture, a statement about what the educator believes to be the best membership composition. From a reading of the literature he proposes that the best groups are those whose members complement each other.
Once the initial conjecture has been made, one might assume that the researcher is ready to begin developing his research design. But he had better be careful, for it is at this point that he may unwittingly fall into a fundamental methodological blunder: letting the theoretical framework dictate the method. This warning may seem contradictory in light of my argument that a theoretical framework is necessary. The question naturally arises, What difference does it make if the theoretical framework does dictate the method?
The theory of operant conditioning can be used to illustrate the answer. In this case the conjecture is that if a researcher provides positive reinforcement to a given sequence of behavior, he can shape a behavior pattern. To test the proposition, he sets up a situation in which he can control reinforcement and behavior. For instance, every time an adult student states his position on a value-laden assertion made by the researcher acting as instructor, the researcher praises the adult's behavior. All other responses are ignored. The researcher assumes that this method tests the theory of operant conditioning, but it does not test the theory or, more specifically, the conjecture set forth above. Why not? Because the method is synonymous with the theory: both the theory and the method propose that if you do x, you get y, but there is nothing in the method that eliminates the possibility that other factors as well as x can produce y. Thus if the proposition is that y can only be
brought about by doing x, the researcher should set up an experiment in which x is absent and test for the presence of y. More specifically, the conjecture should be stated as follows: Desired behavior patterns cannot be shaped in the absence of positive reinforcers. The method must test the conjecture and be in harmony with the theoretical framework on which the conjecture is based but never be synonymous with the conjecture.
Another way to put this problem is to say that the researcher should not try to verify the conjecture but to falsify it. The certainty of proof is illusory. All one can do is either show the conjecture to be false or fail to show that it is false. The method of falsifiability requires that the conjecture be stated in such a manner that it denies the possibility of a given outcome. This notion of falsifiability may be entirely new to the reader, and yet its central place in doing science cannot be ignored. Exploring the idea in depth would go far beyond the scope of this chapter, which can only sketch its outlines. Those who wish to pursue the principle in greater detail can find a brilliant presentation in Popper's book The Logic of Scientific Discovery (1959).
If proof is necessary to verify a conjecture, and if proof is defined as sufficient evidence to establish truth, and if truth, in turn, is an absolute, then it logically follows that there is no proof in scientific enterprises, because there is no current means by which to assign an absolute quality to any human experience. But a researcher can be certain at a given time that there is evidence to show a statement to be false. This point may be stated in another way: If universal statements or conjectures are tested by means of singular statements (statements about events) that have been included in the proof of a given conjecture, one can never assert that the conjecture has been proven. One may, however, be assured that the researcher has not yet falsified the conjecture. Although this is a basic proposition, it is not as central to the present discussion as one that refers directly to method: A method must be established to test that which is denied by the theory rather than to look for that which the theory predicts. And thus preceding such a method must be a testable conjecture, a conjecture that puts a theory in jeopardy. The conjecture takes the form of a universal statement which denies the possibility of some condition or event.
This can be exemplified by the hypothetical adult education study described above, in which the researcher proposed that the most effective instructional groups would be those whose members complement each other. The definition or characteristics of complementarity must be based on some theoretical framework. Of course, the measure of most effective would have to be defined operationally as well, but for the purposes of the example only the operational definition of complementarity will be discussed. In this case the researcher decides to use the epigenetic theory of psychosocial development set forth by Erikson (1950) as the source of his definition. He then focuses on what he considers to be his major conjecture, asserting that those groups whose members are working on similar identity crises will be the most complementary and therefore highest-achieving, or "best," groups. With this conjecture, he also denies that a group constituted of members working on dissimilar identity crises will compose the highest-achieving group. In terms of my foregoing argument, the researcher should be equally interested in both types of group. It would be incorrect to think of the group whose members have dissimilar identity crises as the control group. Actually, both such groups are the experimental groups.
The use of a null hypothesis is in agreement with the idea of falsifiability. The problem with a null hypothesis is simply that it does not provide the severest test of the theory. The null hypothesis states that there will be no difference between E, the experimental population given treatment T, and E1, the control population which does not receive treatment T. If no significant difference (defined in some acceptable manner) is found (either in favor of or against treatment T), that part of the theory which concerns treatment T is dismissed with respect to the particular variables examined in the experiment. There is no question that this is a form of falsifying a hypothesis. The more severe test, however, would be to deny that population E1 could achieve equality with population E. That is to say, the researcher would deny that population E1 could produce the results which treatment T is designed to develop.
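The logic of testing the no-difference hypothesis can be sketched in code. The following is only an illustrative sketch, not part of the hypothetical study: the achievement scores for populations E and E1 are invented, and a simple permutation test stands in for whatever significance test a researcher might properly select.

```python
import random
from statistics import mean

def permutation_p_value(experimental, control, n_permutations=10_000, seed=0):
    """Estimate how often a random relabeling of the pooled scores
    yields a mean difference at least as large as the observed one."""
    rng = random.Random(seed)
    observed = mean(experimental) - mean(control)
    pooled = experimental + control
    n = len(experimental)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = mean(pooled[:n]) - mean(pooled[n:])
        if diff >= observed:
            count += 1
    return count / n_permutations

# Hypothetical achievement scores: E received treatment T, E1 did not.
e_scores = [78, 85, 82, 90, 88, 84]
e1_scores = [75, 80, 77, 83, 79, 81]

p = permutation_p_value(e_scores, e1_scores)
# A small p counts as falsifying the no-difference hypothesis for these
# variables; a large p means the researcher has failed to falsify it.
print(f"estimated p = {p:.3f}")
```

Note that even a small p only dismisses the no-difference statement; it does not, by itself, meet the severer test of denying that E1 could ever equal E.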
In this connection, I should clarify the difference between a conjecture and a hypothesis. I have spoken of the former as a universal statement, but a hypothesis is also a universal statement.
The difference is not in any quality of universality but in the quality of refinement. A conjecture is a broad statement setting forth a theoretical proposition. A hypothesis sets forth the specific criteria with which one determines whether the statement has been falsified or corroborated. The terms conjecture and hypothesis are used with these meanings throughout the chapter.
Relation of Concepts and Methods. Theoretical concepts are not insular; their meanings arise from and exist in particular theoretical frameworks. And yet the research literature is replete with examples of researchers who are investigating concept x moving back and forth among findings from studies based on theories A, B, and C, without attempting to resolve the ontological and epistemological conflicts that clearly exist among the theories. The problem is evident not only in the work of novices but in the work of recognized scholars such as Ausubel (1968). In discussing the concept of anxiety, Ausubel attempts to relate findings from research on such diverse theories or fields as neobehaviorism, cognitive psychology, field psychology, and ego psychology. He makes no attempt either to reconcile the conflicts among the salient assumptions of the various theories or to examine the different specifications of anxiety.
To avoid such a basic error, one must examine two central points. First, one must realize, and make decisions on this basis, that concepts are, as I have said, theoretical constructs and as such are inescapably embedded in a theoretical framework. A researcher must not be deceived by believing that because the concepts are presented using identical words, their meanings are also identical. Concepts do not transcend theories. And second, the researcher must therefore take great care in selecting instruments based on theories different from the theory he is employing in structuring his project. He must avoid choosing an instrument that manifests a conflict between the assumptions of the theory on which the instrument is based and the assumptions of the theory used in his study.
These two points can be readily illustrated by means of the case of the hypothetical researcher presented above. In that case, the final conjecture was that groups composed of members working on dissimilar identity crises will reach a lower level of achievement
than groups whose members have similar identity crises. Now his problem is to identify the means of collecting data. To do so, he needs to determine the identity crises of those participants who will be included in the study. The researcher may first think of an interview as a means of collecting data. He may believe that by developing an interview schedule designed to get at aspects of each of the eight epigenetic stages, he might get his subjects to provide the necessary information. Aside from the matter of rapport and the need for carefully structured questions in the interview schedule (discussed later), the issue is the compatibility between instrumentation and theory. The interview initially appears to have merit, because adult educators know from past research that adults generally are more receptive to an interview than to a pencil-and-paper instrument. However, the proposed procedure has a basic flaw: the assumption that the participants are able to reveal their identity crises to the interviewer through self-examination. Such an assumption ignores the theory of the unconscious, which is an essential aspect of ego psychology, on which Erikson's epigenetic stages are based. The researcher must employ a data-collection technique that takes into account the unconscious state. To make use of any instrument that fails to account for unconscious motivation immediately jeopardizes the validity of the data. Again, there must be complete harmony between the data-gathering instruments and the theory which serves as the framework for the study. The key point in this example is that whatever method of data collection is employed, it must allow for the theoretical assumptions related to the variables which are being measured or it will be invalid theoretically.
Reliability and Validity. Any serious discussion of method leads to the issues of reliability and validity, because of the central questions these two ideas raise in scientific research. The need for reliability focuses the researcher's attention on matters of stability and dependability. The word stability here refers to obtaining similar data over time. Dependability is used in the sense of trustworthiness; that is, the observations of judge A are to be counted on as being accurate within some acceptable limits.
Validity is an elusive idea. The issue specified by the term validity is whether or not the data in some instances can be presumed to be empirical evidence of the variable being examined. The researcher may argue on the grounds of four kinds of validity: concurrent, construct, content, or predictive validity. Concurrent validity has to do with two sets of data, one of which has already received some acceptance as being what it is supposed to be. Construct validity concerns a theoretical relationship between the variable under study and another variable that has a better established empirical base; the validity of the first variable is asserted to rest on the existence of the conjectured relationship. Content validity, sometimes referred to as face validity, is based on establishing an isomorphic relation between, for instance, a test item and the content from which the item was drafted. Predictive validity has to do with the correspondence between the scores on the variable proposed on a predictor and subsequent scores on the variable of predicted performance; that is, between scores on a representative sample taken at one time and a performance sample taken later.
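Predictive validity, in particular, is commonly summarized by the correlation between predictor scores and the later performance scores. As a sketch only, with invented scores for six hypothetical adult students, the coefficient can be computed as follows.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between predictor scores and later performance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical data: entrance-test scores, then course grades a term later.
predictor = [55, 62, 70, 48, 80, 66]
performance = [58, 60, 75, 50, 82, 64]

r = pearson_r(predictor, performance)
print(f"predictive validity coefficient r = {r:.2f}")
```

The closer r is to 1.0, the stronger the claim that scores on the predictor correspond to scores on the predicted performance; how large a coefficient is acceptable must itself be argued from the study's framework.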
This introduction to reliability and validity is intended only as the barest reorientation. If the review confuses more than it enlightens, the reader may find it helpful to read one of several good references, such as Cronbach (1969), Festinger and Katz (1953), Selltiz and others (1959), and Stanley (1969). The discussion here takes up the specific issues researchers encounter in considering questions of reliability and validity in actual empirical studies.
As indicated, reliability connotes stability and dependability. The need to be concerned about stability does not mean, however, that all data must be stable in the sense that over an extended period a researcher will get the same answers to the same questions. If one holds that human development is manifested in different sets of interests, it would be illogical to expect certain human-development variables to be stable, since they are expected to change. The important point is that stability must be viewed within the theoretical framework on which the study is based. If one specifies that the variable x within the theoretical framework is constant, then
one would expect that two administrations of a self-report instrument, at time A and time B, would produce very similar data. However, one should not conclude from this discussion that a researcher can dismiss the issue of stability when a developmental theory is employed. The investigator is still responsible for reporting on the stability of the data. His or her task is to specify the limits of the period in which the same answer should be expected to be given to the same question. The researcher's specification is based on the theoretical framework of the study. There are certain established ways to examine the stability of the data; the means employed depend on how the data are related to the conjecture being tested. My objective here is simply to point out the methodological obligation of the researcher.
If the conjecture posits little room for error, the reliability test the researcher uses should permit only a small margin of difference between the sets of data. For example, in the study of instructional groups, one might conjecture that the ideas of low-status members will never lead the group toward new discussion topics. The judges employed to identify the initiation of new topics within instructional groups should reach total agreement if the data are not to be contaminated by differences in the coders' judgments. Just as the definition of stability completely reflects pertinent aspects of the study's theoretical framework, so the selection of reliability tests must be made on the basis of meeting the data demands of the conjecture. There is a congruent relationship among the concept of stability, the reliability test, and the theoretical framework. And the researcher is obligated to develop that relationship in his research project. Particular procedures for examining reliability can be found in such valuable sources as Cronbach (1969) and Guetzkow (1950).
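The agreement between judges can be quantified in two common ways: simple percent agreement and agreement corrected for chance. The sketch below uses invented codings in which two hypothetical judges decide whether each group utterance initiated a new discussion topic; under the strict conjecture above, anything short of total agreement would contaminate the data.

```python
from collections import Counter

def percent_agreement(judge_a, judge_b):
    """Proportion of units the two judges coded identically."""
    matches = sum(a == b for a, b in zip(judge_a, judge_b))
    return matches / len(judge_a)

def cohens_kappa(judge_a, judge_b):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    n = len(judge_a)
    p_o = percent_agreement(judge_a, judge_b)
    counts_a, counts_b = Counter(judge_a), Counter(judge_b)
    p_e = sum((counts_a[c] / n) * (counts_b[c] / n)
              for c in counts_a.keys() | counts_b.keys())
    return (p_o - p_e) / (1 - p_e)

# Hypothetical codings: did each utterance initiate a new topic?
judge_a = ["new", "old", "old", "new", "old", "old", "new", "old"]
judge_b = ["new", "old", "old", "new", "old", "new", "new", "old"]

print(f"agreement = {percent_agreement(judge_a, judge_b):.2f}")
print(f"kappa     = {cohens_kappa(judge_a, judge_b):.2f}")
```

Which statistic, and what threshold, the researcher should adopt depends on how much room for error the conjecture permits.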
The issue of the dependability of the data brings the researcher directly to the problem of possible false answers by a respondent. Dependability, as I pointed out earlier, is an aspect of reliability. Can one depend on the data's being accurate? If an individual from whom a researcher is collecting data chooses not to provide accurate information, then of course the data will not be accurate. That is a serious problem. The researcher, however, should not immediately conclude that an individual has supplied
a false answer simply because the researcher finds conflicting pieces of data.
For example, if at a given time an individual reports that he had difficulty with mathematics, the researcher may expect that adult to say that he also had difficulty with arithmetic. If the adult fails to do so at a later time, the researcher may conclude that the adult was not consistent in the answers he gave and obviously was providing a false answer. The researcher's conclusion may be correct. However, before accepting it, the researcher needs to consider the following questions: (1) Does the participant view x (arithmetic) as a subset of y (mathematics)? (2) Are the questions phrased identically, and does the subject perceive them as being identically phrased? And (3), is there a possibility that the participant's original answer was insufficiently differentiated? (As he thought about it, he began to see that the difficulty he had experienced was with algebra, and he actually even enjoyed geometry and arithmetic.) The point is that the investigator should not immediately perceive one of two conflicting answers to be purposely false. The problem may be in the structure of an interview question or a questionnaire item.
This example illustrates the reason for pretesting all data-gathering devices. In scientific research, the question of the credibility of data-gathering instruments and procedures is crucial. Another essential aspect of credibility is the question "Are the data that the researcher proposes to collect directly related to the conjecture?" This, of course, brings us back once again to the issue of validity. Validity is a more fundamental problem than reliability, because the reliability of the data would not matter if the data could not be used in a direct test of the conjecture.
Earlier I identified four kinds of validity: concurrent, construct, content, and predictive. Each is influenced by some common factors, and although I discuss these factors in relation to content validity, I want to remind the reader that they are also important in connection with other kinds of validity.
As the researcher concerns himself with instrument validity, he must examine how well a sample represents some general class. Thus, an achievement-test item that has been drawn from part of the subject matter of a course is assumed to represent the cluster of
information from which the item was drawn. Understanding the appropriate use of content validity in the development of instrumentation is essential. For example, let us return to the earlier conjecture that instructional groups whose members are working on dissimilar identity crises will reach a lower level of achievement than groups whose members have similar identity crises. Using Erikson's theory, the researcher derives sets of specifications for eight epigenetic identity crises and, with much work, develops statement items that express those unique specifications. Each of these steps must be subjected to tests of validity. In the first step the researcher writes the specifications and then must determine whether these are correct in terms of the theory from which he has drawn them. Accordingly, he gives the specifications to two or more judges, who decide on their correctness. If the judges are in total agreement, the researcher is ready to move on to the second step: testing the statement items developed from these specifications. This time the researcher asks the judges to determine whether each item is (1) an aspect of one of the eight crises, (2) classified in the category for which it was designed, and (3) exclusive, that is, not classifiable in more than one category.
Clearly, the production of an instrument, as far as content validity is concerned, is generally a very difficult undertaking. And thus it is naive to believe that the first attempt will produce success. In the case study discussed above, the investigator used judges to determine content validity. This is a common practice, and one that raises two issues that should be examined before I extend the discussion of validity. One issue has to do with the word objectivity and the other with the word intersubjectivity. The two are interwoven, but I shall discuss each separately at first in order to highlight specific points.
Objectivity and Intersubjectivity. The pursuit of objectivity is based on the same order of mistaken conception as the pursuit of proof. The idea of objectivity, as if it were above or outside or separate somehow from people, is a fantasy which has too long intruded into the efforts of persons attempting to do scientific research. All data, in the final analysis, rest on interpretations, and interpretations are the product of individuals. This may strike the reader as an extreme position. Yet there appears to be no way of
escaping the conclusion that the interpretation of data depends on the agreement that can be achieved among knowledgeable persons. What was accepted as objective truth at the time of Newtonian physics is, of course, rejected today. The objectivity was based on Newton's ability to convince a group of knowledgeable persons that his interpretations accounted for more acceptable observations than did the existing theory. Two educational psychologists can perceive what may be assumed to be a mutually observed event. And yet their interpretations may differ fundamentally. A psychologist who upholds operant conditioning will not see the same event as a cognitive psychologist. This is so, at least in the manner in which each describes the event.
Reaching the truth or being objective is also a problem faced by legal systems. In British and American law, the solution adopted is in essence identical to that which has been accepted by the world of science: the mechanism of the jury. In both cases the criteria are similar. The jury must be thoroughly informed or knowledgeable, must possess the attitude of open-mindedness, and must apply the test of falsifiability to all conjectures and proofs set before it.
The reader may not, as yet, see the need of questioning the status of objectivity. He may believe that objectivity is determined through carefully structured definitions or through mechanistic devices. But anyone who believes that one can achieve objectivity through such means is either failing to see the situation for what it is or refusing to accept it. The pursuit of definitions can be readily shown to be rooted in operationalism. The doctrine of operationalism has merit if it is not pushed to its logical extension. Definitions must be established at some level by a set of conventions agreed upon by a jury of knowledgeable persons; otherwise, the problem becomes one of definitions ad infinitum. In somewhat poetic terms, the problem is that there is no means known by which one may escape from the boundaries of his being. This problem and others of a philosophical nature are treated extensively in the following references, which deal with scientific explanation: Berger and others (1962), Dewey and Bentley (1949), Kaufmann (1968), Lakatos and Musgrave (1970), Mandler and Kessen (1959), Popper (1959), Sherwood (1969), and Stogdill (1970). Likewise,
the use of mechanical instrumentation does not provide a way out of the problem. Few, if any, researchers in the field of adult education employ such gadgetry as galvanoscopes, electroencephaloscopes, and impulse recorders. But if an investigator were to do so in the belief that this would bring an objectivity that eliminates the human dimension, he need only be reminded that the reading of the records made by these machines is done by humans. So at this point the researcher again faces the problem of definitions. In addition, this person should not overlook the intersubjective agreements that were required in determining the utility of constructing mechanical and nonmechanical data-collection instruments.
Earlier I argued that although there is no way yet known to identify what is true, there are means by which to recognize what is false. Reality is similar to truth in this respect. As one cannot be certain of knowing truth, one cannot be sure of knowing reality. The establishment of what may be termed reality depends on public agreement on a set of criteria and on the sharing of what may be provisionally called mutual observations (Dewey and Bentley, 1949). Since that idea is fundamental to understanding my position on reality and to the acceptance of intersubjective reality, it should be carefully reexamined. In the briefest terms, no one person has a special talent for identifying and knowing reality. What sense of reality there is comes into being by intersubjective (people-to-people) agreement on a set of criteria. For example, in scientific research one criterion is the establishing of conjectures that can be falsified. Observations corroborate or falsify one's sense of reality. But reality, as far as knowing it is concerned, is arrived at intersubjectively. Therefore, the reader-researcher should focus on developing good intersubjective agreements.
Categorization. The methodological issue for scientific research becomes the level of intersubjective agreement that is appropriate for the conjectures that are being tested. However, regardless of the level which is proposed as acceptable, certain criteria must be met in structuring one's data-gathering procedures. Categorization is basic to any form of intersubjective agreement that may be reached. In structuring categories the researcher must face and handle the demands of two criteria: the exclusiveness and the inclusiveness of data.
The criterion of exclusiveness is met when one can demonstrate that the characteristics defining any single category do not overlap the characteristics of any other category within a given category system. In category A the characteristics A1, A2, ..., An are not included as characteristics of category B, which has its own attributes B1, B2, ..., Bn. The criterion of inclusiveness demands that the category system applied to a specified type of data account for all the data within that category system. For instance, to specify the steps in problem solving, one must be able to classify all problem-solving behaviors.
In the development of a category system, a researcher must draw upon a theoretical framework to establish the boundaries and to identify the distinguishing characteristics of each class. The defense of these boundaries is based on the specifications derived from the theory. Thus, the next step is the development of the exclusive categories. It consists of submitting the specifications of the categories to a set of judges whose task is to determine whether any of the categories overlap. If they find no overlapping, the categories have met the criterion of exclusiveness.
The criterion of inclusiveness is satisfied in a similar manner. Again, starting with the theoretical framework, the researcher develops a category system which, in effect, he asserts accounts for all behaviors of a given type. There are two points at which the assertion of inclusiveness may be challenged. First, observations by other researchers may suggest categories that are not included in the system. For example, the theory of operant conditioning does not account for repression as a mode of forgetting. Those psychologists who take the position that they can demonstrate repression reject the category system proposed by proponents of the operant-conditioning theory. The second point at which the assertion of inclusiveness may be challenged is in the actual use of the category system. In this phase of testing the category system, the researcher's policy should always be to provide the judges with an empty-cell category. That is, if there are ten categories in the system, the judges should be provided with an eleventh cell for those data they find do not fit into any one of the ten designated categories. The judges are encouraged to try to falsify the category system by identifying data for all eleven cells. If the judges identify data for each of the
ten cells and none for the eleventh, then the researcher assumes that the category system has met the criterion of inclusiveness.
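The two checks described above can be sketched mechanically. In the following illustration the category specifications and codings are invented (a three-category problem-solving scheme, deliberately given one overlapping characteristic), and the empty cell is included so that data failing to fit any category are recorded rather than forced into the system.

```python
def check_exclusiveness(category_specs):
    """Return pairs of categories whose defining characteristics overlap."""
    overlaps = []
    names = list(category_specs)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            shared = category_specs[a] & category_specs[b]
            if shared:
                overlaps.append((a, b, shared))
    return overlaps

def check_inclusiveness(judgments, categories, empty_cell="unclassifiable"):
    """Tally codings per category, including the empty cell offered to judges."""
    tally = {c: 0 for c in list(categories) + [empty_cell]}
    for c in judgments:
        tally[c if c in tally else empty_cell] += 1
    return tally

# Hypothetical specifications for three problem-solving categories.
specs = {
    "orientation": {"defines problem", "states goal"},
    "evaluation":  {"weighs alternatives", "states criterion"},
    "resolution":  {"selects alternative", "states goal"},  # overlap on purpose
}

print(check_exclusiveness(specs))
# "states goal" is shared, so this system fails the exclusiveness test.

codings = ["orientation", "evaluation", "resolution", "joke", "evaluation"]
print(check_inclusiveness(codings, specs))
# Any count in the empty cell falsifies the assertion of inclusiveness.
```

The code only mimics what the judges do; the defense of the boundaries themselves must still come from the theoretical framework.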
The procedures outlined above are the minimum essential steps to be taken if an attitude of scientific inquiry is to be reflected in method. Specifically, a researcher's obligation is to attempt to provide the means by which to falsify his research decisions, for this is the only tenable scientific stance that is at present open to anyone who attempts to provide scientific explanations.
Validity of Data. The discussion so far has treated the issue of content validity in relation to the instruments a researcher may use in a research project. The validity of instruments is one aspect of the issue of validity, but it does not concern the validity of the data in relation to the conjectures being tested. That is, the data that are being collected must provide evidence to test the hypotheses that have been proposed. The instrument that a researcher proposes to use may have a high degree of validity but be inappropriate for testing the study's hypotheses.
The significance of data validity must be stressed in a discussion of method for researchers in adult education. A survey of research studies in education will reveal numerous instances in which the investigator attempted to prove a causal relationship between treatment A and learning B when, in fact, a host of other factors bear heavily on the outcomes. The researcher may try to handle these factors by randomization and by establishing experimental and control groups. Even then, most studies fail to identify a significant difference. One simple explanation is that students who want to learn will do so, even against barriers which have been accidentally or purposely established. However, suppose that a significant difference between the experimental and control groups is found. Does such a difference therefore provide corroborative evidence for the conjecture that group achievement improves with the treatment? The reader should quickly identify a design limitation. Randomization is a means to overcome the confounding effects of variables that cannot be systematically controlled or whose effects cannot be experimentally determined. The problem faced by many research designers is that randomization cannot be employed because they are interested only in investigating the differences between two specific and small groups.
This discussion should not lead the reader to conclude that I oppose correlational and similar studies in adult education. The purposes and values of correlational studies are not the focus of attention here. I want to note in passing, however, that correlation studies are most helpful in giving both direction and refinement to certain kinds of inquiry at the early stages of their development. The central point that must not be lost sight of is that a correlational study cannot yield explanations of causal relationships. The fundamental issue is the researcher's responsibility to demonstrate clearly the validity of the data in terms of the direct testing of the conjectures.
Content validity has been examined in relation to several basic issues. The other types of validity have received little attention, although the issues that have been examined apply as well in most cases to concurrent, construct, and predictive validity. The many topics covered by these various types of validity will not be pursued in depth in this chapter, but a brief discussion of construct validity will serve to highlight a basic issue that has been viewed from another perspective earlier in this chapter.
Construct Validity. Construct validity is rarely discussed in adult education research reports. This is unfortunate, as it is a powerful and theoretically important type of validity. The question may then be raised, Why is it not discussed more often? Although there has been no systematic effort that would answer the question, one might suggest an answer from a cursory reading of the adult education research literature. Such an examination reveals that many authors of empirical studies give little or no attention to examining, testing, or developing theoretical frameworks. These researchers launch an investigation of a certain type of phenomenon without reporting their place in the continuity of testing and developing theories. Even in the published reviews of articles on empirical studies, the authors fail to identify the theoretical threads which tie the series of studies to a conceptual framework. The persons who carry out these empirical studies appear to assume that pieces of information produce knowledge and to forget that the empirical data are at best only a proxy for a mental construct which is an element of a theory. But to assume that knowledge is produced in an additive manner is to overlook a very simple but fundamental
proposition: ideas, facts, insights, and awareness are put together only when one or more organizing principles exist. The phenomenon of forgetting the existence of unhappy events was well known before Freud, but it was Freud who suggested the organizing principle of repression. And the works of Bruner, Goodnow, and Austin (1962), Piaget (1960), and Rapaport (1959), among others, also argue the existence of organizing structures.
Although the failure of adult education researchers to seek this type of validity is regrettable, its limited appearance in the literature points to a more serious issue, which I have already referred to. Many research studies in adult education are reported as if they had no ties to a theoretical base. This failure to link an empirical investigation to a theoretical framework may indicate that the investigator is unaware of the relationship between his study and other studies related to the same theoretical base. The absence of such a link constitutes a hiatus in the advancement of cumulative knowledge.
Instrument Development and Testing. The development and testing of instruments is another area that deserves attention in this chapter. It is a vast subject on which many excellent works have been written, and those who wish to do research should take the time to familiarize themselves with the literature. Among these works are Borgatta and Bohrnstedt (1970), Cronbach (1969), Festinger and Katz (1953), Gage (1962), Lindzey (1960), Selltiz and others (1959), Stephenson (1958), Thorndike (1971), and Travers (1964).
A researcher often discovers that the particular type of data-gathering procedure he requires has not been reported in the literature. He then faces a crucial decision: should the project be abandoned because appropriate instruments do not exist, or should he push on and attempt to develop the needed instruments? The decision should receive very careful review. If the researcher contemplates the development of instruments, he should first seek the advice of competent persons. The plural, persons, is used advisedly. The value of jury decisions has been pointed out elsewhere in the chapter. The rigorous development of instruments for data-gathering purposes necessitates tests of reliability and validity. In addition, the researcher must argue the case that the newly developed instruments gather the appropriate data to test the conjectures. These steps demand a great deal of work and should not be taken without full awareness of the effort required.
Data may be collected in several modes. Observations, interviews, historical analysis, self-reports, and projective techniques are the major types available to adult education researchers. Observations, certain types of interviews, and projective techniques make use of judges or coders. When using coders, the researcher must collect information on their training, on their competence, and on intercoder reliability. He must have acceptable assurance that they know what they are doing, that their performance is up to a conventional standard, and that their biases are not so discrepant that each is giving a different interpretation to the same set of data. The procedures involved in selecting, training, and assessing the performance of judges and coders are both costly and time consuming.
Another way to gather data is to have the participants report on themselves by means of either direct self-reports or one or more projective techniques. Self-report formats fall into two major classes, namely, questionnaires and S or Q sorts. Among the many questionnaire types are the Likert Scale, dyadic and multiple-choice, sentence-completion, ranking, and matching-sets questionnaires. Each has advantages and disadvantages that are determined by the given aims, situations, participants, and research resources. S and Q sorts can also be structured in a number of ways. Not only may distributions be modified, but various steps may be introduced to determine the types of sorting that may be done. At present the Likert Scale questionnaire has a statistical procedural advantage over the other self-report instruments. Computer programs provide simple and direct methods by which to check the interitem reliability of an instrument. This is a feature that cannot be readily ignored. Other interitem reliability procedures are available, but there is none at present that can be applied to data from an S or Q sort that employs a normal distribution. The difficulty can be handled in a reasonably defensible manner. S- or Q-sort items can be structured in a Likert Scale schedule. Data from the administration of this schedule can then be submitted to the interitem reliability programs to determine the interitem reliability of the item statements. Although
the need to clearly specify the variables in any study has been stressed previously, it may be of value to review this point in the context of developing a self-report instrument. Clear specification of each variable is essential to the construction of any self-report instrument. When the items, that is, the individual statements, have been written, they must be submitted to a jury of knowledgeable judges to determine their inclusive and exclusive properties. Undoubtedly, some items will have to be reworked, others discarded, and new items developed. The instrument should not be considered completed until the judges agree about each item.
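The interitem reliability check that the computer programs mentioned above typically perform is Cronbach's alpha. A minimal sketch, computed over a hypothetical respondents-by-items matrix of Likert ratings:

```python
def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of totals),
    where k is the number of items. Higher values indicate that the
    items hang together as measures of the same construct."""
    k = len(scores[0])               # number of items
    def var(xs):                     # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    item_vars = [var([row[j] for row in scores]) for j in range(k)]
    total_var = var([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Four hypothetical respondents rating two items on a Likert scale.
alpha = cronbach_alpha([[1, 2], [2, 2], [3, 4], [4, 4]])
```

This is the statistic that makes the Likert format procedurally convenient: the same computation applies directly to S- or Q-sort items once they have been recast in a Likert Scale schedule, as suggested above.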
This discussion of data-gathering devices has only touched on a selected and limited number of issues dealing with instrumentation. In conclusion, I want to stress that although rigorous examination of research instruments is important, it would be sad and unfortunate if such concerns were to frighten off attempts at creative efforts.
Two major problems are evident in far too many empirical studies reported in the educational-research literature. The first is that researchers fail to establish an appropriate theoretical background for their studies. Consequently, they fail to integrate their studies in coherent and consistent conceptual frameworks. In adult education the problem is even more severe. Studies are proposed without any reference to a theory. Inevitably, a theory is there implicitly, and in most cases the unstated framework is an assortment of conflicting assumptions. The second problem concerns a host of difficulties with methods. If adult educators are to make their scientific research more sophisticated, they will have to address these two problems.