"I Do and I Understand[3]:"
Mastery Model Learning for a Large Non-Major Course

Mark Urban-Lurain

Michigan State University

urban@msu.edu

 

Donald J. Weinshank

Michigan State University

weinshan@cse.msu.edu

 

1. Abstract

We describe the infrastructure we have created for a large-enrollment (1800 students per semester) non-major course. The course combines collaborative, lab-based problem-solving instruction with modified mastery-learning assessment. The infrastructure supports continuous improvement in response to client department needs, incoming student experience, course design flaws and computing platform upgrades.

1.1 Keywords

Non-majors; Web-based techniques; assessments; large enrollment; mastery learning

2. Introduction

Designing instruction to provide "computing literacy" for non-Computer Science students is a challenge faced by computer science departments in many colleges and universities [7, 10, 6]. To address this challenge in the undergraduate curriculum, faculty in the Computer Science department at a Big Ten University recently designed a new introductory service course for non-Computer Science majors. We first determined the needs of the various client departments by conducting a series of structured interviews with the chairs of the 67 departments whose students will enroll in the course. Some themes we identified are:

As a result of these interviews, we designed an introductory computer science course that addresses both the computing skills desired by the client departments and the CS conceptual underpinnings of these skills that we believe are necessary for long-term competence. This course was offered for the first time during summer 1997. Full implementation for 1800 students per semester began in fall 1997.

When we designed the instruction for this new course, we reviewed the courses it was replacing to determine what had worked and what had failed in the instructional design of these courses. We identified two major design limitations:

  1. The courses were monolithic: any change in course content required a change in course delivery and infrastructure. For example, one course was based on Visual Basic for DOS, which was state-of-the-art when we introduced the course in 1992. By 1996, however, this language was officially unsupported by Microsoft.
  2. The assessment and evaluation were heavily dependent on surrogate measures such as objective exams and quizzes that tested static recall. Less than half of the assessment required students to demonstrate performance in a computing environment.

These factors led us to a course design that:

All of the above infrastructure components are "content-free." We update and revise the course content every semester without altering the course infrastructure. Thus, we continuously change the course content in response to changing client department needs, student background and experience, design flaws and computing platform changes.

3. Performance-Based Assessment

Due to large enrollment (approximately 1800 students per semester) and institutional resource constraints (two full-time faculty and a large number of Teaching Assistants), cost-effectiveness of the instructional design is a major consideration. Performance-based assessments often require more labor to create, administer, and evaluate than do traditional multiple-choice or computer-based training exams [9]. In order to optimize the use of resources and to ensure consistent evaluation of bridge tasks (BTs) while working with Teaching Assistants from a wide variety of backgrounds, we considered several factors:

In this paper, we outline the design of this software infrastructure. We discuss how this system is used to create, administer and assist in the evaluation of the BTs. Finally, we discuss the outcomes from the first full year the course was offered.

3.1 Creating Assessments

Figure 1 shows the logical structure of a BT. Each BT has a series of (M) dimensions. Each dimension consists of (n) instances. An "instance" is the smallest unit of text that can define a problem, sub-problem or a task that is representative of that dimension. For example, we have a dimension to test the concept of using Boolean operators to narrow searches in databases. Within that dimension, there are about 30 instances. Each instance requires a student to search the appropriate database to locate citations appropriate for a particular problem. Depending upon the problem, each search requires not only different search terms but also the use of different combinations of Boolean operators.

Figure 1

A single BT is constructed by randomly selecting one instance from each dimension, so each student receives a different BT. In addition, instances are selected so that a repeated BT does not reuse instances that the student received on previous attempts at that BT.
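The selection rule described above can be sketched as follows. This is a minimal illustration, not the course's actual software: the function name, the dictionary layout and the instance identifiers are all hypothetical, and raising an error when a dimension runs out of unused instances is our assumption, since the text does not say what happens in that case.

```python
import random


def build_bridge_task(dimensions, seen=None, rng=random):
    """Pick one instance per dimension, skipping instances already seen.

    dimensions: dict mapping a dimension name to a list of instance ids.
    seen: dict mapping a dimension name to the set of instance ids this
          student received on earlier attempts at the same BT.
    rng: the random module or a random.Random instance (for repeatability).
    """
    seen = seen or {}
    selection = {}
    for name, instances in dimensions.items():
        already_used = seen.get(name, set())
        fresh = [inst for inst in instances if inst not in already_used]
        if not fresh:
            # Assumption: the source does not describe this case.
            raise ValueError(f"no unused instances left in dimension {name!r}")
        selection[name] = rng.choice(fresh)
    return selection


# Hypothetical dimensions with hypothetical instance ids.
dimensions = {
    "boolean_search": ["search-01", "search-02", "search-03"],
    "second_dimension": ["task-01", "task-02"],
}
first_attempt = build_bridge_task(dimensions, rng=random.Random(0))
seen = {name: {inst} for name, inst in first_attempt.items()}
second_attempt = build_bridge_task(dimensions, seen=seen, rng=random.Random(1))
```

Filtering out the already-seen instances before drawing keeps the draw uniform over the remaining instances, which is one simple way to satisfy the no-repeat requirement.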

3.2 Delivering Assessments Via the Web

All of the BT instances are stored as HTML in a SQL database. For each student, the database combines the individual instances for each dimension for that student's BT and generates a custom Web page. Bridge tasks are delivered in real time to the student's computer screen for administration in a supervised laboratory.
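As a rough sketch of how stored HTML instances might be combined into one page per student, the example below uses Python's built-in sqlite3 module as a stand-in for the SQL database. The schema, the table and column names, and the sample rows are hypothetical; the text does not specify the database product or its layout.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE instance (
    instance_id  INTEGER PRIMARY KEY,
    dimension_no INTEGER NOT NULL,   -- position of the dimension within the BT
    html         TEXT NOT NULL       -- instance text stored as an HTML fragment
);
CREATE TABLE assignment (
    student_id  TEXT NOT NULL,
    bt_level    TEXT NOT NULL,       -- for example '1.0' or '1.5'
    instance_id INTEGER NOT NULL REFERENCES instance(instance_id)
);
"""


def render_bt_page(conn, student_id, bt_level):
    """Join a student's assigned instances into one page, in dimension order."""
    rows = conn.execute(
        """
        SELECT i.html
        FROM assignment AS a
        JOIN instance AS i ON i.instance_id = a.instance_id
        WHERE a.student_id = ? AND a.bt_level = ?
        ORDER BY i.dimension_no
        """,
        (student_id, bt_level),
    ).fetchall()
    body = "\n".join(html for (html,) in rows)
    return f"<html><body>\n{body}\n</body></html>"


def demo_connection():
    """Build an in-memory database holding two made-up instances."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO instance VALUES (?, ?, ?)",
        [(1, 1, "<p>Dimension 1 task text</p>"),
         (2, 2, "<p>Dimension 2 task text</p>")],
    )
    conn.executemany(
        "INSERT INTO assignment VALUES (?, ?, ?)",
        [("student-1", "1.0", 2), ("student-1", "1.0", 1)],
    )
    conn.commit()
    return conn


page = render_bt_page(demo_connection(), "student-1", "1.0")
```

Storing the assignment separately from the instance text means the same record can later drive grading, since it identifies exactly which instances a student received.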

3.3 Evaluation Criteria

Evaluating the performance of 1,800 students' BTs requires maintaining a stringent set of evaluation criteria for each of the instances in the BT database. We accomplish this by maintaining a set of database tables containing evaluation criteria in a one-to-many relationship for each of the BT instances. This relationship is depicted in Figure 2.

Figure 2

Each instance of a BT dimension usually consists of more than one sentence – often a paragraph or more – of text because the dimensions must be combined randomly and still make sense. While the granularity of multiple sentences of text is sufficient for creating the BT text, it is too coarse for the specification of the grading criteria. To maintain a high degree of inter-grader reliability and to reduce ambiguity of judgment on the part of the grader, we specify the pass/fail criteria with a still finer degree of granularity.

Each instance of every dimension in the bridge task has associated with it one or more criteria. Each criterion is designed to have the finest possible granularity so that the graders can quickly and consistently evaluate that criterion as pass or fail. The pass/fail results for each of the criteria are combined to determine whether or not a student has passed a particular dimension. The pass/fail results for each dimension are then combined to determine whether a student has passed or failed the BT as a whole.

In general, the goal of mastery model evaluation is for the student to demonstrate mastery of a body of knowledge. However, we do not define mastery as meaning that the student has to pass all criteria and all dimensions. To allow for minor student errors that do not indicate a lack of understanding, both dimensions and criteria can be either mandatory or optional. Students must pass all mandatory criteria to pass that dimension. If a dimension has optional criteria, students must pass some portion (e.g., 3 out of 5) of the optional criteria together with all mandatory criteria to pass the dimension. Likewise, dimensions can themselves be mandatory or optional. Students must pass all mandatory and some portion of the optional dimensions to pass the BT.

For example, suppose a BT were composed of the three dimensions shown in Figure 2. Instance i of dimension 1 has the criteria shown in the figure. Assume that criteria i and i+1 are mandatory and criteria i+2 and i+n are optional, with a requirement for passing one of the two optional criteria. To pass dimension 1, the student must pass criteria i and i+1 and pass at least one of criteria i+2 and i+n.
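The two-level pass/fail rule can be sketched in a few lines. The class and function names are hypothetical, and the individual pass/fail outcomes, as well as the choice of which dimensions are mandatory, are invented purely to exercise the rule; only the structure (all mandatory items plus a required number of optional items) comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class Result:
    name: str
    mandatory: bool
    passed: bool


def passes(results, optional_needed):
    """True if every mandatory item passed and enough optional items passed."""
    mandatory_ok = all(r.passed for r in results if r.mandatory)
    optional_passed = sum(1 for r in results if not r.mandatory and r.passed)
    return mandatory_ok and optional_passed >= optional_needed


# Dimension 1 from the example: criteria i and i+1 are mandatory,
# criteria i+2 and i+n are optional, and one of the two optional
# criteria must be passed. The outcomes below are made up.
criteria = [
    Result("criterion i", mandatory=True, passed=True),
    Result("criterion i+1", mandatory=True, passed=True),
    Result("criterion i+2", mandatory=False, passed=False),
    Result("criterion i+n", mandatory=False, passed=True),
]
dimension_1_passed = passes(criteria, optional_needed=1)

# The same rule is applied one level up, to the dimensions of the BT.
# Which dimensions are mandatory here is an assumption for illustration.
dimension_results = [
    Result("dimension 1", mandatory=True, passed=dimension_1_passed),
    Result("dimension 2", mandatory=True, passed=True),
    Result("dimension 3", mandatory=False, passed=False),
]
bt_passed = passes(dimension_results, optional_needed=0)
```

Because criteria and dimensions follow the same rule, a single function can serve both levels of the hierarchy.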

This hierarchical evaluation provides a flexible, general purpose system for creating, maintaining and updating assessments as the course content evolves.

4. Outcomes

We offered the pilot version of this course during the summer 1997 semester to about 170 students and, during the 1997-98 academic year, to around 3500 students. This discussion reviews the results from fall semester 1997 (FS97, N = 1680) and spring semester 1998 (SS98, N = 1770). There are five BTs (1.0, 1.5, 2.0, 2.5 and 3.0). Students may take one BT per week, for a total of 12 attempts to pass the five BTs during the semester.

4.1 BT Repeat Rates

Figure 3 shows the repeat rates for each BT for FS97 and SS98. Each column shows the percentage of students who eventually passed each BT. Note that approximately 95% of the students pass the 1.0 BT, while just over 60% eventually pass the 3.0 BT. The pass rate declines because students must pass a given BT before attempting the next one, so a student who does not pass one BT never reaches the later ones.
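The sequential rule above, together with the limit of 12 weekly attempts, can be sketched as a small function. The function name and the input format (one boolean per attempt, in order) are hypothetical, and returning 0.0 when no BT has been passed is our assumption rather than a stated course policy.

```python
BT_SEQUENCE = [1.0, 1.5, 2.0, 2.5, 3.0]  # the five BTs, in required order
MAX_ATTEMPTS = 12                        # one attempt per week


def highest_bt_passed(outcomes):
    """Return the level of the highest BT passed, or 0.0 if none was passed.

    outcomes: list of booleans, one per attempt in chronological order.
    Each attempt is always at the next BT the student has not yet passed,
    so a failed attempt is simply repeated at the same level.
    """
    passed_count = 0
    for outcome in outcomes[:MAX_ATTEMPTS]:
        if passed_count == len(BT_SEQUENCE):
            break  # nothing left to attempt
        if outcome:
            passed_count += 1
    if passed_count == 0:
        return 0.0
    return BT_SEQUENCE[passed_count - 1]


# A made-up student who needs two tries at the 1.5 BT.
example = highest_bt_passed([True, False, True])
```

Because every failed attempt consumes one of the 12 weeks, repeated failures early in the sequence directly limit how far a student can progress.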

Each BT column shows the percentage of students who passed that BT on the first try, second try, and so on. Notice that most students who pass any BT do so in the first two attempts.

Figure 3 also demonstrates how we are able to use the data from the BTs to improve both the instruction and the BTs themselves. Based on our analysis of the FS97 data, we revised the course design, instruction and BTs themselves for SS98 to rebalance the difficulty and pacing of the materials. As a result, the repeat rates for the SS98 BTs are more uniform, with about half of the students who ultimately pass any given BT doing so on the first try. For example, only about one third of the students passed the FS97 2.0 on the first try, while about three quarters of the FS97 students passed the 3.0 on the first try. By way of contrast, during SS98, about half of the students passed the 2.0 and the 3.0 on their first attempt.

Figure 3

Figure 4

Figure 4 shows the distribution of final grades for both fall 1997 and spring 1998. The overall GPAs were 2.98 (FS97) and 3.00 (SS98). Recall that students must pass the 3.0 BT before they may receive credit for the semester project, which may increase their grade to a 3.5 or 4.0. The nearly linear relationship among the grades determined by the BTs (0.0 through 3.0) is what one expects for a mastery model course in which student effort is the key factor in determining grades.

To those accustomed to a normal distribution, these results may be startling. However, as Bloom, Madaus and Hastings [2] point out:

If we are effective in our instruction, the distribution of achievement should be very different from the normal curve. In fact, we may even insist that our educational efforts have been unsuccessful to the extent that the distribution of achievement approximates the normal distribution. (p. 52)

4.2 Student Ratings

We obtained feedback from the students on the university’s Student Instructional Rating System (SIRS) and via E-mail and personal interactions with the instructors and teaching assistants. Table 1 shows the results from some of the course questions asked on both the fall and spring SIRS. Students are asked to respond to each statement on a scale of 1 = strongly agree, 2 = agree, 3 = neither agree nor disagree, 4 = disagree, 5 = strongly disagree.

| Question | Mean, Fall 97 | Mean, Spring 98 |
|---|---|---|
| I usually did my homework before coming to class. | 3.47 | 2.23 |
| I learned a lot in the group exercises in class. | 3.79 | 3.15 |
| The Bridge Tasks were a fair test of the material I learned in class. | 3.50 | 2.29 |
| I felt that the "extension tasks" on the Bridge Task, which asked me to do something I had not previously done, were reasonably connected to what I had already learned. | 3.68 | 2.32 |
| The grading was fair for the Bridge Tasks. | 4.29 | 2.98 |
| The grader's comments explained what I did wrong on the Bridge Tasks. | 3.51 | 2.55 |
| I feel that my course grade will reflect my understanding of computer concepts. | 3.88 | 2.80 |
| I would recommend this course to my friends. | 3.70 | 2.48 |

Table 1

On every question, the student ratings improved dramatically from fall to spring (lower means indicate stronger agreement). The average overall improvement is 1.13 points on the five-point scale. All results are very highly significant by two-tailed t-tests. We attribute these improvements to the extensive revisions of the course and BTs we made from fall to spring.
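The reported average improvement can be checked directly from the means in Table 1. The short calculation below uses only the published values; the variable names are ours.

```python
# Mean SIRS responses from Table 1, in row order (lower = stronger agreement).
fall_97 = [3.47, 3.79, 3.50, 3.68, 4.29, 3.51, 3.88, 3.70]
spring_98 = [2.23, 3.15, 2.29, 2.32, 2.98, 2.55, 2.80, 2.48]

# Improvement per question: how far the mean moved toward "strongly agree".
improvements = [fall - spring for fall, spring in zip(fall_97, spring_98)]
mean_improvement = sum(improvements) / len(improvements)

# Approximately 1.1275, which is 1.13 to two decimal places.
print(f"{mean_improvement:.4f}")
```

The smallest change is on the group-exercise question (about 0.64) and the largest is on the extension-task question (about 1.36).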

5. Discussion

We see several advantages to this assessment model.

Now that the course has been offered for a year we perceive three changes:

  1. We have a firmer grasp of methods for explaining this model to students. For instance, at mid-semester we show students how previous semesters’ mid-semester performance compared with the final grades those students received. This helps current students understand how their midterm standing is likely to translate into a final grade in the course, and it motivates them to prepare thoroughly for their remaining BTs to maximize their chances of success.
  2. Faculty and student advisors across campus now understand the model and explain it when advising students.
  3. In the student culture, the "word on the street" prepares the incoming students for the instructional and assessment model.

Students have mixed reactions to the course. For some, this new assessment protocol is puzzling because they are thoroughly imbued with the model of "getting points," irrespective of what these "points" measure. For others, the fact that BTs can be repeated without penalty leads to a subtle trap: they fail to take into account that the course is of fixed duration. Therefore, toward the end of the semester, they have to take BTs at a fairly rapid pace to raise their grades to levels they consider acceptable.

On the positive side, most students become independent learners, turning naturally to the help system, their textbooks and their peers when confronted with a problem, rather than simply raising their hands for help. By the end of the course, more than half of the students can do very competent semester projects on topics appropriate to their majors, suggesting strongly that they will carry forward what they have learned in this course to future courses and careers.

6. References

[1] Block, J. H., Efthim, H. E., and Burns, R. B., Building effective mastery learning schools. New York: Longman, 1989.

[2] Bloom, B. S., Madaus, G. F., and Hastings, J. T., Evaluation to improve learning. New York, NY: McGraw-Hill, 1981.

[3] Confucius, "I do and I understand," in http://www.cyberstory.com/Education.html, 500 BC.

[4] Johnson, D. W., Johnson, R. T., and Holubec, E., Circles of learning: Cooperation in the classroom. Edina, MN: Interaction Book Company, 1990.

[5] Johnson, D. W., Johnson, R. T., and Smith, K. A., Active learning: Cooperation in the college classroom. Edina, MN: Interaction Book Company, 1991.

[6] Joyce, D., "The computer as a problem solving tool: A unifying view for a non-majors course," presented at Twenty-ninth SIGCSE technical symposium on computer science education, Atlanta, Georgia, 1998.

[7] Kolesar, M. V., and Allan, V. H., "Teaching computer science concepts and problem solving with a spreadsheet," presented at Twenty-Sixth SIGCSE technical symposium on computer science education, Nashville, TN, 1995.

[8] Osin, L., and Lesgold, A., "A proposal for the reengineering of the educational system," Review of Educational Research 66, 1996, 621-656.

[9] Schoenfeld, A. H., "Toward a unifying framework for assessment: A conceptual frame and fundamental issues it highlights," Draft, 1994, 28.

[10] Townsend, G. C., "Turning liabilities into assets in a general education course," presented at Twenty-ninth SIGCSE technical symposium on computer science education, Atlanta, Georgia, 1998.