Int. J. Human-Computer Studies (2001) 54, 1–23

doi:10.1006/ijhc.2000.0423 Available online at http://www.idealibrary.com

An exploratory study of program comprehension strategies of procedural and object-oriented programmers

CYNTHIA L. CORRITORE
College of Business Administration, Creighton University, Omaha, NE 68178, USA. email: cindy@creighton.edu.

SUSAN WIEDENBECK
College of Information Science and Technology, Drexel University, Philadelphia, PA 19104, USA. email: susan@cse.unl.edu.

(Received 14 July 1999 and accepted in revised form 19 July 2000)

This exploratory study examines the nature of program understanding strategies employed during a series of comprehension and maintenance activities carried out over time. Two dimensions of comprehension were examined: the direction of comprehension and the breadth of comprehension. Thirty expert procedural and object-oriented (OO) programmers studied a program and then performed modifications during two sessions held 1 week apart. The results showed that the direction of comprehension was mixed. The OO programmers tended to use a strongly top-down approach to program understanding during the early phase of familiarization with the program but used an increasingly bottom-up approach during the subsequent maintenance tasks. The procedural programmers used a more bottom-up orientation even during the early phase, and this bottom-up approach became even stronger during the maintenance tasks. The breadth of the programmers' comprehension was found to be greater for the procedural programmers than for the object-oriented programmers. However, after carrying out a series of tasks, all programmers had examined the majority of the program code. The results suggest that, regardless of paradigm, expert programmers eventually build a broad, systematic view of a program, rather than a localized one, over time.

© 2001 Academic Press

KEYWORDS: procedural programmers; object-orientated programmers; software maintenance; program comprehension.

1. Introduction

Software maintenance, which involves making enhancements, adaptations and corrections to existing software systems, has been estimated to account for more than half of programmer time (Layzell, Champion & Freeman, 1993). Because of its large role in the software process, maintenance has a large effect on programming productivity, and even small gains in productivity have the potential for significant economic effects (Henry & Humphrey, 1993). It has been argued that many of the problems in modern software development are related to the cognitive complexity of programming (Fisher, 1987).

1071-5819/01/010001+23 $35.00/0 © 2001 Academic Press


Brooks (1983) and Koenemann and Robertson (1991) identify program comprehension, the understanding of program code, as a critical cognitive activity in programming. Successful program maintenance, in particular, depends on comprehension (Koenemann & Robertson, 1991; Rajlich, Doran & Gudla, 1994; von Mayrhauser & Vans, 1995a,b; Canfora, Mancini & Tortorella, 1996; Tilley, Paul & Smith, 1996). The programmer must have an adequate understanding of what a program does and how it does it in order to make functional modifications and extensions to a program without introducing errors. While there have been some experimental studies of programmers carrying out maintenance tasks (Littman, Pinto, Letovsky & Soloway, 1986; Pennington, 1987a; Koenemann & Robertson, 1991; von Mayrhauser & Vans, 1996, 1997), our understanding of program comprehension during program maintenance is still fairly sparse.

We propose that the study of program comprehension in maintenance should begin with an examination of the key comprehension-related activities involved in maintenance: breadth and direction (Shneiderman & Mayer, 1979; Brooks, 1983; Letovsky, 1986; Koenemann & Robertson, 1991). There is a need for more knowledge about the strategies programmers use during maintenance to comprehend programs, how the use of information sources changes over time, and what effect the programming paradigm has on comprehension-related activities during maintenance. These issues are important ones since program comprehension is the key foundation of good program maintenance.

In order to improve program maintenance, we must first understand the processes underlying this activity. This information could then be used to develop tools or training which support the programmer in these activities. In addition, such study will empirically examine the claims of OO programming advocates about the advantages the paradigm brings to program maintenance. We also designed this study as a follow-up to previous research, extending earlier work by adding longitudinal and paradigm components and addressing what we felt to be lacking in previous work.

In this research we analyse in detail the knowledge sources used by expert programmers during maintenance of a program in order to determine the direction of comprehension and the breadth of comprehension. We analyse how comprehension-related activities proceed in repeated maintenance tasks and how the comprehension activities of procedural and object-oriented (OO) programmers differ.

The organization of this paper is as follows. Section 2 summarizes previous research on program comprehension, with emphasis on comprehension during program maintenance and similar activities. It then states our research questions. Section 3 outlines the methodology of the study, including the participants, materials and experimental procedure. The results are presented in Section 4. Section 5 discusses the results, and Section 6 contains concluding remarks.

2. Previous research

2.1. DIRECTION OF COMPREHENSION

The direction of comprehension concerns the programmer's strategic approach to program comprehension, which may be top-down, bottom-up, or a combination of the two. Shneiderman and Mayer (1979) and also Pennington (1987b) describe bottom-up theories of program comprehension. In the Shneiderman and Mayer model, program comprehension begins with encoding individual lines of the program text in short-term memory. An individual line is grouped together with other related lines via a chunking process to form a semantic representation of a larger unit of code. Long-term memory aids in the understanding of individual lines and the formation of chunks by providing language-dependent syntactic knowledge and language-independent semantic knowledge. Bottom-up construction continues by recognizing semantic relationships among chunks already constructed and joining these chunks into larger, higher-level chunks. In this iterative manner a complete program representation is formed consisting of different levels of abstraction of the program.

The Pennington (1987b) comprehension model describes two program abstractions formed by the programmer during comprehension. The program model is a low-level abstraction consisting of knowledge of operations at a level close to the surface of the program code and of control flow relations representing the order of execution. This program-level representation is formed early during program understanding and is derived from the microstructure of the program text. The domain model is a higher-level abstraction consisting of knowledge of data flow and functional relationships. This representation is formed after the program model. It is derived from knowledge in the program model together with knowledge of plans in the domain of the program. Pennington's (1987a,b) own studies of comprehension and maintenance support the existence of these dual abstractions and the emergence of the program model before the domain model. Studies by Bergantz and Hassell (1991) and Corritore and Wiedenbeck (1991) also tend to support Pennington's dual model in comprehension and maintenance.

Brooks (1983) and Soloway and his colleagues (Soloway, Ehrlich, Bonar & Greenspan, 1982) present top-down theories of program understanding. In Brooks's model program understanding is hypothesis-driven. The programmer begins by making a general hypothesis about the program's function, based on information outside the program code, such as a title or brief description. This hypothesis leads the programmer to expect to find certain objects and operations in the program, resulting in another level of more specific hypotheses. At this point the programmer has concrete things to look for in the program code, so hypothesis verification is attempted.
The programmer scans the program text searching for beacons, which are typical indicators of the presence of a particular structure or operation. If such beacons are found, the programmer may conclude with high probability that the particular structure or operation is present. This strengthens the current hypothesis. However, if no beacons for the hypothesized structures and operations are found, the programmer must study the program text more carefully and may ultimately have to revise or reject the hypothesis. Comprehension continues through rounds of successive refinement of hypotheses and hypothesis verification until the whole program has been understood. While Brooks's theory has not been tested in its entirety, the use of code and naming beacons in comprehension during program study has been shown empirically (Wiedenbeck, 1986, 1991; Gellenbeck & Cook, 1991).

The plan-based theory of Soloway and his colleagues (Soloway et al., 1982; Soloway & Ehrlich, 1984; Soloway, Adelson & Ehrlich, 1988) also describes comprehension as largely a top-down activity. Plans are schematic knowledge about how to carry out typical actions in a program. Strategic plans are global plans for solution of a problem; tactical plans are local strategies corresponding to language-independent specifications of algorithms; implementation plans are plans for the realization of a tactical plan in a particular language. In addition, discourse rules are programming conventions that govern how plans are expressed. Knowledge of plans and discourse rules is gained through experience and stored in long-term memory. Program understanding begins with the programmer hypothesizing a high-level program goal, then decomposing it into subgoals. The programmer brings to bear the stored plans and discourse rules in an attempt to satisfy the subgoals and ultimately the top-level goal. As in Brooks's model, modification of the subgoals is required if they cannot be directly satisfied by supporting evidence in the program. Using a program recall task, Soloway and his colleagues (cited above) present empirical support for the use of plans and discourse rules in program understanding.

Mixed models of program comprehension have also been proposed, which combine bottom-up and top-down processes (Letovsky, 1986; von Mayrhauser & Vans, 1995b, 1996, 1997). These authors argue that programmers are opportunistic comprehenders. They switch flexibly from top-down to bottom-up comprehension strategies depending on the situation. Shaft and Vessey (1995) and von Mayrhauser and Vans (1996) propose that programmers use a top-down, goal-oriented, or hypothesis-driven, approach to comprehension when they are working in a familiar domain in which they recognize a large number of plans. On the other hand, when they are comprehending code that is new to them and in an unfamiliar domain, they use the bottom-up approach described by Pennington. That is, they begin by developing a program model consisting of a control flow abstraction and later form a domain model consisting of data flow and functional abstractions. In an industrial-size program consisting of tens or hundreds of thousands of lines of code, switches from top-down to bottom-up comprehension and vice versa may occur frequently within the same program because the programmer's state of knowledge about the domain varies in different parts of the program.

von Mayrhauser and Vans (1998) observed that the use of bottom-up vs. top-down approaches varies with the task, with more top-down activity in adaptive maintenance (e.g. porting software to a new platform) than in corrective maintenance (eliminating bugs). They found that it also varies with the familiarity of the programmer with the domain and the program at hand. Programmers with knowledge of the domain take a more top-down approach than do programmers with less domain knowledge. Furthermore, programmers with less accumulated language knowledge and knowledge of the program spend more time in bottom-up (program model) comprehension (von Mayrhauser & Vans, 1997).

Koenemann and Robertson (1991) argue that in a larger program, comprehension activities are mostly top-down. In an empirical study of program maintenance they found that programmers generate hypotheses and use procedure and variable names as beacons during comprehension. In their study bottom-up methods were only used in the case of failed hypotheses or in comprehending the local code directly relevant to a modification task. Symbolic execution of the code was very rarely done.

In summary, studies examining the direction of comprehension have not fully supported one strategy over another. Support has been found for top-down, bottom-up, and mixed strategies. All of the studies reported thus far studied procedural programmers. Also, none of the studies were longitudinal in nature. We proposed to extend this work by examining the direction of comprehension in a longitudinal setting, and also by studying the effect of paradigm on comprehension direction.
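The notion of a beacon in Brooks's model, discussed above, may be easier to picture with a small example. The following C++ sketch is purely illustrative: the function names `process` and `sortAscending` are hypothetical and do not come from any of the cited studies or from the program used in this study. It assumes a reader who hypothesizes "this code sorts something" and then looks for supporting evidence.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical example. The name "process" deliberately says nothing
// about the purpose of the function, so only the body can confirm a
// reader's hypothesis.
std::vector<int> process(std::vector<int> values) {
    for (std::size_t pass = 0; pass + 1 < values.size(); ++pass) {
        for (std::size_t i = 0; i + 1 < values.size() - pass; ++i) {
            if (values[i] > values[i + 1]) {
                // Code beacon: exchanging two neighbouring elements inside
                // a nested loop is typical evidence of a sort.
                std::swap(values[i], values[i + 1]);
            }
        }
    }
    return values;
}

// Naming beacon: here the identifier alone supports the same hypothesis,
// before the body has been read at all.
std::vector<int> sortAscending(std::vector<int> values) {
    return process(std::move(values));
}
```

A reader who finds either the swap or the descriptive name can accept the sorting hypothesis without tracing the loops line by line; a reader who finds neither would have to fall back on the closer, bottom-up study of the text that Brooks describes.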


2.2. BREADTH OF COMPREHENSION

The breadth of comprehension refers to the extent to which the programmer becomes familiar with all or most parts of a program during comprehension. In a study of comprehension during a maintenance task, Littman et al. (1986) observed two strategies with respect to breadth of comprehension: systematic and as-needed. Programmers using the systematic strategy attempted to gain a broad understanding of the program while carrying out modifications. Their goal was to understand the overall design of the original programmer so that they could make their modifications fit with the existing code. Programmers using the as-needed strategy (also known as the isolation or relevance strategy) attempted to understand the minimum amount of code necessary to successfully carry out the modification. They were not concerned with the overall design of the program but with the functioning of selected local parts of the code. Littman et al. found that programmers who took the systematic approach were more successful in carrying out modification tasks. They attribute this greater success to the ability of the systematic approach to detect interactions of the code being modified with code in other parts of the program. Such interactions often arise because of delocalized plans (Soloway, Pinto, Letovsky, Littman & Lampert, 1988), i.e. programming plans that are not implemented by contiguous lines of code but rather are distributed in non-contiguous parts of the program, for example, a single plan distributed over different modules. Programmers using an as-needed strategy of comprehension were not aware of interactions and, as a result, tended to introduce errors into the program during modification.

While a systematic comprehension strategy appears to be most successful for making modifications to an existing program, there are serious questions about its scalability. The Littman et al. experiment, which showed the superiority of a systematic comprehension strategy, was done using a program of some 200 lines of code. von Mayrhauser and Vans (1995a) argue that systematic comprehension may be impossible in a large program consisting of many thousands of lines of code. An experiment by Koenemann and Robertson (1991) began to address comprehension strategies in larger programs. Their program was approximately 600 lines, still small by industry standards. However, it was large enough to observe more realistic comprehension behavior of programmers who were attempting to quickly familiarize themselves with a program in order to carry out modifications. Koenemann and Robertson found that comprehension was as-needed, and no participant used a systematic strategy. The breadth of comprehension was determined by the modification task at hand. Browsing was used to find relevant procedure names, not to gain an overall familiarity with the program.
No time was spent comprehending parts of the program judged irrelevant to the modification tasks. On the other hand, von Mayrhauser and Vans (1996) report one instance of systematic line-by-line study of an industrial program, but they do not indicate how large a portion of the program was studied in this manner.

Previous work on the breadth of program comprehension indicates that an as-needed or opportunistic strategy might be the approach taken by professional procedural programmers dealing with programs of any significant size. While most of the previous work has been done with small programs and all with procedural programmers, we chose to focus on the effect of a more realistic program and task, along with the effect of paradigm, on breadth. Thus, we address what we identified as possible gaps in previous work on breadth of comprehension.
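To make the idea of a delocalized plan concrete, the C++ sketch below spreads one plan, keeping a free-seat count consistent with a passenger list, over a structure and three separate functions. The names (`Flight`, `makeFlight`, `addPassenger`, `removePassenger`) are hypothetical and chosen only for illustration; they are not taken from Littman et al.'s program or from the materials of this study.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch of a delocalized plan. The plan is the invariant
//   seatsFree + passengers.size() == capacity
// and no single function contains all of it.
struct Flight {
    std::vector<std::string> passengers;
    std::size_t seatsFree;  // part 1: the data the plan depends on
};

Flight makeFlight(std::size_t capacity) {
    return Flight{{}, capacity};  // part 2: the invariant is established here
}

bool addPassenger(Flight& f, const std::string& name) {
    if (f.seatsFree == 0) return false;
    f.passengers.push_back(name);
    --f.seatsFree;  // part 3: the invariant is maintained here
    return true;
}

bool removePassenger(Flight& f, const std::string& name) {
    for (std::size_t i = 0; i < f.passengers.size(); ++i) {
        if (f.passengers[i] == name) {
            f.passengers.erase(f.passengers.begin() +
                               static_cast<std::ptrdiff_t>(i));
            ++f.seatsFree;  // part 4: and again here, far from parts 2 and 3
            return true;
        }
    }
    return false;
}
```

A maintainer who reads only `removePassenger` under an as-needed strategy has no local cue that the increment belongs to a plan begun elsewhere; dropping or duplicating it would compile cleanly and still corrupt the count, which is the kind of interaction a systematic reading is more likely to catch.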


2.3. PROCEDURAL AND OBJECT-ORIENTED PROGRAM COMPREHENSION

Most of the advantages that have been ascribed to the object-oriented paradigm concern design and reuse activities. However, Rosson and Alpert (1990) and von Mayrhauser and Vans (1996) suggest that there may also be advantages for program comprehension if OO programs afford a more top-down, domain-oriented style of comprehension. von Mayrhauser and Vans (1996) call for the extension of research on comprehension during program maintenance to programs written in the object-oriented paradigm. Results of some existing studies of program comprehension in the OO paradigm give indications of where we may find differences between the procedural and object-oriented paradigms in comprehension-based tasks.

von Mayrhauser and Vans (1996) speculate that OO comprehension might be characterized by a more top-down approach compared to procedural comprehension, as has been found in OO design (Lee & Pennington, 1994; Pennington, Lee & Rehder, 1995). As suggested by the work of Gilmore and Green (1984), this would be likely to occur if OO code and documentation tend to highlight higher-level abstractions more than do procedural code and documentation. Experimental results of Burkhardt, Détienne and Wiedenbeck (1997), who employed program documentation and reuse tasks, show that OO experts tend to develop a domain-based abstraction in terms of function, objects and the relationships of objects during program comprehension. It may be argued that objects and the high-level functionality and relationships of objects are indeed highlighted by the OO notation, so this is consistent with Gilmore and Green's argument that highlighting information in a notation makes it more available. Such a strategy would also be supported by the encapsulation feature of the OO paradigm. Further results (Burkhardt, Détienne & Wiedenbeck, 1998) from the same study indicate that OO experts use more top-down behavior than do OO novices, while the breadth of comprehension of OO experts and novices is similar. Corritore and Wiedenbeck (1999) compared OO and procedural experts directly in program comprehension during program modifications. They found that there were no differences in comprehension of function and data flow. However, OO programmers initially developed stronger knowledge of program structure including the relationships of program objects, but poorer program-level knowledge of specific operations and control flow. This may indicate a more top-down approach among OO programmers. Nevertheless, after carrying out program modifications, OO programmers in the Corritore and Wiedenbeck study had developed program-level knowledge equal to that of the procedural programmers.

With respect to maintenance activities in the OO and procedural paradigms, a few empirical studies exist. Henry and Humphrey (1993) found that modifications to OO programs were more local, i.e. involved editing of fewer modules, than modifications to the corresponding procedural programs. This might indicate that it is possible to successfully make changes in an OO program using a local, as-needed comprehension strategy. However, this hypothesis was not tested in the research. Similarly, Lange and Moher (1989) identified the use of a comprehension avoidance strategy in an OO code reuse task. Daly, Brooks, Miller, Roper and Wood (1996) studied the maintainability of OO programs as a function of the depth of the inheritance hierarchy. They found that a deeper hierarchy (i.e. five-level structures compared to three-level structures and flat structures) led to problems in making modifications. These problems appear to be comprehension-related. For example, programmers had difficulties tracing the inheritance hierarchy, understanding virtual functions, and choosing a class to use as a copy template in the deeper structure. Briand, Bunse, Daly and Differding (1997) studied the maintainability of OO and structured design documents. They found that adhering to good OO design practices provides ease of understanding and modification. Furthermore, OO design documents were more sensitive to design practices than were structured design documents. The authors attribute this result to higher cognitive complexity of the OO paradigm.

Previous work indicated that OO comprehension might tend to have a top-down direction due in part to the nature of the OO notation. Modifications might also tend to be very localized, supporting an as-needed breadth strategy. However, such research is very limited. In addition, few if any comparisons of program comprehension in procedural vs. OO programmers have been conducted. Thus, we extend previous work which has examined procedural or OO paradigms in isolation by examining them in this study side by side.
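The tracing difficulty that Daly et al. report for deeper hierarchies can be pictured with a minimal C++ sketch. The three classes below (`Person`, `Employee`, `Pilot`) and the helper `describe` are hypothetical names invented for this illustration; they are not drawn from Daly et al.'s materials, and a three-level chain is used only to keep the example short.

```cpp
#include <string>

// Hypothetical three-level hierarchy used only for illustration.
class Person {
public:
    virtual ~Person() = default;
    virtual std::string role() const { return "person"; }
    // label() calls role() through dynamic dispatch, so its result
    // depends on the most derived class that overrides role().
    std::string label() const { return "[" + role() + "]"; }
};

class Employee : public Person {
public:
    std::string role() const override { return "employee"; }
};

class Pilot : public Employee {
    // No override here: to learn what role() returns for a Pilot, a
    // reader has to walk up the hierarchy until an override is found.
};

std::string describe(const Person& p) { return p.label(); }
```

Nothing in the text of `Pilot` or `describe` reveals which `role()` runs; the answer sits one level up in `Employee`. Each additional level adds another place a reader may need to inspect, which is one plausible reading of why deeper structures were harder to modify.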

2.4. RESEARCH QUESTIONS

In this research, we compare the strategies of expert programmers during comprehension and maintenance of OO and procedural programs. Two dimensions of comprehension strategy are considered: direction of comprehension and breadth of comprehension over repeated study and maintenance episodes. We state three research questions:

1. Do OO experts show a more top-down direction of comprehension than procedural experts? The OO style with its emphasis on objects, their hierarchical relationships and their functionality may itself facilitate a more top-down, domain-oriented strategy of program understanding than the procedural style. If so, we should expect to see more evidence of top-down strategies among OO programmers.

2. Do OO experts have a narrower breadth of comprehension than procedural experts? A possible advantage of the OO paradigm is that objects contain both data and functionality with encapsulation. Changes to the functions of an object are internal to the object. This may mean that program modifications can be done successfully at a more local level with correspondingly less broad knowledge of the program.

3. Do comprehension strategies change over repeated modification of the same program? Programmers normally maintain a program over a period of time, making successive modifications. A narrow breadth of comprehension in initial maintenance of a program may incrementally become a wide breadth, as a programmer works with a program over time.

3. Research method

3.1. PARTICIPANTS

Thirty programmers participated in the study. Fifteen were procedural C experts and 15 were object-oriented C++ experts. All of the procedural experts were employed professionally as C programmers in industry. Likewise, all of the C++ experts were employed professionally as C++ programmers in industry. The average age of the participants was 30 years. Two were female and 28 were male. One participant's highest academic degree was a Ph.D., 12 held Master's degrees, 15 Bachelor's degrees and two a high school diploma. Twenty-two held their highest degree in computer science, five in engineering and the rest in a variety of fields. On average, participants had been programming for 11.6 years with a range of 2.5–20 years. They had worked with an average of 10 programming languages, could write a simple program in six languages, and had experience with seven platforms and eight operating systems. The average length of full-time employment as a system analyst or programmer was 7 years. Statistical analysis showed no significant differences between the C and C++ groups with respect to any of the demographic variables or programming experience variables reported above.

However, a known difference between the C and C++ groups was their respective amounts of programming experience with the procedural and OO paradigms. The procedural participants were familiar with OO concepts and most had some experience using OO languages. However, they had not programmed extensively in an OO language. On the other hand, the OO participants did have previous experience programming in procedural languages in addition to their object-oriented experience. This difference between the groups results from the fact that today most programmers with substantial experience in industry began their careers working with procedural languages and only made the transition to the use of object-oriented languages in more recent years. Thus, it was not feasible for us to choose OO programmers who had no procedural experience.

3.2. MATERIAL

C and C++ were chosen to represent the procedural and OO paradigms, respectively, because the language notations are the same except for the specifically object-oriented features of C++. This made it less likely that extraneous differences between the procedural and OO languages would affect the results. Two functionally equivalent versions of a database program were written in procedural C and object-oriented C++. The program manipulated records of passengers, crew members and flights of a small airline. This database problem was chosen because it did not require any highly specialized domain knowledge. Furthermore, the characteristics of a database program made it well suited for implementation in both paradigms.

The C++ program was divided, as is typical, into header and implementation files. Each header file contained the class declarations, and the corresponding implementation file contained the function bodies. Likewise, the C program was divided into typical C header and implementation files. The header files contained the function prototypes and the structure declarations. The implementation files contained the function bodies.

Both programs and their supplementary materials were as equivalent as possible, while at the same time prototypical of their paradigm. The C++ program represented complex data entities through classes and made extensive use of the OO features of inheritance, composition, encapsulation, and polymorphism. The procedural program was written in a structured and modular manner. It implemented complex data entities through structure variables. Both programs used array and list data structures. In the C++ program, the functions were methods attached to classes, while in C stand-alone functions were used. The same underlying algorithms were used in corresponding functions of the C and C++ programs (e.g. for searching, sorting, string processing, etc.). The two programs also used the same or similar variable names whenever possible. Both programs were similar in length, although due to the overhead required for implementing classes, the C++ program was slightly longer than the C program (822 vs. 783 lines). The programs were examined by two C and two C++ experts, respectively, for typicality and the degree to which the program and materials followed standard conventions for their paradigm. Modifications were made based on the evaluations.

Comments were not included in the source code of either program; however, extensive external documentation was provided. Documentation to supplement the source code included program summaries, descriptions of the functionality of program modules, charts of inheritance hierarchies, charts of calling structures and diagrams of data structures. Our documentation may have been more thorough than that in a typical program, but this was intentional. Our aim was to make non-code sources of information about the program available in addition to those based on the code itself. This would allow us to study the information preferences of the programmers during comprehension given multiple sources of information about the program. Additionally, our documentation was modeled after that used by Koenemann and Robertson (1991) in their study of comprehension strategies.

The materials were designed to be functionally equivalent for the two paradigms, although they were not identical (the most notable difference being that there were no inheritance hierarchy charts for the procedural program). These supplementary materials were also inspected by C and C++ experts who verified that, in so far as possible, the materials contained the same kinds and depth of information and adhered to the same format. Modifications were made based on the experts' critiques.

Three program modifications were designed. An example of a modification is adding a new employee category to the database. The functional descriptions of the modifications were identical for the C and C++ groups. The modifications were designed to be similar in the C and C++ programs with respect to the level of difficulty of the solution and the amount of code required to be added or altered in each program. However, they were designed to elicit different types of solutions in the two paradigms. One of them was designed so that it could be implemented using a delocalized solution in both paradigms. The other modification was designed to have a classic object-oriented solution which involved using a new instance of an existing class. It took full advantage of the claim of the object-oriented paradigm to simplify program modification. The task required a slightly more distributed solution in the procedural paradigm.
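The contrast between the two versions of the program, and the example modification of adding a new employee category, can be sketched roughly as follows. This is not the code used in the study, which is not reproduced in the text; every identifier here (`CrewKind`, `CrewRecord`, `crewTitle`, `CrewMember`, `PilotMember`, `MechanicMember`) is an assumption made for illustration. Both styles are written in C++ so that they can sit in one listing, with the procedural half kept deliberately C-like: a structure plus a stand-alone function.

```cpp
#include <string>
#include <utility>

// --- Procedural style: a structure variable and a stand-alone function ---
enum CrewKind { PILOT, ATTENDANT, MECHANIC };  // edit 1: MECHANIC added here

struct CrewRecord {
    CrewKind kind;
    std::string name;
};

std::string crewTitle(const CrewRecord& r) {
    switch (r.kind) {
        case PILOT:     return "Pilot " + r.name;
        case ATTENDANT: return "Attendant " + r.name;
        case MECHANIC:  return "Mechanic " + r.name;  // edit 2: one per switch
    }
    return "Unknown " + r.name;
}

// --- Object-oriented style: methods attached to classes ---
class CrewMember {
public:
    explicit CrewMember(std::string name) : name_(std::move(name)) {}
    virtual ~CrewMember() = default;
    virtual std::string title() const = 0;
protected:
    std::string name_;
};

class PilotMember : public CrewMember {
public:
    using CrewMember::CrewMember;  // reuse the base-class constructor
    std::string title() const override { return "Pilot " + name_; }
};

// The new category is a single added class; existing classes are untouched.
class MechanicMember : public CrewMember {
public:
    using CrewMember::CrewMember;
    std::string title() const override { return "Mechanic " + name_; }
};
```

In this sketch the procedural change touches the enumeration and every function that switches on it, whereas the OO change is confined to one new class. That is only one way such a modification might be distributed, but it shows why a task of this kind could call for a slightly more distributed solution in the procedural paradigm.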

3.3. PROCEDURE

Each participant was run individually in two 2-h sessions that were held 7–10 days apart (see Table 1). At the beginning of session 1 participants familiarized themselves with the programming environment and the research procedure by completing a short hands-on exercise which involved making modifications to a small program different from the experimental database program. The remainder of the first session was designed to simulate the first exposure of a programmer to a new program. We wanted to model the situation in which the programmer has some familiarity with the program already. We were not trying to model a situation in which the programmer had no previous experience with the program.

TABLE 1
Description of methodology procedures

Session 1: Study            Session 2: Modifications
Practice w/sample program   Complete modification 1
Study period                Complete modification 2
Short modification task

Similar to other program comprehension studies, our first session contained a study period of up to 30 min, during which participants attempted to understand the database program (Pennington, 1987a,b; Corritore & Wiedenbeck, 1991; Burkhardt et al., 1998). They then completed a short modification task. This first modification was used to motivate the study of the program, since study of a program with no objective does not occur commonly in a work setting. This addition was made based on feedback during pilot testing. Participants then completed two program modifications during the second session. Participants were advised to complete the modifications in an efficient manner. Our pilot testing showed that the two modifications took different amounts of time, and participants were informed of the maximum time allowed for each modification. The order of presentation of the two modifications was counterbalanced. The activities of the participants during the study period in session 1 and the two modifications in session 2 provided the experimental data. For each modification, more than one correct solution was possible.

The program and all supplementary materials were presented on-line in a graphical Unix environment created for the study. The on-line environment provided a more natural environment for professional programmers than the pencil and paper environment used in many studies. The environment supported standard editing features such as cut and paste as well as compilation and running of the program. One restriction of the environment was that only one document (segment of program code or supplementary documentation) was visible at a time; in addition, participants were not allowed to print files. These restrictions allowed us to determine unambiguously what materials participants accessed. This could mean a subject would access a file multiple times, since they could not leave a file open. However, this did not affect the results reported, as multiple accesses to a given file were only counted once. None of the participants had prior experience with the environment. All participants had access to a help sheet for the environment and a standard C or C++ reference book during the experiment, but made little use of either. Screen capture software was utilized to record participants' work as they studied the program materials and made modifications to the program.
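The counting rule described above (repeat visits to a file count once, and accesses are tallied per file type) can be sketched in a few lines of Python. This is an illustrative sketch only: the file names, the `MATERIALS` inventory and the `proportion_accessed` helper are hypothetical, since the study's actual file list and replay procedure are not given at this level of detail.

```python
from collections import defaultdict

# Hypothetical inventory of program materials, keyed by file type.
# These names are invented for illustration; they are not the study's files.
MATERIALS = {
    "documentation": {"overview.doc", "hierarchy.doc"},
    "header": {"employee.h", "database.h", "report.h"},
    "implementation": {"employee.cc", "database.cc", "report.cc", "main.cc"},
}


def proportion_accessed(access_log, materials=MATERIALS):
    """Return, per file type, the fraction of distinct files opened at least once."""
    seen = defaultdict(set)
    for name in access_log:
        for file_type, files in materials.items():
            if name in files:
                # Using a set means repeated visits to a file are counted once.
                seen[file_type].add(name)
    return {ft: len(seen[ft]) / len(files) for ft, files in materials.items()}


# One participant's (hypothetical) sequence of file openings in one activity.
log = ["overview.doc", "employee.h", "employee.cc",
       "employee.h", "main.cc", "employee.cc"]
props = proportion_accessed(log)
print(props)
```

Because only one document could be visible at a time, a log of this kind would contain many repeat openings; collapsing them into sets keeps the measure a proportion between 0 and 1.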

4. Results

We examined two dimensions of comprehension: direction of comprehension and breadth of comprehension. In this study, we operationalized the direction of comprehension as the level of abstraction of the program entities accessed over time. Our program materials were at three levels of abstraction, which corresponded to three file types that we provided to the participants: documentation files, header files and implementation files. Documentation files were the most abstract, consisting of the external documentation of the program. Header files (.h files in C and C++ terminology), containing the declarations of data entities and functions, were at an intermediate level of abstraction. Implementation files (.cc files in C and C++ terminology), containing the code implementation of functions, were at the lowest level of abstraction. Consistent with the approach of Burkhardt et al. (1998), we considered accessing the more abstract documentation and header files to reflect top-down processes and accessing the less abstract implementation files to reflect a bottom-up strategy. A mixed strategy would be indicated by use of files at different levels of abstraction. We expected to see a more top-down strategy of comprehension being used by the OO programmers.

The breadth of comprehension refers to whether the comprehension of the program is broad in scope or narrowly focused on certain parts of the program. A systematic strategy of comprehension would be indicated by a broad study of all of the available program materials. In contrast, an as-needed strategy would be indicated by limitation of the study of program materials to only a small part of those available. Only information judged by the participant to be relevant to the task at hand would be examined. Hence, participants would be likely to access fewer of the program materials. In keeping with this characterization of the systematic and as-needed strategies, in this study breadth of comprehension was operationalized as the proportion of files accessed. We expected to see a narrower breadth of comprehension by the OO programmers, in part because of the encapsulation of the OO paradigm.

The data collected by the screen capture program were replayed in order to identify which files participants accessed. The data analysed were the proportion of documentation, header and implementation files accessed out of the total for each file type. Analysis of variance was utilized to examine differences between procedural and OO participants with respect to the proportion and type of files accessed across the different experimental tasks. Follow-up analysis was conducted using ANOVA and Tukey's HSD. This study is exploratory because of the small amount of research comparing comprehension strategies of procedural and object-oriented programmers (von Mayrhauser & Vans, 1996). In addition, because our participants were professional programmers, the pool of potential programmers for use in the study was limited. This resulted in a small sample size that reduces statistical power. As a result of these considerations, we set 0.10 as our alpha level. We carried out follow-up analysis on the ANOVAs if the p-value was less than 0.10 and the effect size (η²) was moderate or large by the Cohen (1977) guidelines (i.e. an η² of less than 0.06 is small, between 0.06 and 0.14 is moderate, and greater than 0.14 is large).

The first analysis done was a three-way, mixed-model analysis of variance. The between-subjects factor was programming paradigm (procedural or object-oriented). The within-subjects factors were file type (documentation, header or implementation) and activity (program study, first modification or second modification). The dependent variable was the mean proportion of files accessed, and thus the range of the means was 0–1. Figures 1(a)–(c) show graphically for each file type the proportion of files accessed


FIGURE 1. (a) Proportion of implementation files accessed by procedural and object-oriented groups during each activity; (b) proportion of header files accessed by procedural and object-oriented groups during each activity; (c) proportion of documentation files accessed by procedural and object-oriented groups during each activity. Separate series are plotted for the object-oriented and procedural groups.

by activity. Alternatively, Figures 2(a)–(c) illustrate the proportion of file accesses during each activity by file type. The numeric means and standard deviations (S.D.s) are listed in Tables 2–4. The mean proportions of files accessed, as listed in Table 4, are displayed graphically in Figures 3(a)–(c).

The ANOVA showed a significant main effect of paradigm [F(1,28) = 4.03, p < 0.05], with procedural participants accessing more files than OO participants overall. Thus, the breadth of the procedural participants was greater than that of the OO participants. This relationship had an η² value of 0.13, which according to Cohen (1977) indicates a moderate effect size or a moderate strength of relationship. There was also a significant main effect of file type [F(2,56) = 7.89, p < 0.001], with an η² value of 0.22, indicating a large effect size or


FIGURE 2. (a) File accesses during study period by file type and paradigm; (b) file accesses during first modification by file type and paradigm; (c) file accesses during second modification by file type and paradigm. Separate series are plotted for the procedural and object-oriented groups.

TABLE 2
Mean proportions and S.D.s (in parentheses) of files accessed by procedural and object-oriented groups in the three activities

            Study period   First modification   Second modification   All
Procedural  0.76 (0.18)    0.48 (0.14)          0.43 (0.14)           0.56 (0.08)
OO          0.74 (0.19)    0.44 (0.17)          0.33 (0.17)           0.50 (0.07)
All         0.75 (0.18)    0.46 (0.16)          0.38 (0.16)           0.53 (0.08)

strong practical relationship with the dependent variable, the proportion of files accessed. Follow-up analysis using Tukey's HSD showed that, when considering all participants and activities together, a higher proportion of implementation and header files was accessed than documentation files (p < 0.05), but there was no significant difference between accesses of header and implementation files. The frequent consultation of the


TABLE 3
Mean proportions and S.D.s (in parentheses) of files accessed by procedural and object-oriented groups for the three file types

            Documentation files   Header files   Implementation files   All
Procedural  0.37 (0.28)           0.60 (0.13)    0.70 (0.11)            0.56 (0.08)
OO          0.48 (0.19)           0.49 (0.16)    0.54 (0.12)            0.50 (0.07)
All         0.43 (0.24)           0.55 (0.15)    0.62 (0.14)            0.53 (0.08)

TABLE 4
Mean proportions and S.D.s (in parentheses) of files accessed by procedural and object-oriented groups for the three file types in the three activities

            Study period                           First modification                     Second modification
            Doc.         Header       Impl.        Doc.         Header       Impl.        Doc.         Header       Impl.
Procedural  0.62 (0.44)  0.81 (0.17)  0.85 (0.25)  0.38 (0.39)  0.43 (0.30)  0.62 (0.16)  0.12 (0.19)  0.56 (0.35)  0.63 (0.18)
OO          0.88 (0.16)  0.70 (0.31)  0.64 (0.26)  0.37 (0.36)  0.44 (0.21)  0.52 (0.22)  0.18 (0.28)  0.34 (0.29)  0.47 (0.21)

less abstract file types suggests that globally the direction of comprehension was bottom-up. The main effect of activity was also significant and had a large effect size [F(2,56) = 34.09, p < 0.001; η² = 0.55]. Follow-up analysis with Tukey's HSD indicated that all participants accessed more files during the study period than during the first or second modifications (p < 0.05). Thus, when considering all participants and file types together, the breadth of the comprehension activity was greatest during the study period.

There were significant two-way interactions of paradigm and file type [F(2,56) = 3.94, p < 0.03, η² = 0.12] and of file type and activity [F(4,112) = 6.89, p < 0.001, η² = 0.20]. However, the paradigm by activity interaction was not significant, although there was a three-way paradigm by activity by file-type interaction (see below). Follow-up testing for the paradigm by file-type interaction using Tukey's HSD failed to find any significant differences. This might be explained by low statistical power resulting from the small number of subjects. However, it appears that there was a trend for procedural participants to access more implementation than documentation files over all three activities combined. Follow-up on the file type by activity interaction using Tukey's HSD showed that all participants accessed documentation files more during the study period than in the first and second modifications (p < 0.05). They also accessed more


FIGURE 3. (a) Mean proportions of files accessed during study by procedural and object-oriented groups for file types; (b) mean proportions of files accessed during Modification 1 by procedural and object-oriented groups for file types; (c) mean proportions of files accessed during Modification 2 by procedural and object-oriented groups for file types. Separate series are plotted for the procedural and object-oriented groups.

header files during the study period than during the first and second modification tasks (p < 0.05). During the second modification, participants accessed documentation files significantly less than implementation files (p < 0.05). These findings suggest that, regardless of paradigm, the direction of comprehension was more top-down during program study, as indicated by the greater use of the more abstract documentation and header files during the study period.
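The effect-size values reported alongside each F statistic above can be cross-checked from the F value and its degrees of freedom. The sketch below assumes that the reported effect size behaves like partial eta squared, computed as F·df_effect / (F·df_effect + df_error); that formula is an assumption on our part, not something the text states, although it reproduces the reported values to two decimals. The function names are hypothetical, and the labels follow the Cohen (1977) cut-offs quoted in this section.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Effect size recovered from an F ratio (assumed formula, see lead-in)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def cohen_label(effect_size):
    """Classify using the cut-offs quoted in the text: <0.06, 0.06-0.14, >0.14."""
    if effect_size < 0.06:
        return "small"
    if effect_size <= 0.14:
        return "moderate"
    return "large"


# (effect, F, df_effect, df_error, effect size as reported in the text)
reported = [
    ("paradigm", 4.03, 1, 28, 0.13),
    ("file type", 7.89, 2, 56, 0.22),
    ("activity", 34.09, 2, 56, 0.55),
    ("paradigm x file type", 3.94, 2, 56, 0.12),
    ("file type x activity", 6.89, 4, 112, 0.20),
]

estimates = {}
for name, f_value, df1, df2, stated in reported:
    est = partial_eta_squared(f_value, df1, df2)
    estimates[name] = est
    print(f"{name}: computed {est:.3f}, reported {stated:.2f}, {cohen_label(est)}")
```

Under this assumption the paradigm effect falls in the moderate band and the file type and activity effects in the large band, matching the verbal descriptions given with the ANOVA results.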


The three-way interaction of paradigm, activity, and file type was also significant [F(4,112) = 2.35, p < 0.058, η² = 0.08]. In the follow-up analysis we examined each activity separately using two-way mixed-model analysis of variance with paradigm as the between-subjects factor and file type as the within-subjects factor. For subsequent pairwise comparisons Tukey's HSD was used. The results of these tests are presented in the following three paragraphs.

During the study period, the main effects of paradigm and file type were not significant, but there was a significant interaction of paradigm and file type [F(2,56) = 7.28, p < 0.01] with an η² of 0.21. This indicates that there is a strong relationship between paradigm and file type during the study period. Pairwise comparisons showed that procedural participants accessed significantly more implementation files during the study period than did OO participants (p < 0.05). In contrast, OO participants accessed significantly more documentation files than did procedural participants during the study period (p < 0.05). Within the procedural group, implementation and header files were accessed significantly more than documentation (p < 0.05). However, the OO participants accessed documentation files more than either header or implementation files during the study period (p < 0.05). These results of file types accessed suggest that, during the study period, procedural participants tended to use a more bottom-up direction of comprehension, while OO participants employed a more top-down direction of comprehension.

For the first modification task, a significant main effect of file type was found [F(2,56) = 3.56, p < 0.04, η² = 0.11], but the main effect of paradigm was not significant, nor was the paradigm by file-type interaction. Pairwise comparisons showed that, regardless of paradigm, participants accessed more implementation files than documentation files during the first modification (p < 0.05). It appears that participants found relatively little use for documentation during the first modification, a trend that continued and intensified in the second modification. This suggests a move to a more bottom-up comprehension strategy during the modification tasks.

During the second modification task, significant main effects were found for file type [F(2,56) = 20.88, p < 0.001, η² = 0.43] and paradigm [F(1,28) = 3.39, p < 0.08, η² = 0.11]. Procedural participants accessed more files overall during the second modification than did their OO counterparts. Follow-up testing on the main effect of file type using Tukey's HSD indicated that all participants accessed more implementation files than header files and more header files than documentation files during the second modification (p < 0.05). The ANOVA also identified a significant two-way interaction of paradigm and file type [F(2,56) = 2.75, p < 0.07, η² = 0.09]. Pairwise comparisons revealed that procedural participants accessed more implementation and header files than documentation files during the second modification. In fact, their accesses to documentation files dropped to very low levels. Likewise, OO participants accessed more implementation files than documentation files during the second modification, also showing a strong drop in accesses to abstract information. It appears that both groups were working in a bottom-up manner at this point. However, procedural participants still tended to have a wider breadth of comprehension than OO participants, as shown by their larger overall proportion of files consulted in the second modification.
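As a reading aid, the cell means in Table 4 can be averaged to recover the per-activity means of Table 2 and the per-file-type means of Table 3. The sketch below does this with plain Python; the dictionary layout and helper names are hypothetical, and the unweighted averaging is an assumption. Because it starts from cell means already rounded to two decimals, agreement is only expected to within about 0.01.

```python
# Cell means transcribed from Table 4 (group -> activity -> file type).
TABLE4 = {
    "procedural": {
        "study": {"doc": 0.62, "header": 0.81, "impl": 0.85},
        "mod1": {"doc": 0.38, "header": 0.43, "impl": 0.62},
        "mod2": {"doc": 0.12, "header": 0.56, "impl": 0.63},
    },
    "oo": {
        "study": {"doc": 0.88, "header": 0.70, "impl": 0.64},
        "mod1": {"doc": 0.37, "header": 0.44, "impl": 0.52},
        "mod2": {"doc": 0.18, "header": 0.34, "impl": 0.47},
    },
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def by_activity(group):
    """Average over file types within each activity (compare with Table 2)."""
    return {act: mean(cells.values()) for act, cells in TABLE4[group].items()}


def by_file_type(group):
    """Average over activities within each file type (compare with Table 3)."""
    activities = TABLE4[group]
    return {ft: mean(activities[act][ft] for act in activities)
            for ft in ("doc", "header", "impl")}


for group in TABLE4:
    print(group, by_activity(group), by_file_type(group))
```

Read row by row, the output makes the pattern in the follow-up tests visible: documentation access falls steeply from the study period to the second modification in both groups, while implementation access falls much less.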

Breadth of comprehension was also examined as defined by Koenemann and Robertson (1991), who operationalized breadth as the percentage of program functions and lines of code accessed. For counting purposes, Koenemann and Robertson made the


simplifying assumption that a person who accessed a given file accessed all the lines of code within it. This analysis of Koenemann and Robertson differs from our previous analysis of proportion of files accessed in that one implementation file usually contains multiple functions. For comparative purposes, we also carried out an analysis of accesses to functions and lines of code similar to Koenemann and Robertson's. This finer analysis of the breadth of access to the function implementations is possible because our data capture program recorded accesses not just to files but also to individual functions. For this analysis we combined the two modifications for comparison to Koenemann and Robertson, who report a single figure. Procedural participants in our study examined an average of 59% of the functions and 64% of the total lines of code during the study period and 40% each of the functions and lines of code during the modifications. OO participants accessed an average of 40% of functions and 46% of lines of code during the study period and 29% of functions and 36% of lines of code during the modifications. These rates were much higher than those reported by Koenemann and Robertson for modification of a procedural program by expert procedural programmers. Their reported access percentages were 20% for the program functions and 28% for the lines of code in their one modification task (no study period was used). This is in spite of the fact that our study used a program that was about 25% longer.

Data were further analysed according to the percentage of functions and lines of code whose existence was not known by participants. As described by Koenemann and Robertson, a function was classified as not known if a participant had not accessed the function itself and had not accessed any functions that called that function. According to this criterion, OO participants in our study did not know of the existence of 17% of the existing functions after their initial study of the program, while procedural participants did not know of the existence of 13% of the functions. (However, in the case of the OO participants, several of these "unknown" functions were likely to be constructors that are not called explicitly but that the OO participants would know were contained in a given class by definition.) After the two modification tasks in session 2, the percentage of functions whose existence was unknown declined sharply to 2–3% in both the procedural and the OO groups. Similarly, the same kind of longitudinal trend was seen in the percentage of lines of code of which the participants were unaware. Koenemann and Robertson, in contrast, reported that 24% of the functions in their program remained unknown to participants after performing a single modification.
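The "not known" criterion described above (a function is unknown if the participant neither accessed it nor accessed any function that calls it) maps directly onto a small set computation over a call graph. The following is a minimal sketch under that reading; the call graph, the function names and the `unknown_functions` helper are all hypothetical and do not come from the study's program.

```python
def unknown_functions(call_graph, accessed):
    """Return functions that were neither accessed nor called by an accessed function.

    call_graph maps each function name to the set of functions it calls.
    """
    known = set(accessed)
    for caller in accessed:
        # Reading a caller reveals the existence of everything it calls.
        known |= call_graph.get(caller, set())
    all_functions = set(call_graph)
    for callees in call_graph.values():
        all_functions |= callees
    return all_functions - known


# Hypothetical call graph for a small employee-database program.
call_graph = {
    "main": {"load_db", "print_report"},
    "load_db": {"parse_record"},
    "print_report": {"format_row"},
    "add_employee": {"parse_record"},
    "parse_record": set(),
    "format_row": set(),
}

accessed = {"main"}
unknown = unknown_functions(call_graph, accessed)
percent_unknown = 100 * len(unknown) / len(call_graph)
print(sorted(unknown), percent_unknown)
```

The sketch also shows why the constructor caveat matters: a function that is never called explicitly, such as `add_employee` here, stays unknown under this rule until it is opened directly, even if a participant would infer its existence from other knowledge.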

5. Discussion

Our results can be summarized in terms of direction and breadth of the comprehension strategy. The direction of comprehension was mixed with substantial use of top-down and bottom-up strategies. However, on the whole the bottom-up orientation was more prominent. This was reflected by higher global access rates to implementation and header files relative to documentation files. However, documentation files were used heavily during the study period, especially by the OO participants. In the study period OO participants appeared to be using a top-down strategy focusing strongly on documentation; for example, they accessed about 90% of the documentation files during the study period but only 60–70% of the header and implementation files. OO programmers may focus on documentation early in comprehension as the key information sought are


domain objects and their relationships. This information is typically clearly presented in OO documentation. Procedural participants, on the other hand, employed a more bottom-up strategy during the study period, accessing slightly over 60% of the documentation files but 85–90% of the header and implementation files. The approach of the procedural group continued to be bottom-up during the modification tasks with higher accesses to the implementation and header files than to the documentation files. Their use of the documentation files decreased consistently across the three activities.

Compared to the procedural group, the OO group showed a pattern in which their strategy changed more over time with the activity. While initially they used a top-down approach to comprehension, later their strategy shifted towards a more bottom-up orientation as they made program modifications. This shift became more pronounced as time progressed, and was most clearly shown by the sharp decline in their documentation use in the second modification. It appears that the more abstract files provided the information needed for general comprehension of the program, but once the modifications began the OO programmers shifted to a more bottom-up orientation. Thus, the modification tasks focused the information needs of the programmers on the code. While the same decline in the use of abstract information occurred in the procedural group, it was less pronounced because external documentation was used less from the start.

The use of both bottom-up and top-down strategies and the change in their relative use over time imply that the type of activity of the programmer and the accumulation of experience with the program play important roles in the direction of comprehension. In an initial phase of familiarization with the program more top-down behavior is evident. However, after the program is somewhat familiar and given the motivation of a modification task, abstract descriptions are less useful, so the behavior can be described as bottom-up. Furthermore, the programmer continuously gains greater familiarity with the program while carrying out modification tasks. The greater knowledge of the program results in even less need for abstract documentation in later modification tasks. This agrees with the results of von Mayrhauser and Vans (1996). It suggests that theories of program comprehension need to take into account both the task motivation and the programmer's longitudinal experience with the program. They may also need to take into account the programming paradigm, as suggested by our result that the OO group took a more top-down approach during initial familiarization with the program than did the procedural group. Previously, Burkhardt et al. (1998) found a top-down direction in OO experts. Our results agree with those of Burkhardt et al. when considering the early stage of familiarization with a program, but suggest a change to a bottom-up strategy with experience with a given program. While one might question whether the study period facilitated programmers using a more top-down strategy in the first modification, we submit that even if this were true, we would expect to see the same progression of strategies over time. That is, a progression would still be seen from the use of abstract information reflecting a top-down approach to an approach favoring less abstract, bottom-up information.

Globally the breadth of comprehension of participants in this study tended to be quite wide. We did not find support for the use of a highly localized strategy, as postulated by Koenemann and Robertson (1991). They hypothesized that participants only look at code or documentation if they believe it is directly relevant to the task at hand. Our

findings of high percentages of file accesses and knowledge of almost all functions by the end of the second modification suggest a broader comprehension strategy. These observations must, however, be qualified by the changes over the course of our experiment and the differences between the procedural and OO groups. In terms of the time course of the experiment, participants generally showed a greater breadth of information gathering while initially studying the program, as shown by the relatively high levels of consultation of all file types. During the modifications, the breadth became narrower and more focused, as shown by lower file access rates. Nevertheless, comparing our procedural participants to those of Koenemann and Robertson (1991), the breadth in our study remained considerably greater: familiarity with 40% of the functions and 40% of the lines of code after the modifications in our study, as opposed to 20% of the functions and 28% of the lines of code in the Koenemann and Robertson study. This discrepancy might be related to differences in the length of the programs and complexity of the tasks. Another factor with the potential to have affected the breadth of comprehension is the longer time frame of our study or the use of a warm-up period to familiarize the programmer with the program. Participants in our study made modifications to a program that they had studied and modified a week or 10 days earlier. As a result of the time lapse, there may have been some study behaviour occurring as they revisited the program to make the experimental modifications in the second session. In industry, maintenance programmers return to code that they have previously modified over time, so the need to refamiliarize oneself with a program is ongoing.

While the differences were not extreme, in our experiment the procedural group tended to exhibit a greater breadth of comprehension than the OO group. The difference in percentage of functions and lines of code visited during the study period is indicative of this: 59% of the functions and 64% of the lines of code for the procedural group, but 40% of both functions and lines of code for the OO group. The greater breadth of the procedural group is also seen in the functions and lines of code visited in the modifications and the significantly greater file accesses by the procedural group in the second modification. Two interpretations are possible for this result. First, with respect to the study period, the direction and breadth strategies may interact. That is, the top-down approach of the OO participants in the study period may have given them a sufficiently broad overview of the program, derived from the abstract documentation files, without consultation of a large number of less abstract files. Second, with respect to the later program modifications, the OO group may have had a narrower breadth because the encapsulation of the OO paradigm helped the OO participants to focus their efforts more narrowly. Such an effect of encapsulation in the OO paradigm is predicted in the OO literature (Rentsch, 1982; Cox, 1986; Jacky & Kalet, 1987; Gwinn, 1992) and was observed in an experiment involving student programmers (Henry & Humphrey, 1993). Although our result is consistent with this argument, the differences between the OO and procedural groups in breadth were relatively small. Further experimental study of this issue is needed.

6. Conclusion

This paper provides empirical data about two characteristics of the comprehension strategies used by procedural and OO programmers over repeated exposure to the same

program. Evidence was found that the comprehension strategies of procedural and OO programmers differ with respect to direction and breadth of comprehension. However, the differences between the two groups were most pronounced during initial program comprehension activities and were less evident during the modification phase. We argue that both groups employed a comprehension strategy with a predominantly bottom-up orientation when making modifications to their programs over time. However, OO programmers tended to utilize a top-down orientation during their initial program comprehension, while procedural programmers were more strongly bottom-up even during this early phase. Both groups employed a wide breadth of comprehension over the course of the study. While the breadth tended to be greater during the study period, and narrowed over the course of the modifications, it still remained relatively broad in both groups of programmers. The breadth was more pronounced in the procedural than in the OO programmers, particularly during the study period. This was in contrast to findings of a narrow scope of comprehension by Koenemann and Robertson (1991). The difference may be due to methodological differences such as the longitudinal approach or the inclusion of a study period in our experiment.

From a practical perspective, we believe that the findings of this study begin to define the processes and strategies of comprehension that procedural and OO programmers employ. Such information would be useful in several ways. First, it could be used to guide the development of program maintenance tools. For example, as indicated by our results, professional procedural and OO programmers appear to use different strategies of comprehension. A program maintenance tool would need to take this difference into account, providing different approaches for programmers using different paradigms. Another example is that documentation appears to be important in early stages of program comprehension in OO programmers. Making such information easily available early on would support the comprehension process. However, it could then be relegated to a more minor position as comprehension progressed. Another application of such information would be for training. Program maintenance training could use the strategies uncovered by this study for development of training content related to how to comprehend programs for maintenance. This could reduce training time by presenting existing strategies rather than taking the time for programmers to happen upon useful strategies on their own in a more exploratory learning environment. These strategies could also be "taught" or demonstrated to non-expert programmers.

We believe that this study has only begun to examine the effect of paradigm on comprehension. More focused research into the differences in comprehension between OO and procedural programmers is needed. Two complementary kinds of studies are called for: (1) laboratory studies using programs as large as possible for an experimental study, along with a longitudinal approach of repeated exposure to the program and (2) observational studies of programmers comprehending large programs in industry. As this study was exploratory in nature, one of our objectives was to identify trends that point towards areas for further detailed study. Our study suggests several interesting avenues for further research. One such area, made increasingly possible by evolutionary developments in computer science education and the computer industry, is a replication of this study with OO experts who do not have extensive procedural training or experience. This would address the potential effect of transfer and interference in the comprehension strategies used by our OO participants. The effect of time spent working


with a program on the breadth of comprehension could be further investigated by varying the time variable and investigating the effect on comprehension strategy. Replication of this study with different program materials is also necessary to evaluate the generality of the findings.

Paper accepted for publication by Associate Editor, Dr Ruven Brooks.
