F1 11/18/2005 10:00 AM LET'S MAKE BUGS MISERABLE Anibal Sousa


BIO PRESENTATION PAPER F1 11/18/2005 10:00 AM LET'S MAKE BUGS MISERABLE Anibal Sousa Microsoft Corporation International Conference On Software Testing Analysis & Review November 14-18, 2005 Anaheim, CA USA


SLIDE 1

BIO PRESENTATION PAPER International Conference On Software Testing Analysis & Review November 14-18, 2005 Anaheim, CA USA

F1

11/18/2005 10:00 AM

LET'S MAKE BUGS MISERABLE

Anibal Sousa Microsoft Corporation

SLIDE 2

Anibal Sousa

Anibal Sousa is a Test Manager for Microsoft in the Microsoft Business Solutions division, working on the Business Contact Manager for Outlook product. He has about 15 years of experience in many IT departments, from software development to customer support. Anibal joined Microsoft in 1998, and since then has worked in the Testing discipline, shipping many different products, such as Exchange Conferencing and Instant Messaging Servers, GDI+, and Business Contact Manager for Outlook Versions 1 and 2. He is passionate about Testing and Quality Assurance, always looking for ways to improve the software development and testing processes, with a focus on new methodologies and practices such as Test Templates, Risk Based Testing and Model Based Testing. Anibal has a Master of Science degree in Computer Science from PUC/RJ in Brazil, and loves his family and soccer.

SLIDE 3

Let’s Make Bugs Miserable

Anibal Sousa

Anibal.Sousa@microsoft.com
Test Manager
Microsoft Corp.

SLIDE 4

Scope of this presentation - Agenda

  • Defining a bug database
  • Spec critique process
  • Pre check in tools
  • Looking for more and different bugs
  • Bug bar and Triage process
  • Bug charts and data analysis
  • Making best use of bugs found by customers

SLIDE 5

Who am I?

  • 8 years working for Microsoft in Test area
  • Products I worked on:
      • Instant Messaging Server 2000
      • Conferencing Messaging Server 2000
      • GDI+ versions 1.0 and 1.1
      • Business Contact Manager versions 1.0 and 2.0
  • I will be talking about ideas, processes and practices developed and tested during these 8 years

SLIDE 6

Motivation

  • Bugs will always be there!
  • Can we prevent the creation of bugs?
  • How do we keep control of bugs?
  • Are we fixing the right bugs?
  • Are they being fixed correctly?
  • Where should I look for more bugs?
  • What bugs did I miss?

SLIDE 7

First, you need a bug database, but ...

  • Can you afford to buy one? If not, you could use Excel, SQL, etc.
  • After you have it, what kind of data will you store?
  • It should simplify the bug management process, for you and your team members
  • And after you start collecting data, what can you do with it?

SLIDE 8

Life cycle of a bug

[Diagram: bug life cycle, states and the transitions leaving them]
  • Bug is found: bug gets into the system
  • Bug is open but not assigned: Triage inspects bug and assigns to developer
  • Active bug assigned to developer: bug gets resolved
  • Resolved bug: bug is verified then closed, or bug was not resolved correctly and goes back to Triage for orientation
  • Closed bug

SLIDE 9

Some key fields the DB should have:

  • Title
  • Repro Steps
  • Feature area
  • Severity
  • Status
  • Type
  • How found
  • Test Case ID
  • Release
  • Assigned to (owner)
  • Priority
  • Resolution
  • History log
  • Opened by and date
  • Resolved by and date
  • Closed by and date
  • UA Impact
  • Triage status
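
For teams building a homegrown tool, the field list above could map onto a record type along these lines. This is only a sketch: the field names follow the slide, but the Python types, the default values and the `log` helper are assumptions made for illustration, not a description of any Microsoft tool.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class BugRecord:
    # Field names mirror the slide; types and defaults are assumptions.
    title: str
    repro_steps: str
    feature_area: str
    severity: int                      # numeric, preset range such as 1 to 4
    opened_by: str
    opened_date: date
    status: str = "Active"             # Active, Resolved or Closed
    bug_type: str = "Product bug"
    how_found: str = ""
    test_case_id: Optional[str] = None
    release: Optional[str] = None
    assigned_to: str = "Triage"        # owner; new bugs start with Triage
    priority: Optional[int] = None
    resolution: Optional[str] = None
    resolved_by: Optional[str] = None
    resolved_date: Optional[date] = None
    closed_by: Optional[str] = None
    closed_date: Optional[date] = None
    ua_impact: bool = False
    triage_status: str = ""
    history: List[str] = field(default_factory=list)

    def log(self, entry: str) -> None:
        """Append an entry to the history log (entries are never edited)."""
        self.history.append(entry)


# Hypothetical example bug.
bug = BugRecord(
    title="Import fails on empty CSV file",
    repro_steps="1. Open Import. 2. Pick an empty .csv file. 3. Click OK.",
    feature_area="Import Export",
    severity=2,
    opened_by="tester1",
    opened_date=date(2005, 11, 18),
)
bug.log("Opened and assigned to Triage")
```

Keeping the dates and the "by" fields separate from the free-text history is what later makes weekly charts and per-owner queries cheap to produce.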

SLIDE 10

Spec Critique/Review, driven by Test

[Diagram: spec critique flow]
  • Input: requirements, feedback, surveys, bugs, etc.
  • 1 - PM creates draft of Spec based on requirements and additional information (Spec Draft - Ready for Review)
  • 2 - Feature team provides input to the Spec document (Reviewed Spec - Ready for Critique)
  • 3 - Team members review the spec, and critique meeting occurs
  • Are there open issues with Spec document? 4a - No (Spec Ready - Ready for Coding); 4b - Yes

SLIDE 11

Killing bugs before they get you

  • Code Review (dev and test)
  • Buddy Build and Buddy Test
  • Pre Check in tools
  • Automated daily builds
  • Automated Code Coverage builds
  • Automated tests and tools execution

SLIDE 12

Looking for different bugs (and the bugs you missed)

  • Focus and Ad hoc days
  • Bug hunts
  • Bug bashes
  • Feature rotation between testers

Different eyes will find different problems!

SLIDE 13

Bug Bar – motivation (1/2)

  • 5000+ bugs during product development life cycle
  • Subjective and not deterministic process for triaging bugs
  • Testers were confused about what kind of bugs were still being accepted – they wanted to be able to focus on the right set of problems to investigate
  • Bad for morale and team engagement

SLIDE 14

Bug Bar – motivation (2/2)

  • Bugs will occur during the whole project
  • Quality should go up and not down
  • Risk of regression gets higher with time
  • Not all features get ready at the same time

SLIDE 15

Bug Bar is the proposal!

Define the bug categories:

| Impact | Name | Definition |
|---|---|---|
| 1 | Ship Stopper | Reasonable region of a feature is not working as expected because of the bug, and there is no workaround. We can not ship the product with this bug! |
| 2 | QFE | Serious bug. If a customer finds this bug, we will have to issue a QFE (patch). |
| 3 | PSS | This bug will cause Customer Support calls. Publishing KB article is not enough, since workaround may not exist or be too complicated. |
| 4 | Knowledge Base (KB) article | This bug might be very visible and affect functionality significantly. In case it has a workaround, it is not so obvious or simple. |
| 5 | Bug | This is a bug, but might be obscure, rare or have small impact. Normally it has an easy workaround. |
| 6 | Improvement | This bug is hard to find, not noticeable, causes minor problems (or none) and can be ignored. |
| 7 | Wish | These might not be even considered bugs. |
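
One way to make these categories usable during triage is to treat each feature's bar as a cutoff on the impact number. The sketch below assumes that a bug is accepted when its impact number is at or below the feature's current bar; that comparison rule, the `bug_bar` values and the `meets_bar` helper are illustrative assumptions, not something the slides spell out.

```python
from enum import IntEnum


class Impact(IntEnum):
    # Numbers and names follow the bug categories on the slide.
    SHIP_STOPPER = 1
    QFE = 2
    PSS = 3
    KB = 4
    BUG = 5
    IMPROVEMENT = 6
    WISH = 7


# Hypothetical per-feature bars: the weakest impact still accepted.
bug_bar = {
    "Reports": Impact.WISH,   # e.g. a new feature, everything is accepted
    "Forms": Impact.KB,       # e.g. a stable feature, bar already raised
}


def meets_bar(feature: str, impact: Impact) -> bool:
    """Return True if a bug of this impact is still accepted for the feature."""
    return impact <= bug_bar[feature]


print(meets_bar("Reports", Impact.IMPROVEMENT))  # accepted under this bar
print(meets_bar("Forms", Impact.BUG))            # rejected under this bar
```

Because each feature has its own entry, the bar can be raised for one area without touching the others, which matches the idea that features mature at different times.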

SLIDE 16

Bug Bar @ Alpha stage

  • Each feature is considered independently
  • Each feature will have its bar changed thru time
  • When deciding bar, consider: new/old feature, risk and impact of changes, development stage, coverage, etc.

[Chart: bug bar per area. Vertical axis: Ship Stopper 1, QFE 2, PSS 3, KB 4, Bug 5, Improvement 6, Wish 7. Horizontal axis (Areas): User Assistance SBA Integration PDA Performance Forms Import Export Reports]

SLIDE 17

Bug Bar @ Beta and RTM stages

[Charts: two bug bar charts per area. One vertical axis runs from Ship Stopper 1 to KB 4, the other from Ship Stopper 1 to Bug 5. Horizontal axis on both (Areas): User Assistance SBA Integration PDA Performance Forms Import Export Reports]

SLIDE 18

Bug triage (aka, war team)

[Diagram: bug flow between TESTER, DEVELOPER and TRIAGE/WARTEAM]
  1 – Tester opens bug and assigns to Triage
  2.1 – Triage assigns the bug back to tester for more information, or resolves the bug, applying the bug bar
  2.2 – Triage assigns the bug to developer for investigation
  3 – Developer assigns the bug back to Triage, with all necessary information to help Triage, like impact, risk, code reviewer and tester, etc.
  4.1 – Triage assigns the bug back to developer for check-in
  4.2 – Triage decides to not fix the bug, so assigns to Test for follow-up
  5 – Developer checks in the fix and assigns the bug to tester for closure
  6 – Tester verifies the bug is fixed, writes automation, updates Test Plan, etc., and then closes the bug

SLIDE 19

After the bug bar process …

  • Clear communication of results
  • Team engagement and commitment – high morale
  • Number of regressions (metrics) went down significantly
  • Number of opened bugs did not go down, but accepted bugs did
  • Product was shipped in timely manner

SLIDE 20

Metrics (using Excel and Pivot Tables)

[Chart: bugs assigned per developer. Axes: Developers, Bugs assigned; series: 4, 3, 2, 1]
[Chart: Weekly Resolution of bugs - Linear. Axes: Weeks, Bugs resolved; series: Won't Fix, Postponed, Not Repro, Fixed, External, Duplicate, By Design]

  • Who is working on the bugs?
  • Which area has the most bugs?
  • How are we resolving the bugs?
  • What is the quality of the bugs?
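
The same pivot-table idea can be sketched outside Excel. The example below counts resolved bugs per week and resolution, which answers "How are we resolving the bugs?". The `resolved` rows are made-up sample data standing in for an export from the bug database.

```python
from collections import Counter

# Hypothetical export: (week the bug was resolved, resolution).
resolved = [
    (1, "Fixed"), (1, "Duplicate"), (1, "Fixed"),
    (2, "Fixed"), (2, "Won't Fix"), (2, "Not Repro"),
]

# Counter keyed by (week, resolution) plays the role of the pivot table.
pivot = Counter(resolved)
weeks = sorted({week for week, _ in resolved})
resolutions = sorted({res for _, res in resolved})

for week in weeks:
    row = {res: pivot[(week, res)] for res in resolutions}
    print(week, row)
```

Swapping the key for the owner or the feature area gives the "who is working on the bugs" and "which area has the most bugs" views from the same data.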

SLIDE 21

More metrics – Bug Trends

[Chart: Bug Trends. Axes: Weeks, Bugs; series: Opened, Resolved, Closed, Active, Poly. (Active)]

When are we going to be able to ship the product?
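
A minimal sketch of how the Active series could be derived from weekly counts, assuming "active" means opened but not yet resolved and ignoring reopened bugs. The weekly numbers are invented for illustration, and the sketch does not attempt the polynomial trendline shown on the chart.

```python
from itertools import accumulate

# Hypothetical weekly counts taken from the opened and resolved dates.
opened_per_week = [30, 45, 40, 25, 10]
resolved_per_week = [5, 20, 35, 40, 30]

# Running totals give the Opened and Resolved curves.
total_opened = list(accumulate(opened_per_week))
total_resolved = list(accumulate(resolved_per_week))

# Active bugs at the end of each week.
active = [o - r for o, r in zip(total_opened, total_resolved)]
print(active)
```

When the active curve peaks and then keeps falling week after week, the trend starts to support a ship date estimate.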

SLIDE 22

Release Criteria

  • You can use bugs as part of your Exit and Release Criteria

| Criteria/Goals | M2 | M3 | Alpha | Beta | RC1 | RTM |
|---|---|---|---|---|---|---|
| Pri 1 Bugs | 0 active | 0 active | 0 active | 0 active | 0 active | 0 active |
| Pri 2 Bugs | NA | NA | 0 active | 0 active | 0 active | 0 active |
| Pri 1 TC Pass Rate | 100% | 100% | 100% | 100% | 100% | 100% |
| Pri 2 TC Pass Rate | 60% | 80% | 90% | 100% | 100% | 100% |
| Pri 3 TC Pass Rate | 40% | 50% | 70% | 80% | 90% | 90% |
| Pri 4 TC Pass Rate | 40% | 50% | 50% | 60% | 80% | 80% |
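
Criteria like these are easy to check mechanically. The sketch below encodes the M3 and Alpha goals as read from the slide's table (assuming its milestones run from M2 to RTM) and reports which ones are unmet. The function name `unmet_criteria` and the sample pass rates and bug counts are hypothetical.

```python
# Minimum test-case pass rate (%) per test-case priority.
MIN_PASS_RATE = {
    "M3": {1: 100, 2: 80, 3: 50, 4: 50},
    "Alpha": {1: 100, 2: 90, 3: 70, 4: 50},
}
# Bug priorities that must have zero active bugs (NA entries are left out).
ZERO_ACTIVE = {"M3": [1], "Alpha": [1, 2]}


def unmet_criteria(milestone, pass_rate, active_bugs):
    """Return a list describing every criterion the milestone does not meet."""
    problems = []
    for pri, minimum in MIN_PASS_RATE[milestone].items():
        if pass_rate.get(pri, 0) < minimum:
            problems.append(f"Pri {pri} TC pass rate below {minimum}%")
    for pri in ZERO_ACTIVE[milestone]:
        if active_bugs.get(pri, 0) > 0:
            problems.append(f"Pri {pri} bugs still active")
    return problems


# Hypothetical status of a build being considered for Alpha.
result = unmet_criteria(
    "Alpha",
    pass_rate={1: 100, 2: 92, 3: 65, 4: 55},
    active_bugs={1: 0, 2: 3},
)
print(result)
```

An empty list means the milestone's exit criteria are met; anything else is a ready-made agenda for the next triage meeting.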

SLIDE 23

Looking for bugs after they are released

  • Microsoft Customer Experience
  • Microsoft Error Reporting (Watson)
  • Customer Support
  • Internet newsgroups

SLIDE 24

Questions?

Anibal Sousa
Anibal.Sousa@microsoft.com
Anibal_Sousa@hotmail.com

Thank you!

Check the white paper for more information about this presentation, or feel free to contact me at any time.

SLIDE 25

Let’s Make Bugs Miserable!

Best practices, recommendations and ideas to deal with software development bugs during the product life cycle

Anibal Sousa
Anibal.Sousa@Microsoft.com
Test Manager
Microsoft Corporation
StarWest 2005

SLIDE 26

List of Contents

Introduction
Why does this document exist?
I got a bug, then what?
Bug prevention is important!
Few ideas to prevent bugs from getting into the product
    Code Review, Buddy Build and Buddy Testing
    Pre-Check in tools
    Automated daily builds and Code Coverage builds
    Automated tests execution and code validation tools
Looking for bugs in different ways
    Bug hunts, Focus and Adhoc days
    Bug Bashes
    Feature rotation
Why should we have a bug bar?
    Motivation
    Process
    Results
Too many bugs - Triage/War team process
Simple bug database analysis and charts
Bugs and their relation to Exit and Release Criteria
Few more ways to get bugs directly from customers
    Instrumentation (Customer Experience and Microsoft Error Reporting, a.k.a. Watson)
    Customer Support
    Newsgroups
Conclusion

SLIDE 27

Introduction

During years of experience shipping software at Microsoft, I was introduced to many different best practices, processes and ideas; some were modified by me and my team, and new ones were proposed. This document presents many of these according to the different stages of the Software Development Lifecycle (SDL).

Note: This document concentrates on how to reduce the impact of bugs that already exist, touching only lightly on techniques that could be deployed to prevent them from occurring. The focus is on what the Test Manager and the whole Test Team can do to increase the quality of the software, by taking a key part in, or driving, most of the processes to achieve this.

Why does this document exist?

There are many techniques out there to prevent bugs from happening in real life, but the truth is that they will always be there, one way or another. This is a fact which can’t be ignored. Starting from this point, we have to ask, and make sure, that:

  • Is there anything we can do to prevent bugs from occurring at all, even before they become code bugs?
  • We are finding as many bugs as we can, and we are also finding them as early as possible;
  • Bugs are being taken care of in a correct manner (example: not all bugs have to be fixed);
  • The bugs that are being fixed are being fixed in a correct way, and we ensure that they won’t come back to haunt us in the future;
  • Are there siblings of the bugs found by your team? What kind of bugs does your product have? What information can you extract from past bugs?

These questions and goals are the motivation of this document. We know that product bugs will occur, and that we will have to deal with them. So we want to be proactive, efficient and thorough. Bugs are not our friends, so we want to make their life a short one and a bad experience overall for them. Let’s make their lives Miserable!

I got a bug, then what?

Since you know that bugs will be in your way, it is important that you have a way to store, catalog and have easy access to all bugs found during the SDL. There are commercial products available that provide this functionality, but you could also use homegrown tools, or simply a database to store them. The important aspects to be considered are:

1) What information you will store for every bug found. If you don’t save good data, the data might become useless;
2) What you will do with this information as it starts to grow. Now that you have data, you need to act on it;
3) The tool should allow the whole team to use it, and simplify the workflow, maximizing efficiency.

There is commercial software that provides this functionality, but as we are in cost reduction times, I will present a list of some of the fields that the bug database should have, in case you decide to develop your own system, and also a flow that the bug database front end should provide to the team members. In the list below, I present 2 columns: the name of the field and a brief explanation of why I think you should have this field.

| Field Name | Explanation |
|---|---|
| Title | Short but precise description of the bug. Ideally, just reading the title of the bug should give a good idea about the bug. |
| Repro steps | List of steps that can reproduce the bug, hopefully on any machine and by any person. Any additional data to help repro the bug should be in the bug report; developers will thank you for this. |
| Test Case ID | The test case that led the tester to find the bug. |
| Severity | This indicates how bad the bug is. Normally it is numeric and has a preset range of values, like 1 to 4. |
| Problem Type | If you use the database to store different things, like product bugs, work items, spec defects, etc., this field could become necessary. |
| Description / History | New data found during the investigation of the bug, or changes in its state, should be logged in the history of the bug. This field could be append-only. |
| Feature / area | The feature of the product where the bug was found. This is important to identify the most problematic areas (Pareto analysis). |
| Release | The bug might have a specific point in time when it should be fixed, like a milestone or special event. This field could be used for this goal. |
| Opened by | The unique identifier of the person who entered the bug in the database. |
| Opened date | The date the bug was entered. This field will be important to track bug activity and future trends. |
| How found | As a way to improve testing practices, this field would store how the bug was found, like by manual or automated testing, ad-hoc or bug bash, test pass, etc. |
| Source | It might be necessary to keep data about who really found the problem, like a previous customer, alpha or beta tester, etc. |
| Assigned to / Owner | During the many different stages in the bug’s life, many different people will be working on the bug, and this is where this field becomes necessary. By querying this field, every team member can check which bugs are waiting for their input or action. |
| Status | We need to know if the bug is active, resolved or closed. |
| Priority | By comparing the priority of other bugs, anyone can determine which ones should be fixed first. Normally it is set by senior team members. |
| Resolved by | The person that solved the problem or decided its fate, like triage. |
| Resolved date | The date when it was resolved. This could be used to track progress and estimate the future bug resolution trend. |
| Resolution | How was the bug solved? Some possibilities are: Fixed, Not Repro, Won’t Fix, Dupe, External, By Design, Postponed, etc. |
| Closed by | The person that closed the bug, after assuring it was resolved correctly. |
| Closed date | The date the bug was closed in the system. |
| UA impact | If your product ships in many different languages and/or has documentation, you might need a way to track changes in the UI of the product, which would have an impact on the documentation or localization efforts. |
| Triage status | This field will be explained in more detail later, but it could be used to track progress while the bug is still active. |

Also, to make the access and control of bugs simpler, the tool should make it easy for team members to interact with the bugs throughout the possible stages of a product bug. Here is a common flow of a bug:

[Diagram: bug life cycle, states and the transitions leaving them]
  • Bug is found: bug gets into the system
  • Bug is open but not assigned: Triage inspects bug and assigns to developer
  • Active bug assigned to developer: bug gets resolved
  • Resolved bug: bug is verified then closed, or bug was not resolved correctly and goes back to Triage for orientation
  • Closed bug
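
A homegrown front end could enforce this flow with a small state machine. The sketch below is one reading of the diagram: the state names, the event names (`triage_assigns`, `resolve`, `verify`, `reject_fix`) and the choice to send a badly resolved bug back to the unassigned state are assumptions made for illustration.

```python
from enum import Enum


class State(Enum):
    OPEN = "open, not assigned"
    ACTIVE = "active, assigned to developer"
    RESOLVED = "resolved"
    CLOSED = "closed"


# Allowed transitions: (current state, event) -> next state.
TRANSITIONS = {
    (State.OPEN, "triage_assigns"): State.ACTIVE,
    (State.ACTIVE, "resolve"): State.RESOLVED,
    (State.RESOLVED, "verify"): State.CLOSED,
    (State.RESOLVED, "reject_fix"): State.OPEN,  # back to Triage
}


def advance(state: State, event: str) -> State:
    """Apply an event, refusing anything the flow does not allow."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed from {state.name}") from None


# A bug whose first fix was rejected, then fixed properly and closed.
state = State.OPEN
for event in ["triage_assigns", "resolve", "reject_fix",
              "triage_assigns", "resolve", "verify"]:
    state = advance(state, event)
print(state.name)
```

Rejecting illegal transitions in the tool, rather than by convention, keeps the status data trustworthy when it is later used for charts.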

Some additional features the tool could provide are: ability to query the bugs you opened, bugs assigned to you, bugs per product feature and/or milestone, etc. Notifications, like when a bug gets assigned to you, or someone modified a bug you opened are great, but not mandatory.

Bug prevention is important!

It is well known that the sooner we can identify and fix bugs, the cheaper and less risky it will be. Because of this, every product team should try to identify them as soon as possible, and a very early spot where we can find bugs is in the product specifications and requirements. One procedure deployed in my current group at Microsoft that was very successful was to review and critique the specs written by the Project Managers (PM). It is important to note that this process was driven and owned by the Test team. Here is a series of steps showing this process and its main activities:

1) PMs collect the customer requirements and translate them into a product specification;
2) The Feature Team (consisting of at least one member of each discipline: PM, Development and Testing) participates actively in creating the document. Later it reviews the spec internally and brings it to a quality level where it can be reviewed by team members outside of the feature team (Ready for Critique);
3) When the spec is at Critique Level, the tester in charge of this feature schedules a critique meeting. He/she also sends the document in advance to team members so they can read and critique the document. They open bugs and assign them to the PM prior to the meeting. Ideally, before the meeting, the PM will resolve all open issues in a correct manner, often updating the spec with more information or clarifications. Remaining issues wait for the meeting;
4) The meeting occurs, where all open issues and resolved issues are discussed. At this moment 2 things can happen:
    a) There are no more open issues: at this moment the development team can start coding, and the tester should continue with his test plan document (ideally, it should be in progress already);
    b) There are still issues: the PM needs to resolve them, with the help of the feature team. After updates are completed, step 3 is repeated.

[Diagram: spec critique flow]
  • Input: requirements, feedback, surveys, bugs, etc.
  • 1 - PM creates draft of Spec based on requirements and additional information (Spec Draft - Ready for Review)
  • 2 - Feature team provides input to the Spec document (Reviewed Spec - Ready for Critique)
  • 3 - Team members review the spec, and critique meeting occurs
  • Are there open issues with Spec document? 4a - No (Spec Ready - Ready for Coding); 4b - Yes

Even though these steps look simple, it is hard to stick to them. To keep the process on track, it is critical that it is driven by the Tester, that he/she is empowered to declare the spec ready for coding (this way the feature won’t start being coded until the open issues are resolved), and that the team is aware of the process and engaged in making it successful. But nothing prevents the tester from starting to write his Test Plan or Spec while this process is occurring: it can happen in parallel, which could trigger good discussions during the review and critique meetings.

Few ideas to prevent bugs from getting into the product

Even though we perform Spec Reviews at work, bugs always slip through. But there are other layers of defense that everybody could deploy. Below I present some that many different groups at Microsoft currently have in place:

Code Review, Buddy Build and Buddy Testing

Even though not all testers have access to the product’s source code, or are even familiar with programming, executing a buddy build (building the product on your machine with the proposed changes from the developer) and buddy testing it (after you build the product, running some tests to verify that it works as expected and no regressions were introduced) are good practices most testers could perform. In case the tester knows how to write code, get him/her to do code reviews for the developers. Most developers will appreciate having more eyes on their code, making sure it meets the standards we require. There are only advantages to a Code Review, whether from another developer or from a tester. Make it a practice and your team will appreciate it in the future.

Pre-Check in tools

Another good idea is to build the product and then run automated tests before the developer’s changes get into the product build system. Basically, if your team can afford some machines and more test code development, you should be able to have automated tests that run for every check in into the product source code tree before it really occurs. In case any test fails, the check in is aborted, so the developer has to fix either his code or the automated test. A possible alternative, in case you can’t afford it, would be to convince your development team to do it ☺. This will actually benefit them more than you.
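
The gate described above can be sketched in a few lines. This is a minimal illustration, not the tooling used at Microsoft: `gate_checkin` is a hypothetical helper, and the two stand-in commands simply simulate a passing build and a failing test run so the example is self-contained.

```python
import subprocess
import sys


def gate_checkin(commands):
    """Run each command in order; abort the check in at the first failure."""
    for cmd in commands:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print("Check in aborted, failed step:", cmd)
            return False
    return True


# Stand-in commands; a real gate would call the team's build and test tools.
fake_build = [sys.executable, "-c", "print('build ok')"]
failing_tests = [sys.executable, "-c", "raise SystemExit(1)"]

accepted = gate_checkin([fake_build, failing_tests])
print("accepted:", accepted)
```

The same function could be scheduled after a nightly build, so the results are waiting when the team arrives in the morning.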

Automated daily builds and Code Coverage builds

But in case the idea above is not feasible, I still have a few more for you. One serious problem is a build break: there is no product to be tested, which makes the whole team go into waiting mode. Also, you should make every effort to have recent product builds; this way you maximize the results of your test team, as they will be testing the most recent version of the product, not an outdated one. So what I am suggesting is to have builds as often as possible, like daily builds. And if you implement this idea, why not do the same for Code Coverage builds? This way your test team can have them easily available whenever they want to perform code coverage analysis.

Automated tests execution and code validation tools

And if you want to take an extra step: produce the build automatically at night, like at midnight or so, and then also run all or some of the automated tests you might have, including development ones. You could also run tools that check your source code for bugs and bad practices, like FxCop. This way, when you arrive at work the next day, you will not only have a new product build to test, but also results from the automation, which will indicate how good or bad it is. If you are able to deploy any (or all, hopefully) of these ideas, you will be taking a few steps towards a higher quality product, and your test team will become more efficient and less frustrated too.

Looking for bugs in different ways

It is common to have product features assigned to individual testers, who conduct spec reviews, write the test spec, execute it and provide support to the feature developer. But if you have just the same group of people looking at the feature, they could get “too” used to it, ignoring problems and bugs that someone with fresh eyes would not let slip through. The techniques below focus on this concept, and normally produce significant results when deployed in working groups.

Bug hunts, Focus and Adhoc days

There is a chance that some testers end up spending too much time developing test automation, ignoring adhoc and/or manual testing. It is important to keep a good balance, and because of this we used bug hunts and adhoc days while testing past MS products. This practice sets aside specific time where test team members simply play with, test and have fun testing the product, focusing on specific areas or picking random areas, depending on the need and stage of the product. For example, say a big check in occurred recently and the feature tester is worried about possible regressions: this could become the focus of a bug hunt.

Bug Bashes

Similar to the bug hunt, the major difference in the bug bash is that it involves the whole team, not just testers: you should get developers, program managers, documentation writers, product support folks, etc. to participate in the bug bash. Again, depending on the situation, it could be focused or free for all. But to maximize its results, I’d recommend preparing for it and getting people engaged and motivated for it: decide its scope, send invitations in advance so participants can set time aside for it, provide prizes for top bug finders, and send a report with the results after it is completed. If you do it right, everybody will see the value that bug bashes bring, and will ask for future ones (get the prizes, and people will demand more bug bashes in the future!).

Feature rotation

One more practice that often produces good results is to shuffle the feature assignments in the test team. It might be hard in some cases, where the ramp up would be too time consuming, but in many cases it is a good idea, since the feature tester might be tired and too used to the feature. Remember that new and fresh eyes will likely see bugs you missed for no good reason. At this moment, I hope the message I am trying to deliver is clear: even though the feature tester will likely be the person that knows and understands the feature the best, getting other people to look at it will only increase the team’s confidence in the feature and the tester (or present you with a problem to solve, but at least while you are still ahead in the game).

Why should we have a bug bar?

In this section, I will talk about one common problem that will occur in any product development team: do we have to fix this bug? Is there any difference if the same bug is found in the early or late stages of the SDL? How can we get the whole team to understand and agree on a process to determine the fate of product bugs? If this is interesting to you, go ahead and check this section.

Motivation

On every product I worked on at Microsoft, there was always discussion about “why isn’t this bug being fixed?” It is common to see developers wanting to fix bugs just because they are easy to fix, or testers getting frustrated because the problems they find don’t get fixed in the product. If this is not the case in your group and/or company, what about these other issues:

  • Every product will have bugs, no matter how good and tight the team and process are. As an example, during BCM V1 development, more than 5000 bugs were found;

  • Are testers confused about what kind of bugs are being fixed or rejected by triage team? If there is confusion, how can they focus on the right set of problems to investigate?

  • Is the process for triaging bugs non deterministic or random, or maybe even worse, like subjective to role or status of triage members in the group?

  • Do you think the morale and engagement of team members is affected by the problems above?

  • Do you agree that bugs will occur during the whole project, and that product overall quality should go up and not down?

  • Do you agree that as we get deeper in the development life cycle, the risk of regression gets higher? And do you believe that not all features will get ready at the same time? And that it will be hard to have one unique rule that applies to all features in the product because of the points above?

  • Last but not least, are you considering all aspects when evaluating all bugs, like the cost of fixing it and testing it, risk of fixing or not fixing it, impact to other teams, like User Assistance, Product Support, Localization and Translation, Marketing, etc?


So, how can we solve some, if not all of the problems above? Having a clear bug bar and enforcing it will significantly help. If you think so, keep on reading.

Process

There was a lot of ambiguity while triaging bugs, and this was causing problems to the morale of the group, so whatever solution we wanted to deploy would require the buy-off and agreement from the whole team. As a first step, I created the table below, where I made very clear: what was the impact of the bug, a nickname that would map to the category of the bug, and a quick definition of the category.

| Impact | Name | Definition |
|---|---|---|
| 1 | Ship Stopper | Reasonable region of a feature is not working as expected because of the bug, and there is no workaround. We can not ship the product with this bug active. |
| 2 | QFE | Serious bug. If a customer finds this bug, we will have to issue a QFE (patch). |
| 3 | PSS | This bug will cause Customer Support calls. Publishing KB article is not enough, since workaround may not exist or be too complicated. |
| 4 | Knowledge Base (KB) | This bug might be very visible and affect functionality significantly. In case it has a workaround, it is not so obvious or simple. |
| 5 | Bug | This is a bug, but might be obscure, rare or have small impact. Normally it has an easy workaround. |
| 6 | Improvement | This bug is hard to find, not noticeable, causes minor problems (or none) and can be ignored. |
| 7 | Wish | These might not be even considered bugs. |

Another very useful point in the table was the use of colors, making it very easy for team members to understand the impact of the bug to customers and Microsoft. This table was explained to the whole team and put on the team’s internal web site, where it could be accessed by anyone, including Managers and the War team, and where it could be used as reference during the triage process.

Now that the categories are set, the next step is to determine, for each feature of the product, which category it is in at this moment of the SDL. There are 2 crucial aspects here:

  • Each feature should be considered independently, as some will be new and others will be legacy features; some will be under hard development or changes, while others will be stable, etc.;

  • Through time, each feature will change its status, hopefully becoming more stable, and consequently tightening the type of bugs being accepted.

So, for each product feature you need to determine its stage, which will work as a filter mechanism for future bugs. You should consider many different aspects when making this decision, like current milestone in the SDL, development stage of the feature, if it is a new or legacy feature, risk and impact of making changes, etc. Below I have an example with some features and their corresponding categories. You can see that one feature, SBA integration, was at the Improvement level, since it was a new one that was still being developed, while others, like Reports and Import Export, were at QFE level, mostly because they were old features that we did not want to touch (unless big problems were found). This way, more bugs were being accepted in the first one than in the last two. At Microsoft we use the term “raise the bar” often, meaning that we are taking fewer bugs as time goes by, and the table allows members to see this in a graphical way too.


[Table: bug bar by feature area. Columns (Areas): Reports, Import Export, Forms, Performance, PDA, SBA Integration, User Assistance. Rows: 1 Ship Stopper, 2 QFE, 3 PSS, 4 KB, 5 Bug, 6 Improvement, 7 Wish.]

We stored this table in a web page, with public access to all team members. And it was often used during War team and triage meetings (more on it later). This way it was easy, fast and straightforward to decide what to do when we were faced with bugs. After triage was done, the results and resolutions made during the meeting were sent to the team, with no real big surprises to team members, since they knew what to expect. And every time any feature status changed, a new table was sent to the team and pushed to the web page for future triage meetings – a very clear and transparent process.

Next I added 2 more tables with the same feature set that show how the bug bar was raised and evolved through time, approximately around the Beta and close to the RTM milestones (as time passed, we took fewer bugs for all features).

Beta:

[Table: bug bar by feature area at Beta. Same areas and rows as the table above.]

RTM:

[Table: bug bar by feature area at RTM. Same areas and rows as the table above.]
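The bug bar works as a simple filter, which can be sketched in a few lines of Python. This is only an illustration, not the tooling the team used: the names `Impact`, `bug_bar`, `meets_bar` and `raise_bar` are hypothetical, only the three areas whose levels are stated in the text are included, and the rule that the bar can only tighten is an assumption based on the description of "raising the bar".

```python
from enum import IntEnum


class Impact(IntEnum):
    """Bug bar categories from the impact table; lower number = more severe."""
    SHIP_STOPPER = 1
    QFE = 2
    PSS = 3
    KB = 4
    BUG = 5
    IMPROVEMENT = 6
    WISH = 7


# Current bar per feature area (hypothetical structure). Only the areas whose
# level is stated in the text are listed here.
bug_bar = {
    "SBA Integration": Impact.IMPROVEMENT,  # new feature, still being developed
    "Reports": Impact.QFE,                  # legacy feature, avoid touching it
    "Import Export": Impact.QFE,
}


def meets_bar(area: str, impact: Impact) -> bool:
    """A bug is accepted when it is at least as severe as the area's bar."""
    return impact <= bug_bar[area]


def raise_bar(area: str, new_level: Impact) -> None:
    """Tighten the bar for an area (assumption: it never gets looser)."""
    if new_level > bug_bar[area]:
        raise ValueError("the bar can only be raised, not lowered")
    bug_bar[area] = new_level
```

With these values, an impact 5 bug meets the bar in SBA Integration but not in Reports. After `raise_bar("SBA Integration", Impact.PSS)`, the same bug would no longer meet the bar in SBA Integration either.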

Results

By using this procedure, we were able to ship the product on time, which was a big concern due to the amount of bugs that were being found during the testing stage. The morale of the team increased significantly, since everybody felt important and that the process was fair, focusing on the customer and shipping the best product we could within the business limitations we had. Team members were engaged and continued looking for and opening bugs, even though they knew that some of them would not meet the bar set at the moment for a specific feature – the number of opened bugs did not go down, but the number of fixed bugs did, as expected, since we were not fixing all bugs as before. By applying this bar we also reduced the number of regressions, since we were not taking as many bugs as we otherwise would, and we considered the risk and state of the feature when defining the bar. Overall, it proved to be an effective but simple process to handle product bugs.

Too many bugs - Triage/War team process

But even though we had defined the bug bar, the triage process is where the rubber meets the road, and where bugs were ultimately accepted or rejected. You might be asking: what is this War team or triage team? What do they do? Simply speaking, it is a group of people, with representatives from all disciplines (test, dev, program manager, user assistance, etc.), that makes important decisions about the product and work, including which bugs will or won’t get fixed, among many other things, like schedule, issues, dependencies, etc. Since we are focusing on bugs, I present below a quick flow of what happens to a bug from the moment it is found until it gets closed in the bug database, and the role of the triage team:

Bug flow between Tester, Developer and Triage/War team:

  • 1 – Tester opens bug and assigns to Triage

  • 2.1 – Triage assigns the bug back to tester for more information, or resolves the bug, applying the bug bar

  • 2.2 – Triage assigns the bug to developer for investigation

  • 3 – Developer assigns the bug back to Triage, with all necessary information to help Triage, like impact, risk, code reviewer and tester, etc.

  • 4.1 – Triage assigns the bug back to developer for check-in

  • 4.2 – Triage decides to not fix the bug, so assigns to Test for follow-up

  • 5 – Developer checks in the fix and assigns the bug to tester for closure

  • 6 – Tester verifies the bug is fixed, writes automation, updates Test Plan, etc., and then closes the bug


This process comes with a price though. As you can see, it takes time for a bug fix to get into the product, but we consider it an acceptable price, especially at critical moments, like when getting close to public releases or the end of milestones. A similar or identical process is used in many different groups across Microsoft. The combination of the bug bar and the triage process gives all team members confidence that the decisions we are making are the best ones, and are not unilateral or subjective. And in case you want to make the process tighter, you could also enforce some requirements in the overall check-in process, like mandating that all product check-ins have bugs associated with them, and only allowing check-ins that were approved by triage (as in step 4.1, and that’s where the Triage field in the bug database comes in handy). These changes might generate some problems with the development team, especially if they currently have total freedom to make changes to the product. But this brings more control over the product, and also a good side effect: every change will have a bug associated with it, allowing the test team to validate each one and become fully aware of all changes in the product! For a tester, this is closest to the ideal scenario. ☺
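The hand-offs in the flow above, together with the rule that only triage-approved fixes may be checked in, can be sketched as a small state machine. This is a hedged illustration under stated assumptions: the `Bug` class, the action names and `hand_off` are all hypothetical, and resolving at triage (part of step 2.1) and the final close (step 6) are left out to keep it short.

```python
from dataclasses import dataclass, field

# Allowed hand-offs: (current owner, action) -> next owner.
# Step numbers refer to the bug flow described in the text.
TRANSITIONS = {
    ("tester", "submit"): "triage",          # step 1
    ("triage", "need_info"): "tester",       # step 2.1
    ("triage", "investigate"): "developer",  # step 2.2
    ("developer", "report"): "triage",       # step 3
    ("triage", "approve"): "developer",      # step 4.1
    ("triage", "reject"): "tester",          # step 4.2
    ("developer", "check_in"): "tester",     # step 5
}


@dataclass
class Bug:
    title: str
    owner: str = "tester"
    triage_approved: bool = False
    history: list = field(default_factory=list)


def hand_off(bug: Bug, action: str) -> Bug:
    """Move the bug to its next owner, enforcing the triage gate on check-ins."""
    key = (bug.owner, action)
    if key not in TRANSITIONS:
        raise ValueError(f"{action!r} is not allowed while {bug.owner} owns the bug")
    if action == "check_in" and not bug.triage_approved:
        raise PermissionError("check-in requires triage approval (step 4.1)")
    if action == "approve":
        bug.triage_approved = True
    bug.history.append(action)
    bug.owner = TRANSITIONS[key]
    return bug
```

A developer who receives a bug for investigation (step 2.2) cannot check in a fix until the bug has gone back through triage and been approved, which is the tighter check-in process described above.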

Simple bug database analysis and charts

Assuming you have the bug database which stores all the information about all bugs found in the product (like a history of the product so far), what else can you do with it? Is there any way to extract more juice from this fruit? You probably have heard of Pareto Analysis, so there is no excuse to not use this technique with all the data you have stored in your database. And as a Test Manager, you always need to collect metrics and indicators of quality. Below I present some tables and charts that can be easily created by using Excel and the Pivot Tables feature in it, getting raw data directly from the SQL database designed using the fields presented previously (I removed some labels to keep the data confidential).

Name: Active bugs to individuals, with priority

[Chart: active bugs assigned to each developer, broken down by priority (1 to 4). Axes: Developers, Bugs assigned.]

Why: We have this chart on our team web page. It shows who has the most bugs on their plate, and quickly indicates their priority. Simply looking at this chart can trigger some load balancing and other activities.


Name: Weekly resolution of product bugs, with types of resolution

[Chart: Weekly Resolution of bugs - Linear. Series (resolution types): Won't Fix, Postponed, Not Repro, Fixed, External, Duplicate, By Design. Axes: Weeks, Bugs resolved.]

Why: This chart gives you a good idea of how active the dev team has been in dealing with the bugs on their plate. It also shows how the bugs are being resolved. In this example, the majority are being fixed (about 60% of them).

Name: Active bugs per product feature with priority

Count of ID, by Area and Priority:

| Area | 1 | 2 | 3 | 4 | Grand Total |
|---|---|---|---|---|---|
| Customization | 18 | 38 | 20 | 2 | 78 |
| Financial Integration | 16 | 36 | 17 | | 69 |
| First Use | 4 | 9 | 4 | | 17 |
| Offline | 14 | 32 | 28 | 3 | 77 |
| PPC - PDA | 3 | 7 | 11 | 2 | 23 |
| Grand Total | 55 | 122 | 80 | 7 | 264 |

Why: You can see where the active bugs are in the product. It only means that the found bugs are in these areas, but it can be used to identify problematic areas, with many bugs, regressions and potential risk of destabilizing due to intense check-ins. It might be used to pulse check test progress in features, especially if the table is generated weekly.

Name: General bug trends.

[Chart: Bug Trends. Series: Opened, Resolved, Closed, Active, plus a polynomial trend line for Active. Axes: Weeks, Bugs.]

Why: If you need to know which direction the product is going, this graph gives you some good data. You can see the trend in active bugs, and the weekly find/resolution/close rates. This kind of graph is also critical to check the pulse of product and alignment with schedule.

These are some of the graphs I quickly generated by using Excel and pivot tables. Modifying them and creating new ones is also very easy (simple drag drop of fields). But the key point here is that with this kind of information you can make better and more accurate decisions, and be more proactive and make adjustments on the fly.
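For readers without Excel at hand, the same kind of pivot count and trend line can be approximated with a few lines of Python. This is a rough sketch, not what the team used: the sample rows are made up purely for illustration, the function names are hypothetical, and the trend assumes that "active" simply means opened minus resolved, ignoring reactivated bugs.

```python
from collections import Counter

# Made-up rows standing in for a query against the bug database: (area, priority).
active_bugs = [
    ("Offline", 1), ("Offline", 2), ("Offline", 2),
    ("First Use", 1), ("Customization", 3),
]


def pivot_counts(rows):
    """Return {area: {priority: count, 'Grand Total': count}}, like a pivot table."""
    table = {}
    for (area, priority), n in Counter(rows).items():
        row = table.setdefault(area, {})
        row[priority] = n
        row["Grand Total"] = row.get("Grand Total", 0) + n
    return table


def active_trend(opened_per_week, resolved_per_week):
    """Running count of active bugs: cumulative opened minus cumulative resolved."""
    active, trend = 0, []
    for opened, resolved in zip(opened_per_week, resolved_per_week):
        active += opened - resolved
        trend.append(active)
    return trend


for area, row in sorted(pivot_counts(active_bugs).items()):
    print(area, row)
print(active_trend([10, 5, 2], [3, 6, 4]))
```

A falling `active_trend` over several weeks is the same signal the Active series in the trend chart gives: the team is resolving bugs faster than new ones are being opened.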

Bugs and their relation to Exit and Release Criteria

Without getting into many details about Release Criteria, which could require a document for itself, I just wanted to point out some suggestions and ideas of how to apply some criteria based on the number and types of active bugs in the product (note: this is just part of the criteria you could use, only focusing on the bugs matter). One simple thing I was able to get into the product release criteria was that the number of active bugs in the product should not exceed a preset and agreed number. Again, this criterion is defined and blessed by the leadership team, and then communicated to all team members. One more time, we want transparency to the team, and in return expect commitment and engagement from them. Below, I have an example of how we could possibly have bugs and test cases as part of the Exit Criteria of a product through different milestones.

| Criteria/Goals | M2 | M3 | Alpha | Beta | RC1 | RTM |
|---|---|---|---|---|---|---|
| Pri 1 Bugs | 0 active | 0 active | 0 active | 0 active | 0 active | 0 active |
| Pri 2 Bugs | NA | NA | 0 active | 0 active | 0 active | 0 active |
| Pri 1 TC Pass Rate | 100% | 100% | 100% | 100% | 100% | 100% |
| Pri 2 TC Pass Rate | 60% | 80% | 90% | 100% | 100% | 100% |
| Pri 3 TC Pass Rate | 40% | 50% | 70% | 80% | 90% | 90% |
| Pri 4 TC Pass Rate | 40% | 50% | 50% | 60% | 80% | 80% |
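The example criteria above are mechanical enough to check automatically. The sketch below encodes the table values in Python and reports which criteria a milestone fails. The function and variable names are hypothetical, and treating "NA" as "no limit at that milestone" is an assumption.

```python
MILESTONES = ["M2", "M3", "Alpha", "Beta", "RC1", "RTM"]

# Maximum active bugs allowed per priority; None stands for "NA" in the table.
MAX_ACTIVE_BUGS = {
    1: [0, 0, 0, 0, 0, 0],
    2: [None, None, 0, 0, 0, 0],
}

# Minimum test case pass rate (percent) per test case priority.
MIN_PASS_RATE = {
    1: [100, 100, 100, 100, 100, 100],
    2: [60, 80, 90, 100, 100, 100],
    3: [40, 50, 70, 80, 90, 90],
    4: [40, 50, 50, 60, 80, 80],
}


def exit_criteria_failures(milestone, active_bugs, pass_rates):
    """Return a list of human-readable failures; an empty list means 'met'.

    active_bugs maps bug priority -> number of active bugs,
    pass_rates maps test case priority -> pass rate in percent.
    """
    i = MILESTONES.index(milestone)
    failures = []
    for pri, limits in MAX_ACTIVE_BUGS.items():
        count = active_bugs.get(pri, 0)
        if limits[i] is not None and count > limits[i]:
            failures.append(f"Pri {pri} bugs: {count} active, limit {limits[i]}")
    for pri, minimums in MIN_PASS_RATE.items():
        rate = pass_rates.get(pri, 0)
        if rate < minimums[i]:
            failures.append(f"Pri {pri} TC pass rate: {rate}%, need {minimums[i]}%")
    return failures
```

For example, calling `exit_criteria_failures("Beta", {1: 0, 2: 3}, {1: 100, 2: 100, 3: 75, 4: 60})` reports two failures: three active Pri 2 bugs, and a Pri 3 pass rate below the 80% required at Beta.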


Few more ways to get bugs directly from customers

I talked a lot about bugs found during the SDL, but you should be prepared to deal with bugs that customers will find after the product is released. And the more proactive you are, the quicker you will be to respond to them. But how can you become proactive? We are talking about customers outside of your working group, so you need to have ways to communicate with them, or at least hear from them.

Instrumentation (Customer Experience and Microsoft Error Reporting, a.k.a. Watson)

If you have used new Microsoft products, like Windows XP or Office 2003, you probably have seen situations where a problem occurred and a message box showed up asking if you wanted to send this problem to Microsoft, or if you wanted to participate in sending data to make the product better. These are mechanisms developed by Microsoft to help individual teams solve real problems our customers are having. Even though this might be tricky or hard to deploy, since it should not get private or confidential data from customers, and would require technology and servers to store the data, it gives you real critical data about customers’ pain, which then can be used in multiple ways, like quick fixes, future service packs and/or next version improvements. For more data on it, please look for information at the Microsoft web sites.

Customer Support

If your software goes to external customers, it is likely you will have some Customer Support group taking care of your customers. They are the main connection between you and your customers, so you should become friends with the folks working there. Sometimes I see the PSS group (what we call Customer Support here at Microsoft) as an extension of the test team, since we share many values, concerns, tasks and activities. PSS provides us with tables of incidents, repro scenarios, customer contact information, etc. They are a big help to us, filtering issues, keeping customers happy and allowing us to concentrate on solving the problems. So, there is no reason not to have a strong bond with your customer support team.

Newsgroups

Another way to listen to customers is the public newsgroups available on the Internet. Even though there is a lot of traffic and noise in there, you can definitely get very useful information, and sometimes connect directly with customers. I have testers always asking for customer feedback and data, and this is a cheap and easy way for them to gather this kind of information. There are cases where you can’t reproduce bugs, and you might find someone in the newsgroups experiencing it and willing to help you. I always encourage my team members to navigate through them and see what customers are saying about the product: the good, the bad and the ugly.

Conclusion

If you are still reading this document, I hope you had a good time and were able to learn something new and useful. I would be happy if you also believed that some of these ideas are really useful to your product and team too. All of them worked pretty well in my team at Microsoft, and are also being used by other teams. Many of these ideas were introduced across the years by different people and products, and have survived the test of time. If you have other ones, or have suggestions to improve them, feel free to contact me at Anibal.Sousa@microsoft.com or Anibal_Sousa@hotmail.com. I wish all of you Happy Testing and good luck in your future endeavors.