Are these really lame reasons not to hire testers? (part 1)

I came across the following post the other day via a link that a friend posted on LinkedIn. You can see the original post here: LINK

The general gist of the post by bughuntress is to detail 5 lame reasons why companies do not hire competent testers, but while reading through it I could not help feeling that the post was being a bit harsh by calling them lame. Therefore, over the next few blog posts I will look at each of those 5 “lame” reasons and assess whether they are as lame as the original post proclaims. In this post we look at lack of time or budget.

Lack of budget or time.

What bughuntress says:

Usually this excuse leads to making programmers do testers’ work. Indeed, testing by programmers themselves can save you some money. But you should take into consideration that hiring a tester is cheaper than hiring a programmer of the same level, so you will be paying programmers for work that testers can do for a lower salary.

Another obvious thing is that when testing their own code programmers tend to miss some errors which testers wouldn’t. All in all, while developing a complex project it will become clear that testers are more of an investment than a useless spending of money.

So is lack of budget or time a lame excuse for not hiring testers? In my opinion the answer is a very short one. Two letters short… it’s “NO”.

Getting into the specifics of what bughuntress is saying: I agree that when you don’t have dedicated testers, but you do wish to perform some form of testing, there is a good chance that the testing will be done by your developers, or maybe your analysts, BAs or stakeholders. Are testers cheaper than these other people? Probably, but are your testers (should you hire them) always going to be 100% utilised as testers? If they are not, then there will be times when they are burning cost without adding value, unless they bring other skills, in which case you’re still using one of those resources I mentioned a minute ago.

Also, I think the view that a developer testing their own code is more likely to miss an error that a tester would not is a little old fashioned and, in my experience, a bit of a myth. Even if that were the case, just because you don’t have a dedicated test team does not mean that a developer has to “mark his own homework”; peer reviews or any of the other roles I mentioned above could bring a different viewpoint on the product being tested and add value. With these other things, some pretty decent testing can still be achieved without the additional outlay for a dedicated tester, in my opinion.

My Verdict: Not a lame excuse. There are very valid reasons why time or budget restrictions would mean that it was not viable, or indeed sensible, to hire a dedicated tester, and to assume that developers can’t test is both wrong and disrespectful in my opinion.

Testing with Questions

Testing is about questioning the software and using the information from the answers we receive to inform our own decisions and those of our stakeholders. Whether we write all our tests out in scripts before we start execution or we explore the software as we go, everything we do comes down to asking a question that we want answered. I have always felt that when I am testing something I am looking to learn something I did not already know, and keeping the fact that my tests are questions at the forefront of my mind helps me make the most efficient use of my time while testing.

My definition for testing is “a set of questions used as a means of evaluating the capabilities, effectiveness, and adherence of software”.

I find however that when I speak with other testers about this they either look at me like I am going mad or they say something like “oh, never thought of it that way” which always surprises me somewhat. I find it difficult to see how people who spend their days thinking about, documenting, and executing tests do not make the association that each of their tests is asking a question.

I have also seen that where testers do not make this association they fall into some traps. One of the most common traps I see is testers asking the same question over and over again. If a test is not telling you something new then running it is not the best use of your time. When they are repeatedly asking the same question they are not learning anything new about the software, and given that all testing phases are hamstrung by time, this means that potential areas of the system remain a mystery (normally until a production user gets hold of them).

Another thing I notice is testers devising tests without having considered the question they are asking. They lack an understanding of why they want to know the answer that the test is going to give them. When you consider your test a question, it forces you to think about not only what you are going to do, but also why you want to know, and what you are going to do with the information you get from it.

There are 3 main categories of questions that I ask when testing, and they are:

  1. Verifying questions
  2. Investigative questions
  3. Clarifying questions

Verifying questions are those where I am looking to prove or disprove the truth of a known expectation. These types of questions are primarily those that relate to requirements testing: you have a requirement stating what the software must or must not be able to do, and the tests you run look to answer those questions specifically.

Investigative questions are those where I am examining the software in an attempt to learn something hidden or complex. This is what I see as exploratory testing: I don’t have a specific requirement that I am looking to verify, but rather a hunch, or a curiosity about what will happen under a specific condition. An investigative question, I find, is often born out of other questions, such as an unexpected answer to a verifying question that leads me to think of other potential issues or scenarios based on this new information.

Finally, clarifying questions are where I want to challenge an answer that the software has already given me, either because I am suspicious of it or, more likely, because I just want to make sure that I fully understand it. Now, I know I said above that asking the same question over and over again is a waste of a tester’s time, and I stand by that, but that is more in relation to the tester not realising they are asking the same question. A clarifying question is where I specifically know that I am asking the same question (maybe in a slightly different way) to ensure that I understand the information I already have, which is very different.
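To make the three categories a little more concrete, here is a minimal, purely hypothetical sketch in Python. The checkout_total function and the discount rule are invented for illustration, not taken from any real system; it simply shows how each kind of question might look when aimed at a piece of software:

# A purely hypothetical checkout_total() stands in for the software under
# test; the "10% off orders over 100" rule is an invented requirement.
def checkout_total(items, discount_code=None):
    total = sum(items)
    if discount_code == "SAVE10" and total > 100:
        total *= 0.9
    return round(total, 2)

# Verifying question: "Does the total meet the stated requirement?"
# The (invented) requirement: a 10% discount applies to orders over 100.
assert checkout_total([60, 50], discount_code="SAVE10") == 99.0

# Investigative question: "What happens in a case the requirements never
# mention?" Here I probe an order of exactly 100 and simply look at the
# answer rather than asserting against an expectation.
print("Order of exactly 100 with SAVE10:", checkout_total([100], discount_code="SAVE10"))

# Clarifying question: "Do I really understand the 99.0 I got earlier?"
# I knowingly re-ask the same question in a slightly different way, building
# the same order value from different items, to confirm my understanding.
assert checkout_total([110], discount_code="SAVE10") == 99.0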

No really, it’s OK to blame

“It’s not a blame culture that we work within, but at the end of the day it will always be someone’s fault”….

I used to say this phrase a lot, tongue in cheek, when someone started talking about the “no blame culture”; in fact I still do; but I always believed that laying blame was a non-productive action and only served to demoralise. My view, however, seems to have changed on this subject, and although I cannot pinpoint when it changed, it was brought to my attention after reading the following tweet from Jari Laakso:

My immediate response to this was that although I do not agree with giving a tester (or anyone else for that matter) the Guy Fawkes treatment, I did find myself thinking that there may be a valid reason that blame might lie at the door of the tester, or more likely at the door of the test department/team. So I responded with the following tweet:

Off the back of this tweet I got some responses about how blaming a person or team is never the answer. Blaming is bad! I did not agree with this sentiment, yet I felt uncomfortable because it went against what I had thought for such a long time, and I could not articulate, even to myself, why I now felt so different.

I also noticed on Twitter last week that another discussion had broken out about the subject of blame, involving a number of people including, but not limited to, @JariLaakso, @michaelbolton, @kinofrost, and @maaretp. Again, on reading these tweets my initial feeling was that some of these people were getting a little overly sensitive about the “blame” word.

As my feelings were confusing me, I felt compelled to try to understand what I really think about this and why; the following is what I have concluded.

  • In my experience, when people refer to blame they are using the negative and informal definition of the word, which is to blast or damn.
  • If we think of blame in this way then I agree that it is never a desirable approach in any situation, and in my opinion it is unprofessional.
  • The more formal definition of the word blame is to place responsibility for a fault or error, and this seems to me a more constructive definition.
  • Saying that there is a no blame culture is basically saying that no person or team is ever responsible for things that go wrong or mistakes that are made. If I said we work in a no accountability culture, would you think that was a good thing? I don’t.
  • In most of the situations where I have personally seen blame (using the more formal definition) attributed to a person or team, it has appeared to me to be fair and correctly attributed. (In the situations where it has not, it has almost always been because of a snap judgement made without sufficient investigation… also a bad practice in my view.)
  • Assigning responsibility for a fault to a team or person is, in my opinion, perfectly acceptable, and in fact necessary to ensure that lessons can be learned and improvements to process or skills can be achieved.
  • Because people are so afraid to blame, things don’t change; being critical of someone or their team is seen as a negative thing, so problems go unaddressed.
  • Blame does not create an us vs. them situation, as was suggested to me on Twitter. Us vs. them situations are created by a lack of respect for each other, a lack of trust, and in some cases a fear of being “found out” by the other.

I don’t buy into this notion that to lay blame somewhere is a bad thing. I do think that it is unprofessional to berate, ridicule, degrade, or make an example of someone in front of others, but if they were responsible for the fault then I think it is OK to blame them for it, i.e. assign the responsibility to them, once you are convinced that sufficient investigation has been done to identify where the fault originated. Because although I think blaming is perfectly fine, blaming the wrong person or team is not!

Measuring Quality during testing

For years I have been guilty of something; well, a lot of things actually, just ask my wife; but I mean something specific to testing, and that is fuelling a misleading metric. For years now I have been using test pass/fail rate as a way of measuring quality, thinking it was telling me something useful. But over the last few months I have started to realise that this metric is bloody dangerous!

So this is the metric that I have been using:
Quality = Total No. of Passed Tests / Total No. of Tests Executed

The thinking behind this metric is that at any time during the test cycle I am able to report how good the quality of the application under test is looking, assuming that my test pack is focused on testing the aspects that have been agreed by all stakeholders as being a good measure of overall quality (I am not planning on getting into a debate about what “quality” is in this post). My management and stakeholders love it… but that is the problem.

They should hate it, because it misleads them and basically tells them diddly squat about the quality. Yet they love it. I am in many a conversation where people are getting excited by high quality or losing their lunch because of low quality before testing is finished. So much hangs on this metric with them, with testers and developers getting praised or slated based on it, and it is all my fault! I had a hand in bringing this metric into the company, and now I have a responsibility to shut it down before it does more damage.

OK, so let me show why this metric is misleading. Below is a table that shows a test pack of 100 tests. The tests can be run at a rate of 10 tests a day, and therefore it takes 10 days to complete them all. Scenario 1 finds a number of issues early, meaning a lot of failed tests early in the cycle, and scenario 2 has a more even distribution throughout the cycle.

          ----------- Scenario 1 -----------   ----------- Scenario 2 -----------
          Passed   Failed   No Run   Quality   Passed   Failed   No Run   Quality
Day 1        5        5       90       50%        9        1       90       90%
Day 2        7       13       80       35%       16        4       80       80%
Day 3       14       16       70       47%       26        4       70       87%
Day 4       15       25       60       38%       34        6       60       85%
Day 5       25       25       50       50%       43        7       50       86%
Day 6       35       25       40       58%       50       10       40       83%
Day 7       45       25       30       64%       56       14       30       80%
Day 8       55       25       20       69%       60       20       20       75%
Day 9       65       25       10       72%       67       23       10       74%
Day 10      75       25        0       75%       75       25        0       75%

As you can see, if we look at day 5 of the 10-day cycle we have two very different statuses. In scenario 1 the quality is at 50%, and my stakeholders are running around panicking, insisting on twice-daily “intensive care” meetings and coming down on the development leads about the “poor” quality of the application. In scenario 2, however, the quality is at 86% and my stakeholders are relaxed, happy and praising the development leads for a great job… high fives all round!

The crux of all this, though, is that by the end of the cycle on day 10 both scenarios end up with quality at 75%, meaning that in scenario 1 my stakeholders were caused to overreact, burn resources unnecessarily and come down on the developers harder than was needed. Scenario 2, by contrast, left my stakeholders too relaxed and praising everyone, when in fact the application actually ends up in a worse position than they thought.
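As a quick illustration of how the calculation behaves, here is a minimal Python sketch using the cumulative pass/fail counts from the table above. It recalculates the quality figure day by day and shows two very different curves converging on the same 75%:

# Minimal sketch: recompute the in-flight "quality" metric for the two
# scenarios in the table above. Each tuple is cumulative (passed, failed).
scenario_1 = [(5, 5), (7, 13), (14, 16), (15, 25), (25, 25),
              (35, 25), (45, 25), (55, 25), (65, 25), (75, 25)]
scenario_2 = [(9, 1), (16, 4), (26, 4), (34, 6), (43, 7),
              (50, 10), (56, 14), (60, 20), (67, 23), (75, 25)]

def quality(passed, failed):
    # Quality = passed tests / tests executed so far
    return passed / (passed + failed)

for day, ((p1, f1), (p2, f2)) in enumerate(zip(scenario_1, scenario_2), start=1):
    print(f"Day {day:2}:  scenario 1 = {quality(p1, f1):4.0%}   "
          f"scenario 2 = {quality(p2, f2):4.0%}")
# Both curves finish on 75%, but the story they tell along the way is
# completely different.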

What makes it worse is that I can’t even be sure of that final 75% measurement, and here is why: granularity of tests. Let’s assume that our application has 5 functional areas that are to be tested, and because I am a creative kind of guy let’s call them “functional area 1”, “functional area 2”, “functional area 3”, “functional area 4”, and “functional area 5”. Now, using my crystal ball, I am also going to tell you that by the end of the testing cycle functional areas 1, 4 and 5 pass and areas 2 and 3 fail. Now look at the table below, where again two scenarios are displayed: in scenario 1 the tests have been written in a much more granular form and total 1,000, while in scenario 2 the total number of tests is 100:

                    S1 – No. of Tests   S2 – No. of Tests   S1 – % of Tests   S2 – % of Tests   Result
Functional Area 1          200                  25               20.00%            25.00%       Passed
Functional Area 2          120                   5               12.00%             5.00%       Failed
Functional Area 3          220                  15               22.00%            15.00%       Failed
Functional Area 4          210                  30               21.00%            30.00%       Passed
Functional Area 5          250                  25               25.00%            25.00%       Passed

As you can see, in these two scenarios not only are there more tests in scenario 1, but the distribution of tests across the functional areas is different as well, so when we use the quality metric on the last day the two scenarios end up with completely different figures, as shown in the table below.

          ----------- Scenario 1 -----------   ----------- Scenario 2 -----------
          Passed   Failed   No Run   Quality   Passed   Failed   No Run   Quality
Final       660      340       0       66%        80       20       0       80%
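To see how much the headline figure depends purely on how the tests happen to be sliced, here is a minimal Python sketch using the per-area counts from the tables above; the same two functional areas fail in both scenarios, yet the metric lands on 66% in one and 80% in the other:

# Minimal sketch: the same pass/fail outcome per functional area, with two
# different test granularities, gives two different "quality" figures.
# Counts are taken from the table above.
failed_areas = {"Functional Area 2", "Functional Area 3"}

tests_per_area = {
    "Scenario 1": {"Functional Area 1": 200, "Functional Area 2": 120,
                   "Functional Area 3": 220, "Functional Area 4": 210,
                   "Functional Area 5": 250},
    "Scenario 2": {"Functional Area 1": 25, "Functional Area 2": 5,
                   "Functional Area 3": 15, "Functional Area 4": 30,
                   "Functional Area 5": 25},
}

for scenario, counts in tests_per_area.items():
    total = sum(counts.values())
    passed = sum(n for area, n in counts.items() if area not in failed_areas)
    print(f"{scenario}: {passed}/{total} passed -> quality = {passed / total:.0%}")
# Scenario 1: 660/1000 -> 66%,  Scenario 2: 80/100 -> 80%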

So if you are using a metric like this to track the in-flight quality of your application, then beware of its failings. Even if you are aware of them, be aware that your management and stakeholders probably aren’t, and they are likely to be making decisions on these numbers which will most likely be wrong.

I am not even sure there is a way of measuring what the quality of your system will be before you have finished, and even then the result may be subjective, so in my opinion you are better off just reporting the facts of your findings in words. By all means use numbers if you want to support those words, but words should play the lead role. Then let this information be the driver for your stakeholders to make the decisions on quality.

Checking Vs Testing – How this applies to Test Analysis

There have been a few posts by the great and the good of the testing world over the last few days about Testing vs. Checking, which was further broken down into Human Checking and Machine Checking.

Before we go any further I thought I would give a little background about me. I am not one of the great or the good. I am just a career tester making my way in this world. I would like to think, however, that I have gained a bit of valuable experience in the 15 years that I have been in this industry. I take my profession seriously, I am passionate about it, and I have an opinion on most things. Therefore I thought, what the hell, I might as well make my opinions, thoughts and views known on this particular topic for my first (and maybe last) blog post. As this is my first ever post, and I am not the most prolific writer in the world, I apologise in advance if this post is either poorly written or a little bit choppy in its direction.

OK, so back to the point. I want to make it clear that I am in support of the refined definitions of testing vs. checking that have been put forward by James Bach and Michael Bolton in the post Testing and Checking Refined. It makes a lot of sense to me and I have seen both human checkers and testers during my career. However, the one thing that stood out to me when I read this post, and others in response to it from Iain McCowatt (Human and Machine Checking) and Paul Holland (Reply to Human and Machine Checking), was how the focus was on what happens during the execution of a check. There was no discussion or thought given to the process of defining that check in the first place, from what I could tell.

If I have interpreted correctly what these guys are saying, then checking, whether human or machine, comes from a pre-scripted testing approach, as opposed to an exploratory one, and as such this would mean that someone has spent time analysing the system to be tested prior to execution and will have applied some knowledge to that process. Therefore, could it be that although at the time of execution the script is a check, because of the process of defining that check a wider element of testing has been applied, making the check at the time of execution stronger (or weaker, if the tester doing the analysis has not done this task well)? Or, more correctly, making the set of checks stronger?

In my experience the analysis phase of testing is as much of a minefield as execution when it comes to the testing vs. checking debate. Test analysts can fall into a similar categorisation: they either “check” by identifying the requirements and then defining a test to prove or disprove each requirement, or they “test” the analysis done by the designers by analysing the requirements and, using their knowledge, reading between those requirements to identify questions or hypotheses about what has not been explicitly stated in the design or requirements. If they have done the latter then they are adding more value to the set of checks that they produce than if they just apply the former. In my opinion this will make the list of checks to be executed stronger than if they had just “ticked off” the requirements presented to them. This leads me to believe that although checks (in the execution-phase sense) are only part of testing, their strength will be based on the type of test analysis and design that was performed during the generation of those checks.

So in conclusion, I believe that applying the testing vs. checking view at the test analysis phase means that at the time of execution not all checks are equal. You can have a strong list of checks or a weak list of checks, depending on the analysis behind them. If you then have a human executing those checks, then, as has been mentioned, you will add further value.

These are my views that I wanted to put out there. As I stated at the beginning, I am not one of the greats or goods, but I have views. I thought long and hard about putting this first blog out; it can be very daunting for a newbie, putting your thoughts out there to be analysed and pulled apart, but even if no one agrees I hope that people will explain constructively why they feel I am mistaken or talking rubbish so that I can learn more… all I ask is be gentle 🙂