Stephan Schwab

Software development and farm life

Archive for the ‘Software Development’ Category

Estimation creates silos and prevents teams from developing solutions

Allow me to put you in a real life situation. The following is from an iteration planning meeting.

“We have prepared the following 20 stories for this iteration” says the analyst Sarah.

The iteration manager Joe, speaking to the programmers and testers gathered in front of them, adds “We want you to estimate them all so that we can determine what our commitment for this iteration will be”.

Everybody takes a seat and Sarah starts reading the first story. She explains what the story asks for and uses a wireframe to illustrate what the user experience person Hanna has already created.

“Are the scrollbars supposed to show up all the time?” Peter wants to know. Tom, another programmer, adds “we can simply make them hide, if we have enough space”.

“Yes, they should always be there. I have created this layout according to our enterprise design guide” responds Hanna.

“Ok, so we will figure out how to show them all the time” says Peter and Tom mumbles “we shouldn’t show a scrollbar, if there is nothing to scroll …”

“I don’t know how to test for scrollbars” says Cindy who is a tester on the team. “If you are making them a requirement, then I would have to fail our acceptance tests if the scrollbars are not there.”

Sarah then asks the team to show their votes. Some of the team think the story is 3 points while others think it is just a 2. Cindy votes for 5 points as she thinks it will cost some extra time to figure out how to test for presence/absence of scrollbars in a web application.

Sarah has them vote again and as there are still a few 2s amongst the majority of the 3s, she makes the story a 3 point card.

“Ok. So next story” Sarah moves on to the next card. She repeats explaining what the story is about. This time the product owner Lisa explains a few additional details and the team votes again.

I see a few issues with the situation.

Silos within the same team

It appears that analysts, product owner and user experience person have formed a sub team and figured it out all on their own without much involvement of the other team members. That is basically a silo within the team and in consequence the team isn’t a team anymore.

On the other side the group of testers and programmers can be considered another silo. It may even be that they refer to themselves as the technical part of the team which is equally problematic.

There is no collaborative problem solving

The whole approach is very prescriptive. Testers and programmers are not asked to develop a solution to a problem. Instead the solution is being presented to them and they are simply asked for an estimate. If you look closer at what Cindy the tester said, you will also find that apparently the team estimates in time, as she is worried that it may take longer to figure out how to test for scrollbars.

In the situation presented here analysts, product owner and UX person have basically turned themselves into software developers although they have no experience in writing code of any sort. They view the software from the outside. The UX person is mostly concerned with how information is being presented and how the user interacts with it. The product owner may like Hanna’s user interface design but then the software is not the user interface. The UI is an important part but it is just the surface. The analysts probably believe they are doing a good job and are helping the programmers to create a lot of small and well defined stories so that their teammates can focus on writing the code without being too distracted with other things.

The technical side of the team has let the others seize their job

Why are they doing it this way?

It might be that the team is using Scrum or a modified version of it. The iteration manager Joe tells them at the beginning of the estimation session that he needs the estimates to determine the team’s commitment.

In my opinion there are several hidden issues here. The first one is that the team is expected to tell how long the work will take. They are using points, but as you can learn from Cindy’s remark, she is worried about the time needed to research how to test for scrollbars. So somewhere on this team there is a notion of estimating how long the work is going to take.

The second issue is that they all assume everything is totally clear and can now be constructed. Based on that assumption the team has split into those who define what should be built and those who build it. The team has not come to the realization that during the so-called construction the programmers actually discover a lot of additional detail nobody has ever thought about. But they don’t make these discoveries because they probably do not model in code. So it may further be that the programmers on the team are not experienced enough to practice TDD and modeling (think of Domain-Driven Design) well. The programmers like small stories that are easy to implement and just write the code to match 1:1 what the analysts have asked for.

The team is losing out on a lot of opportunities. The programmers can give valuable feedback and find discrepancies when they truly model in code. If something is hard to program or takes a long time to code, then that should be understood as a message. The message means that the model isn’t right or maybe even that there isn’t a model in the first place. Unfortunately less experienced programmers don’t know much about modeling in code so they just work hard and don’t understand said message. So the issue never gets noticed.

Consequences of the described behavior

The prescriptive approach leads to poor quality code. That is a serious consequence, but without outside help (meaning a coach) it is rarely detected until it is already too late. By the time it is detected, the team’s ability to deliver and implement new features has already diminished quite a lot.

With the prescriptive approach people may think that all the analysis is already done. After all there are analysts on the team and they have figured out what needs to be done. But then what are the programmers good for? “Well, they have to code it” you may say. Yes, sure, that’s what programmers do, but good code is code that you can modify easily, that you can bend and twist in many ways without breaking it. And that same good code relies heavily on a good model.

Without a model expressed in code the code may easily become just a bunch of scripts to read and write simple integers and strings from/to the database. Such a code base usually shows a low number of true unit tests simply because there is not a lot of logic to test.

What creates prescriptive behavior

Imagine an organization where teams are expected to have a fully estimated backlog in order to determine the cost for the project. At first glance such an approach seems to be a good idea. The people who will do the work will provide the estimate and thus based on what they say the true cost of the work will be known.

But then is the true cost of the project really known? What about all the discoveries that will be made once a team of smart people starts solving a problem? It seems unlikely that a few analysts will be able to analyze a problem and create a good solution expressed in small story cards for a 6 month project within a few weeks. They may be able to create 500 story cards over a few weeks but I think that will all be based on early assumptions. If the problem could be solved in a few weeks, then there would be no need to pay a whole team for 6 months.

So to me it seems more like an attempt to predict the future, create a plan and then manage to that plan.

The fact that a team is asked to provide a fully estimated backlog – and a detailed one! – creates the prescriptive behavior and discourages the technical team members from developing a solution using the input from analysts, user experience person and product owner. In the end it should be no surprise, if the quality of the resulting “solution” is lower than expected. The team has been prevented from doing a good job.

Quality has been traded for false predictability and it is likely that this happened based on requests from the very same stakeholders who expect a high quality product.

But wasn’t one of the core ideas in Scrum to maintain a fully estimated backlog?

Yes, it is. But there is a big difference between having 500 stories that are very detailed and having maybe 50 ideas written as stories. There is also a difference between estimating how long the coding and testing may take and sizing the complexity of a story.

The cost of the typical project is usually fully known. It is simply the sum of all salaries, facility costs, etc. times the duration of the project in months. There is usually a budget made available too. So the money runs out after the budget has been spent. There is really no need to recalculate that. Good business people know that and the team doesn’t have to explain it to them.

What is much more important than to “calculate” cost is to build something of value. Something that can be used and in the end makes the stakeholders want to come back to the same people for extensions or with new ideas.

I once built a project management and collaboration tool called Caimito One Team. It was based on Scrum and of course there was a backlog. The tool only allows estimating in points, and in some parts of the user interface the words Trivial, Less Complex, Complex, Very Complex and Unknown are shown instead of a numeric value. The idea was to make it clear from the start that nobody should ever think about how long something may take. It is irrelevant. The tool calculates velocity (yesterday’s weather) using the average of the last three iterations, and for that it uses the estimates in complexity. The resulting number is useful to the team, and in the planning screen the tool advises the team to fill up or not to overcommit. The idea is simply to manage expectations without destroying opportunities for good analysis and design.
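The “yesterday’s weather” calculation can be sketched in a few lines. This is only an illustration of the idea and not the code of Caimito One Team: the function names, the wording of the advice and the sample numbers are all made up for this example.

```python
def velocity(completed_points, window=3):
    """Average the points completed in the most recent `window` iterations."""
    recent = completed_points[-window:]
    if not recent:
        return 0.0  # no history yet, so there is nothing to forecast from
    return sum(recent) / len(recent)


def planning_advice(planned_points, completed_points):
    """Compare the planned points for the next iteration with the velocity."""
    v = velocity(completed_points)
    if planned_points > v:
        return "overcommitted"
    if planned_points < v:
        return "room to fill up"
    return "matches velocity"


# Hypothetical history: points completed in each past iteration, oldest first.
history = [18, 21, 24, 27]
print(velocity(history))             # average of the last three iterations
print(planning_advice(30, history))  # advice for a plan of 30 points
```

Note that nothing in this calculation knows about hours or days. The only input is the complexity the team assigned to the stories it actually finished.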

The team that created Caimito One Team has been using it, in the sense of eating one’s own dog food, from very early on. There was never a need for very detailed story cards and “accurate” estimates. New story cards were merely ideas for features. Then, closer to the iteration, these ideas were analyzed further and split into stories that were small enough to be sized as less complex.

Written by Stephan Schwab

June 27, 2011 at 9:27 pm

How to work with ATDD and micro-tests in different testing environments

Based on conversations at a client organization this article is about the options a team has for performing tests using multiple testing environments.

Let’s say there is an organization where they use the following testing environments:

DEV environment

This environment is used for local development and for demos until the product is deemed production ready. The product then enters the IT environment.

IT (integration test) environment

In the IT environment it’s all about testing the integration with external services, meaning everything that is not an integral part of the application being tested. That may be web-services, connections to mainframe services or to some sort of data warehouse, and it even includes databases that are not under the control of the application development team.

The main purpose of the testing in the IT environment is to verify that all the interaction with those external systems is working properly. If no issues have been detected, the product moves to the ST environment.

ST (system test) environment

The ST environment differs from the IT environment in that the data available to the application is different. In the DEV and IT environments the data available may be stale, outdated and inconsistent due to a lot of activities by potentially faulty applications that all use shared systems offering the plethora of external services available in the enterprise.

The ST environment has good data that is regularly taken and updated from production. The purpose of testing in the ST environment is to see whether the application can handle the real data correctly. So the focus changes from integration to data. After no issues have been found the product moves to the final PT environment.

PT (performance test) environment

The application works well with all the external services it talks to and can also handle production data. But will the performance be acceptable? The purpose of testing in the PT environment is to put a huge load onto the application and see how it holds up.

Where to run Cucumber scenarios

We should run our Cucumber scenarios in all environments except the one for performance testing. Cucumber scenarios touch every piece of the application so it is likely that issues that may exist will show up.

Specification by example

When using the specification by example approach we don’t treat the Cucumber scenarios as tests to verify behavior. Instead we use the scenarios to document what the application does and are grateful that we also get a pretty good regression test suite for free (free as in beer that is).

But then we don’t test enough in IT and ST – do we?

It is true that by using specification by example our Cucumber scenarios will not cover each and every permutation of invalid input data, nor each and every permutation of all the different paths through the code.

Cucumber is foremost a tool to specify what the application should do from a business perspective. It is a communication tool and we should use it to speak the same language between stakeholders and team. It helps to bring out the mental image everybody has in their head and put it in a less or non-ambiguous form. By using Cucumber we get tests for free but we don’t use Cucumber because we want to test.

If the only tool you have is a hammer, then everything looks like a nail.

A software development team has many different tools they can use. A good craftsman knows when to use one tool and not another and why. It is experience that lets him make the right choices. The differences between the tools may be very small and they may all seem to be very similar, but to the experienced craftsman the purpose of each tool is very clear and he does not pick one arbitrarily.

In the year 2011 it can be safely said that for all programming languages there is some kind of unit testing framework available. The purpose of these tools is two-fold. When writing new code these frameworks are used to craft code that is well designed and testable. This is achieved by writing the test before the code. Less code will be written, because one only writes enough code to pass the test. The test becomes the first user of the production code.

The second purpose of unit testing frameworks is actual testing. After the production code has been designed following the happy path testers and programmers look for things that could go wrong. There can be tests that send permutations of invalid data through the production code and prompt improvements to handle it. There are even ways to test timing and race conditions. The development community constantly finds new ways to use unit testing frameworks for more and more special cases.

Therefore some have started to no longer talk about unit testing but instead call these tests micro-tests to make clear that we are no longer interested in only testing a single class or other small unit of code.

What should be covered in micro-tests

Micro-tests should be used to do the bulk of the testing that can be automated. Some will just test a building block such as a validator class. Others will test all the code through several layers from the user interface down to the database. Again others will test communication to external systems like web-services or a mainframe integration.
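To make the smallest of these cases concrete, here is a sketch of a micro-test for a validator. The validator itself and its rule (whole numbers from 1 to 999, given as text) are invented for this example. The point is only to show one happy-path test next to a test that pushes a series of invalid inputs through the production code.

```python
import unittest


def is_valid_quantity(text):
    """Hypothetical validator: accept whole numbers from 1 to 999 given as text."""
    if not isinstance(text, str):
        return False
    stripped = text.strip()
    # Only plain ASCII digits, at most three of them, are acceptable.
    if not (stripped.isascii() and stripped.isdigit()) or len(stripped) > 3:
        return False
    return 1 <= int(stripped) <= 999


class QuantityValidatorTest(unittest.TestCase):
    def test_happy_path(self):
        self.assertTrue(is_valid_quantity("12"))

    def test_invalid_inputs_are_rejected(self):
        # Each value is reported separately if it slips through.
        for bad in ["", " ", "0", "1000", "-5", "1.5", "abc", None, 7]:
            with self.subTest(value=bad):
                self.assertFalse(is_valid_quantity(bad))
```

Saved in a file, the tests can be run with `python -m unittest`. Adding another suspicious input later is a one-word change to the list.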

But … How can I test a web-service in a micro-test?

Well… There are two ways.

You could write a micro-test that actually calls the real web-service offered by some test system. That seems to be a straightforward approach and is pretty much in line with what many pure testers may think should be done. But if you do that, then you depend on the external service. If you run all your micro-tests as part of your build, then you will not be able to build the deployable artifact when the external system is down. And it could be down for days. So that’s probably not the best solution.

Instead you define a boundary around your application and document that you are only going to test up to that boundary. The boundary is represented by a model for the web-service. To create the model you capture what the web-service has sent back for a given request and then use these captured responses in your micro-tests instead of calling the actual web-service.
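A minimal sketch of such a boundary might look like the following. Everything in it is an assumption made for illustration: the customer service, the request paths, the JSON payloads and the names `RecordedCustomerService` and `greeting_for` do not come from any real system. In practice the captured responses would be recorded once from the real web-service and stored with the tests.

```python
import json

# Responses captured from the web-service, keyed by request path.
# These payloads are invented; real ones would be recorded from the service.
CAPTURED_RESPONSES = {
    "customer/42": '{"id": 42, "name": "Ada", "status": "active"}',
    "customer/99": '{"error": "not found"}',
}


class RecordedCustomerService:
    """Model of the web-service: replays captured responses at the boundary."""

    def fetch(self, path):
        return json.loads(CAPTURED_RESPONSES[path])


def greeting_for(service, customer_id):
    """Application code under test. It only knows the boundary's fetch()."""
    data = service.fetch(f"customer/{customer_id}")
    if "error" in data:
        return "Unknown customer"
    return f"Hello, {data['name']}"


# Micro-tests exercise the application up to the boundary, without a network.
service = RecordedCustomerService()
assert greeting_for(service, 42) == "Hello, Ada"
assert greeting_for(service, 99) == "Unknown customer"
```

In production the application would receive an object with the same `fetch()` method that talks to the real service. The micro-tests never need that object, so the build does not depend on the external system being up.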

Don’t pretend to have control over things that are not yours

This solves the dependency problem but it leaves you with an uncertainty. From the perspective of your future users your application is broken, if the web-service it relies upon is broken. That’s kind of unfair, as you have no control over that web-service.

But then is testing everything that your application relies upon part of your duty as the developer of said application? I think that if you made this your duty, you would basically be pretending that you can control these other things. Unless you are really in charge, this would be a lie, and it would be unprofessional to let others believe it.

Define and verify the runtime environment for your application

Many makers of physical products define the environment their products are made for. An example: Widget X is to be used in places where the temperature is between 10 and 40 degrees Celsius and the humidity is no more than 85%.

Why should software products be any different? It is totally normal to say “this program runs on Windows version X” and in that case the program will check the version of the operating system when started. If the program relies on a SQL server, then this will usually be checked too.

Why not simply check out the environment a bit further? Once you have a clear definition of the dependencies for your application and have created the boundary I mentioned above, then it should be easy to check that these constraints are met during startup. If they are not, the application would log or display errors and refuse to do anything else. As the developer of an application you cannot do anything, if the runtime environment is not right. But you can and should verify that environment. That is part of the quality of your product.
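Such a startup check could be sketched as follows. The setting names `APP_DATABASE_URL` and `APP_SERVICE_URL` and the minimum Python version are assumptions chosen for the example. A real application would list whatever its own boundary definition says it depends on, and would probably also try to reach those systems.

```python
import sys


def check_python_version(minimum=(3, 8)):
    """Return a problem description, or None if the interpreter is new enough."""
    if sys.version_info < minimum:
        return f"Python {minimum[0]}.{minimum[1]} or newer is required"
    return None


def check_required_settings(env, required):
    """Return a problem description, or None if every setting has a value."""
    missing = [name for name in required if not env.get(name)]
    if missing:
        return "missing settings: " + ", ".join(missing)
    return None


def verify_environment(env):
    """Run every check and return the problems found (empty list means OK)."""
    results = [
        check_python_version(),
        # Hypothetical setting names, used only for this sketch.
        check_required_settings(env, ["APP_DATABASE_URL", "APP_SERVICE_URL"]),
    ]
    return [problem for problem in results if problem is not None]


def start(env):
    """Refuse to start when the runtime environment is not right."""
    problems = verify_environment(env)
    if problems:
        for problem in problems:
            print("environment error:", problem, file=sys.stderr)
        return False
    print("environment OK, starting application")
    return True


# Example with a made-up environment in which one setting is missing.
start({"APP_DATABASE_URL": "db.example.test"})
```

The checks are plain functions that return a description of the problem, so the same list can be reused by a separate environment checker program.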

If it appears to be impractical to perform these checks each and every time the application starts, you can ship an environment checker program along with it. However, I think that if possible, you should always opt for checking the environment at application start. This will lead to fewer support calls of any kind.

“But then I cannot run micro-tests in IT, ST, PT. So my testing is not complete.”

No, you cannot run your micro-tests in these environments. You can run them only at the time your build scripts create the deployable artifact(s).

But then, if your application checks its environment at startup, you do test. The application checks its environment every time. Something is broken? The application will not start and will point out where the issue really is.

That way you can focus on testing the things you control but you still have acted responsibly by making sure that the application’s expectations for the runtime environment have been met. If it still breaks, the ball is in your court.

Written by Stephan Schwab

June 23, 2011 at 10:38 pm

Embrace complexity – Don’t try to architect it

My article about Software Design has sparked some mixed feedback amongst the people who read it early on. Part of the feedback was about what I said about the role of the software architect. I was saying that there is no need for a special software architect. What is needed instead is a team with enough senior developers who can come up with a good design for the software.

Note that I may have changed the article about Software Design since writing this blog post.

I still think that the term architect in software does not really fit well. It seems to lead to a lot of misunderstandings and unnecessary tension in conversations. The software industry should get rid of that term.

Many people seem to think that an architect is responsible for the big picture and that this role is needed, especially in larger enterprises, to make sure that all the different systems and components work well together.

But then maybe that is just a human shortcoming. It appears that we humans are awfully bad at managing and dealing with complexity. We seem to like linear and easy to understand things and approaches to what we do. We also seem to like predictability and are not feeling really comfortable with – well … – complex systems that – even worse – self-organize and adapt.

That may be the driver behind the desire to have someone in charge of the big picture, aka the overall architecture. Even more so in the context of large enterprises.

What other very big systems are out there

Let’s look around. Software systems are not really the biggest systems out there.

What about the electrical grid? There are essentially two big electric grids on this planet. The one that runs on 220/240V which spans from Europe to all places that can be reached over land. And the one that runs on 120V and spans all of the Americas. In both cases the grid spans multiple countries, each with its own independent government, and multiple operating companies, each with their own independent management. They all sell and buy electricity from each other and the whole thing is a network. Has there been an architect or a group of architects that designed this? Many people have certainly contributed over the years, but other than using certain standards on the basis of “we use it because it works” I very much doubt that there is any central oversight over such a vast system. There are certainly attempts to control the structure, the architecture, but there is so much politics involved that it would probably not be right to assume that someone is responsible for the big picture.

What about the European railway system? In Europe the same train can drive from one country to another. That has not always been the case. There used to be many different railway systems with different technical specifications. Because there was a benefit in interconnecting, the different railway organizations got together and agreed upon a standard so that trains can drive on all the tracks. Was there central oversight by an architect? I think it’s more likely that mutual economic interests brought companies together and made them collaborate.

What about the Internet? Before the Internet got so popular there were wide-area computer networks, but they were expensive and access to these networks was difficult. It was common that you could only connect devices that were certified and allowed to be used. All that oversight and control crippled innovation and in the end these earlier networks are now just a fading memory. The Internet is based on the, in comparison, simplistic TCP/IP transport protocol and anyone can create a new application and a new application protocol. No centralized architecture body designed HTTP and nobody controls the main protocol that is used to move the majority of all content on the Internet.

Instead very early in the development of the Internet people came up with the idea of Request for Comments (RFC). RFCs are not standards defined and maintained by some standards body. They are simply memorandums that describe a technical concept/proposed standard and then people are free to pick it up and use it. If enough people like it, there will be many applications using it and because of its usefulness it becomes an industry standard. Common sense makes a network of people with similar interests adhere to such a “standard” described in an RFC. Not because someone forces them to or made it a rule.

Do we still need a Software Architect or even an Enterprise Software Architect?

What has worked well for the Internet with RFCs since 1969 might be a model for large enterprises and companies of all other sizes as well. Instead of being afraid of the inherent complexity of software-based systems, it would be wise to embrace the complexity and let smart software and system designers figure it out amongst themselves. All they probably need is a platform for sharing, similar to what RFCs provide for the very big system that is the Internet.

Good ideas, good protocols, good systems, good libraries, etc. will be picked and become more popular while others will be superseded by smarter solutions. That’s how evolution in nature works as well. There is no architect involved either.

Written by Stephan Schwab

June 7, 2011 at 10:12 pm
