Blog Archives

Using visualisation to discuss problems based on where they surface

Kanban is gradually becoming known as a management (or change management) method, and is probably best known for the characteristic Kanban board.

The main point of the Kanban board in knowledge work is to visualise “invisible” work. What exactly is invisible work, you may ask? In knowledge work, the answer is simple: almost all of it. While work is in progress, it only exists as ideas in our heads or, at best, as lines of code in a computer somewhere. The Kanban board helps us visualise what we are doing, and how far along we’ve come.

That’s only sort of the topic for this post, though. More specifically, it’s about how we can use the board to discuss problems based on where they surface. Every process has what we call a “workflow”: a set of steps or stages required to complete the process. Each stage is named after what we are doing or attempting to find out at that point in the process. In a software development process, it could look something like this:

  1. Todo (work ready to be started)
  2. Analyzing (trying to figure out what to solve and how)
  3. Developing (creating a solution)
  4. Reviewing (verifying that the solution is free of technical bugs or weak architecture)
  5. Acceptance testing (verifying that our solution fulfills the need of the customer)
  6. Production (work completed)

Say we have some piece of work, perhaps a new feature to develop. If a problem should occur with this feature, it is interesting to know how far along we got in the above flow when the problem was identified. It’s probably quite obvious that the more stages we’ve been through, the more work has been done. So the more stages we’ve been through before we identify that something is wrong with the result, the more work has been wasted, and the more serious we should be about figuring out why it happened. With me so far?

Using the above process as an example, finding a problem in the final stage is the least desirable. When the software has been deployed to production we’ve been through a lot of work, and it has also been made available to the customer. So any problem that is present has been exposed to the public. Depending on the nature of the problem, finding it this late in the process may also limit our ability to figure out what happened. If the problem is a technical bug, it’s easy to blame the closest stage related to technical stuff. In this case it’s stage 4, called “Reviewing” above. That stage was responsible for figuring out that the solution was technically sound. Let’s go yell at the guy who did the reviewing. Problem solved! Maybe.

Let’s say that we did find the problem in stage 4, so the reviewing actually worked as intended. The reviewer successfully identified a technical issue. Now what? Obviously someone in the “Developing” stage was sleeping on the job, right? Well, it could also be that the problem is related to insufficient analysis. The development done to implement the new feature may have been fine, but due to limited investigation of the rest of the system, a problem occurred elsewhere in the system. Software development is complicated stuff.

But what if the problem we found in production wasn’t technical in nature? What if the feature we released worked as designed, but it didn’t really solve the customer need? Then what? Obviously someone did a poor job during acceptance testing, but why was it wrong in the first place? The developer could have misunderstood the customer, thus solving the wrong thing. The reason could even be outside the entire process. The description of what to solve could have been wrong or incomplete all the way back in stage 1, when it was delivered to the development team. Note that even though the cause isn’t found within our immediate responsibility, we should actively assist those who can do anything about it.

Another interesting thing that may surface during such an analysis is that the visualisation of the workflow is inaccurate. Suddenly you find that the problem happened in a stage somewhere in the middle, a stage that isn’t even on the board.

So what’s the point of all this? A useful map for a blame game? As you may have guessed, it’s not an exact science. It is however an interesting way to discuss what went wrong and why. And no, the point isn’t to find someone to blame, but to figure out how to get better. Any individual working in a process like the one above is likely to have good intentions. If something goes wrong while they’re “responsible” for a piece of work, the causes are usually a combination of many different things. A common mistake when looking for root causes is to look for the one root cause. Look for as many causes (plural) as you can, and collaborate as a team to figure out what can be solved and improved both in the short and long run.

Happy hunting!

@TSigberg

(also published at www.revio.no)

The wrong question

(This is part #3 in my mini series of blog posts about estimates. Previous posts: part #1, part #2)

It’s time to let you in on The Secret. Don’t tell anyone, but “How long does it take” is the wrong question.

I planned for this post to be about statistics and how to calculate probabilities. Instead it turned into a rant, of sorts. You see, there’s another Secret buried here: If you really take the time to learn how to estimate risk correctly, this will help you learn the following fact about the outcome of your project: It is very uncertain. I guess that’s helpful. Sort of.

“That’s just baloney,” I can hear some of you say. “My projects are consistently on time and on budget,” you continue. Maybe so. Let’s examine the famous “project triangle” for a moment.

[Image: the project triangle]

Ever wondered why it isn’t a square? Because quality is a result of the other three, some might say. “That’s just baloney,” you can hear me say. I think it’s because cost, scope and time are easy to measure and easy to adjust. Quality? We always deliver perfect quality! We are professionals! Let’s just put that in the middle, and hope no one pays too much attention.

You, the guy who screamed “We always deliver on time and on budget!” a moment ago. How about scope? “Hah! We delivered all the functionality as well! All the developers bitched about them having too much to do in too little time, but my project plan showed them wrong!”

That’s impressive. But even when we go to the painstaking length of figuring out the risk and uncertainty of a project PROPERLY – You know, with Statistics and stuff, not just, God forbid, guessing or anything – it still comes out as being very uncertain. So how come you can consistently deliver your projects on time, scope and budget? Your team don’t have much choice, do they? They have to reduce quality. Luckily for you, no one will notice until the project is over. And they couldn’t really measure it if they did. What IS quality anyway? Sounds like a made up word to me.

 

So what did I mean by “how long does it take” being the wrong question?

There are a couple of problems. First of all, what is this “it” that we are doing? When you start the project, odds are neither you nor the customer knows what the end result is supposed to look like. You may think that you do, but you don’t.

If it is a waterfall project, the customer will say “This is what I want” while dropping the dreaded Complete Requirement Specification on your desk. Then you go away for three months with a team of developers to create whatever is in the specification. “Here is your product!” you exclaim with great enthusiasm when you come back. After the customer has tested it, she goes “Nope, that’s not it.”

If it’s an agile project, your team of developers only go away for maybe a couple of weeks at a time. Each time you come back and show a little bit of the product, the customer tests it and goes “Nope, that’s not it either. And where’s the rest?”

However you go about it, both you and the customer learn a lot DURING the project. Mostly what they don’t want. Needless to say, this process takes quite a bit longer than just doing stuff once. No one knows what done looks like until, well, you’re done. And please remember, we’re not BUILDING a product, we’re CREATING it. The first of its kind. Your developers aren’t packing meat in boxes along a conveyor belt, they’re freaking SCIENTISTS! So stop pretending you are a production facility. The product you’re creating has never been built before. If it has, please go and buy that instead.

And that’s not even half of it. What did you say we were doing again? A project, right? Well, most of the time we’re not really creating projects, we’re creating PRODUCTS. The funny thing about products is that they don’t end when the project ends. That’s kind of when they start. In terms of cost, that means roughly 90% of the total cost of a product comes AFTER the initial release. Maintenance, bug fixing, additional development, paying back technical debt, people spending most of their day swearing over how horrible it is, and so on.

 

Here’s the Third Secret: The less time you spend on quality during the “project” phase, the higher the total cost of the product.

So whenever you pat yourself on the back for delivering “on time” (usually a random point in time decided by anything except the actual amount of work that needs to be done before that date), that probably means you have reduced the overall quality of the project, and increased the total cost of the product you delivered. That doesn’t really sound like back patting performance, if you ask me.

Did you think this series of posts would culminate in How To Estimate Accurately In 3 Easy Steps? Well, it sort of didn’t. If you’re still looking for that guide, you should probably read “The Flaw of Averages: Why We Underestimate Risk in the Face of Uncertainty” by Sam L. Savage.

I’m not sure it will make your estimates more accurate, but at least you’ll understand why.

 @TSigberg

 

 

 

We can do anything!

I recently had a heated argument with a couple of our developers. They were creating a new module in one of our systems, and were building the UI based on a screenshot from a designer.

Developer: “Here is what was auto-deployed to the development server last night. As you can see it is a somewhat working prototype.”

Me (After picking on several aspects of the design): “..so I guess we’re pretty far away from anything I can show the customer.”

Developer (groans): “As you can see, it’s perfect and doesn’t need any tweaks..”

Me: “Well, were you looking for feedback, or did you just want me to compliment you on your CSS skills?”

Developer: “I was after feedback, but I am allowed to be grumpy when I get it. So, if we change the stuff you mentioned, can we move on to the admin part after that?”

Me: “If we present it like this, we’ll be thrown under the bus. I need it to be as close to the design screenshot as possible. Now that I’m looking at it again, I’m actually worried even that won’t be good enough.”

Developer: “Seriously?! Well, God. I need a timeout. Give me two minutes to grab another coffee..”

After a lengthy discussion while looking at some state of the art, jaw dropping design templates online, we all realize that we haven’t been aiming high enough with the new design. I’ve been giving them a hard time for maybe thirty minutes, even though I know their job isn’t easy. We’re hard pressed for time, and I know I’ve said the functionality is more important. But I also know that the current design simply won’t cut it. I try to calm things down.

Me: “I know I’m being harsh here, but I needed to get the message across.”

Developer: “So, if we don’t even think the design from the designer is good enough, why do we spend time implementing it? I don’t even want to do it anymore after seeing how much better it can be done.”

Me (pointing at one of the crazy designs we checked out online): “I hear you, but can we even do anything remotely like that?”

Developer (looking straight at me): “We can do anything!”

 

“Most people fail in life not because they aim too high and miss,

but because they aim too low and hit.”

-Les Brown

@TSigberg

Exactly HOW likely is the most likely outcome?

(This is part #2 in my mini series of blog posts about estimates. Part #1 can be found here)

With the help of detailed analysis, an experienced developer may figure out the most likely outcome of a project. But exactly HOW likely is the most likely outcome?

In the previous post we were forced to guess the duration of a project, and ended up guessing the project would take 300 hours. As a side note, the result after detailed analysis will probably be quite close to the same number. This is known as anchoring. In order to keep each post relatively short, we will save that for another post.

Let’s assume you’re given time to do a detailed analysis of the proposed project, consider the risk and are provided with a “complete specification”. You spend maybe half a day thinking about it, jot down some detailed estimates, and add them all together. The total sum is 290 hours, and you figure that to be the most likely outcome of the project. Your assumption may be exactly wrong for any number of reasons, but for the sake of argument, let’s assume you’re actually correct.

You have now figured out that the project will most likely take around 290 hours. But exactly HOW likely is it that the estimate matches the actual end result? 60%? 80%? You may be surprised to learn that the actual number is probably quite a bit lower than that.

Image

The illustration above represents the actual outcome of 42 Norwegian projects from several different companies. (Magne Jørgensen / scienta.no)

Let’s say you have historical data from several past projects, telling you both your estimated, most likely outcome, as well as the actual outcome. If you put them in a graph with the x-axis representing the actual effort in percent of the estimated effort, and the y-axis representing the percentage of the total number of projects, it would probably look something like the graph above. Confused yet? Let me explain that again.

The highest bar in the above graph is the one marked “100” on the x-axis, and the value of that bar is around 33. That means that roughly 33% of the projects were completed on the estimated time. Since it’s the highest bar, being on time appears to be the most likely outcome. The next bar to the right, marked “125” on the x-axis, indicates that a little over 15% of the projects actually spent 125% of the estimated time, and so on.

So why do you care? Because it sheds light on what “most likely” really means. And it may not be what you think. Actually, our most likely outcome (hitting the estimate spot on) is NOT very likely to happen. Yes, it surely is more likely than any other individual result, but it is LESS likely to happen than all the other results combined. We actually have a whopping 67% likelihood of NOT hitting our estimate, even though (due to some unknown miracle) our estimate is the most likely result of all possible results.

To make matters worse, the combined likelihood of the bars to the LEFT of the 100% bar (representing actual outcomes that were lower than our estimate) is only about 7%. So we have a 33% likelihood of hitting our estimate, and a 7% likelihood of being below it.

If you provide your boss with your “most likely” estimate, you will in this example leave him with a 60% chance of blowing the customer’s budget.
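The arithmetic behind those numbers can be sketched in a few lines of Python. The two input percentages are the rounded values quoted above, read off the graph by eye, and the variable names are made up for this illustration.

```python
# Approximate shares quoted in the text (percent of the 42 projects).
# These are rounded readings of the graph, not exact data.
below_estimate = 7   # actual effort ended up lower than the estimate
on_estimate = 33     # actual effort matched the estimate

# Everything that is left must have ended up above the estimate.
above_estimate = 100 - below_estimate - on_estimate

# Chance of missing the estimate in either direction.
not_on_estimate = 100 - on_estimate

print(f"Missed the estimate (either way): {not_on_estimate}%")
print(f"Went over the estimate:           {above_estimate}%")
```

So the "most likely" outcome and the "probably over budget" outcome are both true at the same time, which is the whole point.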

I’ll leave you with some time to think that through. Part #3 of this mini series can be found here.

@TSigberg

What does that mean, between 200 and 400 hours? Just give me a number!

I know a whole bunch of developers in different companies and different businesses. What they all love most about their jobs, is when they are asked to produce an estimate. It’s the highlight of their week! 

You’ve had a creative discussion with your boss and perhaps a couple of representatives from the customer. It’s all fun and games, until your boss turns to you and asks The Question.

“So, how long does it take?”

Did he just do that? In front of the customer, no less? How on earth are you supposed to answer that reliably without any time to analyse the details? You venture a somewhat vague answer, even though you know it’s no use.

“Hm, perhaps somewhere between 200 and 400 hours?”

“What does that mean, between 200 and 400 hours? Just give me a number!”

“Oh, uhm, I guess around 300 then?” you reply desperately, automatically reaching for the average of the first two numbers you threw out there. Surely that can’t be too far off?

Any one number representing a possible future outcome of something will never be anything except just that: one possible outcome. That means somewhere out there, you’ve got a whole bunch of other possible outcomes as well. So if you say it might take 200 hours, without the backing of additional data, the chance of it taking more or less is about 50% either way.

So you’re giving your boss a definite number based on your guess on the outcome of something, without any information about other just as likely outcomes. Does that sound like a useful number to you? I didn’t think so.

Estimates are usually needed to figure out whether to invest in something, and to set a budget. If you give me a number where there’s a 50% chance of me blowing my budget, then I’d say you’re not really helping me over here. It’s heads or tails whether I’m in trouble with the customer for overspending. So when I ask for an estimate, it’s implied that I need a number that isn’t very likely to be too low.

So if your boss (or your customer) insists on getting that one number, what to do? First we need to understand more about the problem of “just give me a number”. More on that in part #2 of this mini-series about estimating!

@TSigberg

Read part #2: “Exactly HOW likely is the most likely outcome?” 

I’m in the business of a little bit better

So, what’s it all about, this software business? Making money? Isn’t any business? Figure out how to bleed the customer of as much money as humanly possible, while doing as little as you can get away with. It’s a bit of an art, really.

It doesn’t matter if you’re buying software services for your company, or a carpenter to remodel your house. They strip you naked and hang you out to dry. That’s just the way modern business works, I guess.

Or is it?

At Revio we have a set of core values. One of them is Proud. By the end of the day, we need to be able to stand up straight, and be proud. Proud of who we are. Proud of what we accomplished.

Proud of how we have treated others, and proud of what we have delivered to you.

As much as we would like to be flawless, we are not. There are times when we look at the result of a project, a system update or some other deliverable, and must admit that it falls short. We ask ourselves if we can be proud of that delivery, and the answer is No, we cannot.

I hold both myself and the rest of the team to high standards, so when that happens I feel really, really bad. Then our COO looks at me and says:

“Meanwhile in Africa..”

What he means is that sure, our server is down, our customer is furious, and it sucks. But while the customer may very well be furious, he’s not dead. He is not being killed in front of his wife and children in Libya, and luckily – neither are we.

When we’ve reminded ourselves that no matter how bad we screw up, most of the planet is still doing quite a bit worse than we are, it’s time to get back to work. Whatever was wrong must be put right, and whatever the customer is expecting, we must try to achieve.

Our way of conducting business may not be saving lives. However, by being honest, dependable and proud of what we do – we hope we are able to make yours at least a little bit better.

@TSigberg

The middle manager. The useless fat of any bloated organization

The middle manager. The useless fat of any bloated organization. They delegate all their work to other people, and then they wander around aimlessly, doing nothing at all except worry about what happens if the delegated work doesn’t get done in time.

Being useless fat can be both rewarding and fun, but most of the time it is a difficult job. Even so, it must be done. After all, stuff doesn’t get delegated by itself..

It’s an interesting and not entirely new question: what to do with middle management in a company that is agile, stuffed with self organizing teams that in turn are made up of super-cool and smart people that don’t really care very much for authority figures who try to tell them what to do? Do we really need them around?

In an “agile organization” (whatever that is), stuff still needs delegating. But perhaps you are delegating goals and projects, instead of tasks. Less time is spent reporting against imaginary project plans, more time is spent actually creating value for the customer. Sharing information, explaining goals. Discussing possible solutions. Developing said solutions, showing them to the customer early and often.

As an agile or lean manager, I’m not really the boss of someone else. I am simply responsible for other parts of the value delivering process. I am also responsible for optimizing the process itself. I also try to inspire everyone else to optimize their part of the process, whenever that is possible without hurting the process as a whole.

My job is to protect tech people from nosy sales people, help sales people understand difficult tech people, explain to owners how a little money now isn’t always better than a lot of money later, and that more quality actually equals less cost in the long run. If all of the above goes well, the result is surprisingly often that we deliver valuable and useful software to customers who don’t always know what they need, but always know where it hurts.

Basically it is understanding, sharing and aligning goals and mindsets. And that is a really fancy way of saying that you need to talk a lot. I don’t really like talking to people very much, so I guess that explains why I spend half the time worrying instead.

It may not be very effective, but at least worried people look busy. And looking busy is important when you are useless fat.

@TSigberg

SQL Index tuning 101 – a practical approach to indexing

Hey, what is this all about?

This is mainly a blog about management and leadership. But as my boss pointed out, I am a Chief Technical Officer. So for a change, here is a post with a technical focus. Rest assured that they will be few and far between. But for now, if you are a Pointy Haired Boss, please move along.

As it happens, I’ve been working as a database admin / architect a while back, and indexing is an interesting subject that is often ignored – and seldom explained. So with this post I’ll try to do something about that! Kudos if you make it all the way through. 🙂

Introduction / Prerequisites

It is assumed that you have basic database knowledge: You know what a database is, you know what a table is, and you know how to perform operations against that table (preferably using T-SQL).

It is assumed that you already know how to create, modify and delete indexes using T-SQL or SQL Server Management Studio.

This guide gives a basic introduction to indexes, but does not attempt to explain in great detail how things work or why. It is focused on practical, experience based suggestions on how to perform basic indexing of a database. Even at this brief level, understanding how this works will get a little complicated if you are not familiar with the concepts. This is why a surprisingly large percentage of developers know very little about this topic (shame on them).

If you insist on not understanding how this works, you can cheat and skip the difficult bits. I have marked the somewhat more complex parts with “(red star)” – that means you are allowed to skip them if you want the quick version of this guide.

What is an index, anyway?

The explanation a human can understand

A simple analogy is to think of a database table as a book. This particular book contains one long list (a table is basically a list) that spans across all the pages in the book, and the list has several columns. An index on a database table serves the same purpose as the index in a book (but is built in a very clever way), and it is usually only related to one specific column in the list that our book contains. So if you want to find something in a specific column, the index will tell you (or the database engine) on which page or pages you can find it.

Say you have a column called “LastName”, and you search for “Andersen”. If you have an index on the column “LastName”, the database engine can ask the index for all the pages that contain “Andersen” in the “LastName” column. The index will conduct a very effective search and reply something like “2, 5, 231 and 299”. The DB engine would then load pages 2, 5, 231 and 299, scan through these, and return only the rows in the list where “Andersen” is present in the “LastName” column.

What if you don’t have an index? Then the database engine would have to scan through every single page in the entire book (table), looking at every single line, checking whether “Andersen” is in the “LastName” column or not. Needless to say, this takes quite some time relative to the index approach.
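To make the book analogy concrete, here is a small Python sketch. It is only a toy model of the idea, not how SQL Server works internally, and all names and data (pages, full_scan, index_lookup, the rows themselves) are made up for the illustration.

```python
# A toy "book": page number -> rows stored on that page. All data is made up.
pages = {
    1: [{"ID": 1, "LastName": "Hansen"}, {"ID": 2, "LastName": "Olsen"}],
    2: [{"ID": 3, "LastName": "Andersen"}, {"ID": 4, "LastName": "Berg"}],
    3: [{"ID": 5, "LastName": "Lie"}, {"ID": 6, "LastName": "Dahl"}],
    4: [{"ID": 7, "LastName": "Andersen"}, {"ID": 8, "LastName": "Moe"}],
}

def full_scan(last_name):
    """No index: read every page and check every row."""
    pages_read = 0
    hits = []
    for rows in pages.values():
        pages_read += 1
        hits.extend(row for row in rows if row["LastName"] == last_name)
    return hits, pages_read

# The "index": LastName -> the page numbers where that name occurs.
index = {}
for page_no, rows in pages.items():
    for row in rows:
        index.setdefault(row["LastName"], set()).add(page_no)

def index_lookup(last_name):
    """With an index: only read the pages the index points to."""
    pages_read = 0
    hits = []
    for page_no in sorted(index.get(last_name, ())):
        pages_read += 1
        hits.extend(row for row in pages[page_no] if row["LastName"] == last_name)
    return hits, pages_read

print(full_scan("Andersen")[1], "pages read without an index")
print(index_lookup("Andersen")[1], "pages read with an index")
```

Both functions return the same rows; the only difference is how many pages had to be read to find them. With four pages that hardly matters, with four million it does.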

 

indexing2

A simplified illustration: The index you create on a column contains the actual data from the column you index, as well as a reference to the page where the entire data row can be found. 

The technical explanation (red star)

The index is not actually structured in a plain table like in the illustration above. An index on a SQL Server table is a copy of one or more columns of the table, but it is sorted / structured in a specific way: it is arranged in a B-tree. As a result, searching an index is very fast. Knowing exactly what a B-tree is isn’t required to complete this guide.
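Why does keeping the copy sorted make searching fast? The Python sketch below uses a plain sorted list with binary search as a stand-in. That is an assumption made for brevity: it is not a real B-tree, but it shows the same idea of never having to look at every entry. The data and the function name pages_for are hypothetical.

```python
import bisect

# Index entries kept sorted by the indexed value: (LastName, page number).
# A sorted list stands in for the B-tree here; the data is made up.
entries = sorted([
    ("Hansen", 1), ("Olsen", 1), ("Andersen", 2), ("Berg", 2),
    ("Lie", 3), ("Dahl", 3), ("Andersen", 4), ("Moe", 4),
])
keys = [name for name, _ in entries]

def pages_for(last_name):
    """Binary search for the first and last matching entry."""
    start = bisect.bisect_left(keys, last_name)
    end = bisect.bisect_right(keys, last_name)
    return [page for _, page in entries[start:end]]

print(pages_for("Andersen"))
```

Because the entries are sorted, each step of the search throws away half of what is left, so even a huge index only needs a handful of comparisons.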

What about clustered indexes, what is that? (red star)

You may only have one (1) clustered index per table. A clustered index is the column by which the actual data rows of the table are sorted. Let’s say you have a column “LastName”, and decide to add a clustered index on this column. Then this will not be a copy of the column (as would be the case with a regular non-clustered index), but the actual column in the table. As a result of the creation of the clustered index, the rows of the actual table will be reordered and sorted based on the column you selected (“LastName”). In the illustration above, LastName is obviously not the clustered index, as the data in the table is not sorted by that column. Judging only by the data we see in the illustration above, any of the ID, Created and AddressID columns could be the clustered index – as they are all sorted. By default, SQL Server selects the primary key as the clustered index. This is often not such a good idea, especially if the primary key is a randomly generated ID like a GUID. It may also often be the case that the primary key is just an internal ID, not actually used in queries by the system.

The best candidate for clustered indexes is a column that you often include in a filter when you are expecting a ranged result (more than one). Columns containing row creation date are often good candidates in data tables (containing records of some sort like orders or transactions), as you would often ask to return all records for the last hour, day or perhaps even month for reporting purposes. If the table is actually sorted according to creation date, such a filter would be very effective.

If you never (or rarely) perform ranged searches (a user table could be an example, unless you often filter by a linked column like customerID), the column most often used for single selects (like the ID) will be the best choice.

The advantage of the clustered index is that it IS the table. So when you have found a match in a clustered index, you also have immediate access to the entire data row. In a regular index, you only find a reference to the page that contains the data row, and you will also need to fetch that.

So basically, indexes are great! I should just index everything then, in order to get maximum speed?

I’m glad you asked. That reminds me that we need to talk about something else before discussing how to index:

I know my table is slow, but I don’t understand why (red star)

What makes SQL operations slow?

The more rows in your table, the slower all operations will get. Indexes (applied correctly) will speed up read operations. This is a good thing. Indexes also make every other operation slower (insert, update and delete). That’s not so good. So why does that happen? Remember I said that an index was basically a copy of the column you index (see illustration above)? That means every time you add an index, you actually increase the size of the table by the size of the column you are indexing. This increases the disk storage required to store your database. Storage is quite cheap, but you also introduce another issue: You increase the number of places that have to be modified when you do an update, insert or delete. Say you add an index on the column “LastName” in the table above. When you do an insert, SQL Server not only has to populate the data into the actual table, it will also need to update the index. It may even need to reorganize the index, as the content of LastName in the new row you just inserted probably fits somewhere in the middle of the existing index. Needless to say, this makes the insert operation slower than it would have been without the index.
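The extra work on insert can be sketched like this in Python. It is a toy model under the assumption that one index means one extra structure to write to; the names table, lastname_index and insert are made up, and a sorted list again stands in for the real index structure.

```python
import bisect

table = []           # rows in insertion order (no clustered index)
lastname_index = []  # sorted (LastName, row position) pairs

def insert(row, maintain_index=True):
    """Insert a row and return how many structures had to be written."""
    writes = 1
    table.append(row)
    if maintain_index:
        # The new entry usually lands somewhere in the middle of the index,
        # so everything after it has to make room.
        bisect.insort(lastname_index, (row["LastName"], len(table) - 1))
        writes += 1
    return writes

for name in ["Olsen", "Andersen", "Hansen"]:
    insert({"LastName": name})

print([name for name, _ in lastname_index])
```

The table itself just grows at the end, while the index has to be kept in order on every single insert. Add five indexes, and every insert does that five times.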

Why the size of your datatypes matters

All this talk about size reminds me of a related issue: The size of your table row actually slows down read operations as well! Why? Because every page in our book (table) can only hold a set amount of data (8192 bytes for the geeks). That means that as we increase the number of columns (or the size of each column) in our table, we decrease the number of rows we can fit in each page. That means at least ranged selects (selecting more than one row) will take longer, as they need to retrieve a higher number of pages (blocks of 8192 bytes) to get all the rows you want. This translates into more data reads, which takes longer. So don’t use an int (4 bytes) when all you need is a bit (1 byte) or tinyint (1 byte). Also always use VARCHAR (variable size) instead of CHAR (fixed size), and don’t even get me started on GUID (16 bytes). Lastly, don’t add columns you don’t strictly need.
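A rough back-of-the-envelope version of that argument in Python. It uses the 8192-byte page size mentioned above and deliberately ignores the page header and per-row overhead, so real numbers will be somewhat lower; the row sizes and row count are made-up examples.

```python
PAGE_SIZE = 8192  # bytes per page, the figure used in the text

def rows_per_page(row_size_bytes):
    """How many whole rows fit in one page (overhead ignored)."""
    return PAGE_SIZE // row_size_bytes

def pages_needed(row_count, row_size_bytes):
    """How many pages must be read to fetch row_count rows."""
    per_page = rows_per_page(row_size_bytes)
    return -(-row_count // per_page)  # ceiling division

for row_size in (100, 200):
    print(row_size, "byte rows:",
          rows_per_page(row_size), "rows per page,",
          pages_needed(10_000, row_size), "pages for 10,000 rows")
```

Doubling the row size roughly doubles the number of pages a ranged select has to read, which is why trimming datatypes is not just about saving disk space.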

Okay, stop talking! Just tell me how to fix my slow tables!

Let’s start with how you DON’T fix it

  1. Don’t index columns that are never (or rarely) included in where clauses of the queries performed by your system.
    1. In a few specific cases you may also want to index columns that are rarely used in a where clause, say in the query for a monthly report that would take hours if you didn’t add the index
  2. Only index columns with high variability in the data content. That means you:
    1. Do not index bit columns
    2. Do not index columns containing things like a status (typically a small range of different numbers).
    3. Do not index columns containing stuff like gender (which you should have put in a bit column in the first place, so I didn’t have to put this in a separate rule!).
  3. Do not index columns that are only included in where clauses IN COMBINATION with other column(s) that you have already indexed, AND the filter on the other column(s) already narrows down the result significantly. I know this is a long one, so I will include a reverse version in the “how-to” below.
  4. Do not index very small tables (say, fewer than 500 rows). They are either used so rarely that it doesn’t matter, or they are used so often that the entire table will always be in memory (RAM), and it will be superfast anyway. A full scan of the table will in practice be just as fast as an indexed search, so even if you add an index, SQL Server may not use it. Also, most small tables contain near-static data, and should probably be cached in the application.
  5. Some people (and some automatic indexing tools) will tell you that something called covering indexes is a good idea. I generally start a tuning session by locating any covering indexes, making a note of the columns they contain, and then deleting them. Covering indexes are used wrong 90% of the time, and only effective in specific cases (not covered by this guide). Just trust me on this one. Forget about covering indexes – cases where they make a real difference are incredibly rare. Thank you. I will tell you what to do if you find one in an existing database below.
  6. Do not put a clustered index on a GUID column. It will seldom be the optimal choice.
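
One way to put a number on the “high variability” in rule 2 is the fraction of distinct values in a column. The sketch below is only an illustration: the function name and the sample data are made up, and in a real database you would compute the same ratio with a query against the table.

```python
def selectivity(values):
    """Fraction of distinct values in a column.

    Close to 1.0 means nearly every row is unique (a good index candidate);
    close to 0.0 means a handful of values repeated over and over.
    """
    values = list(values)
    if not values:
        return 0.0
    return len(set(values)) / len(values)


status_column = [1, 2, 3, 1, 2, 3, 1, 2, 3, 1]              # a few status codes
email_column = [f"user{i}@example.com" for i in range(10)]  # unique per row

print(selectivity(status_column))  # 0.3
print(selectivity(email_column))   # 1.0
```

On a realistically sized table the status column would land very close to zero, which is why an index on it rarely helps.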

Basic How-to for indexing and / or tuning any database

  1. Start by figuring out what to use as the clustered index (see above for more information about this). The entire table will need to be restructured (this will take time and will lock the table) if you decide to change this later. In SQL Azure it’s not even possible.
  2. Index any ID column that is (often) used in where clauses of the queries performed by your system.
  3. Index any foreign key column that is (often) used in the where clauses or joins of the queries performed by your system.
  4. Index any data column that is often used in where clauses of the queries performed by your system, typically in the context of users manually searching for data.
  5. When several columns are combined in the same where clause, you often only need to index the columns that narrow down the search the most (reverse of rule 3 under “Don’ts” above).
    1. Example: The system only allows you to search for users based on age if you also include first and last name. In this situation, indexing the age column probably won’t speed up the search at all, as the indexes on first and last name will already have narrowed down the possible hits to just a couple of rows.
  6. If you find an existing covering index, it is usually relatively easy to understand the purpose of the index. It will usually contain a column covered by rule 2 or 3 in this list. If it does, create a new, non-covering, non-clustered index on this column (if it doesn’t already exist), and delete the covering index. Job done.

Would it be too much to ask for an example?

Here is a very basic database with a few very basic tables, including an indication of how I would index them:

indexing1
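
As a rough, runnable sketch of the same idea, the snippet below builds two made-up tables and indexes them along the lines of the how-to. It uses SQLite through Python's built-in sqlite3 module as a stand-in for SQL Server, so it says nothing about clustered indexes, and every table, column and index name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id         INTEGER PRIMARY KEY,   -- ID column, indexed via the primary key
        last_name  TEXT NOT NULL,
        is_active  INTEGER NOT NULL       -- bit-like column: left unindexed
    );
    CREATE TABLE orders (
        id           INTEGER PRIMARY KEY,
        customer_id  INTEGER NOT NULL REFERENCES customers(id),
        status       INTEGER NOT NULL     -- small range of values: left unindexed
    );
    -- Foreign key used in joins and where clauses (how-to rule 3)
    CREATE INDEX ix_orders_customer_id ON orders (customer_id);
    -- Data column that users search on manually (how-to rule 4)
    CREATE INDEX ix_customers_last_name ON customers (last_name);
""")


def query_plan(sql, params=()):
    """Return the human-readable steps SQLite plans to take for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


# Filtering on the indexed foreign key: the plan mentions the index.
print(query_plan("SELECT id, status FROM orders WHERE customer_id = ?", (42,)))

# Filtering on the unindexed status column: the plan is a full table scan.
print(query_plan("SELECT id, customer_id FROM orders WHERE status = ?", (1,)))
```

The exact wording of the plan differs between SQLite versions, but the first query should mention ix_orders_customer_id while the second falls back to scanning the table.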

 

That’s really all there is to it. If the database and the way you query it are reasonably structured, you now know enough to make even fairly large databases (with several million rows in the main tables) perform reasonably well.

When tuning an existing database, one could also use SQL Profiler to find the hotspots and fix specific problems faster, but that is not covered by this guide. If the entire database is indexed using the above guidelines, you probably won’t have any major problems anyway.

Happy indexing!

We already do Scrum, but what is this thing called Kanban?

“In Scrum we <insert Scrum practice here>, would we still do that if we switched to Kanban?”

Occasionally I’m approached by people curious about Kanban, and more often than not they are already familiar with Scrum. They may have read some brief blog posts about Kanban, but are left wondering what it’s really about. What are the rules? What do we have to change? Can we still do Scrum? Help! So I decided to write a blog post looking into some of the stuff you would usually be doing in a Scrum team. Would you still be doing it if you were doing Kanban, or would there be an alternative approach to achieving the same goal?

“In Scrum we divide our project into sprints. I’ve heard that you don’t do that in Kanban?”
The answer to this is, it depends. Many teams find it useful to establish a regular delivery cadence to either test or production, while others deliver when it makes sense to do so. The general rule is, the more often you deliver value to the customer, the better. The biggest difference is that with Scrum, all the practices (planning, retrospectives, releases) are tightly coupled to the sprint. With Kanban it is 100% decoupled, and everything is optional.

“I heard that with Kanban, you don’t do estimates. How does that work?”
There are no rules in Kanban explicitly saying that you can’t do estimates. In many situations it will make sense to do some form of estimating up front, or during a project. If feasible in your setting, we would however prefer to focus on what is important, and on the assumed cost of delay. What would it cost us, either in direct cost or lost revenue, to NOT do this task now? In 3 weeks? In 3 months?

“I attended a conference once, and some weirdo on stage claimed we shouldn’t prioritize items in our backlog?”
In a large project you may well have tens, or even hundreds, of individual items in your backlog. For the sake of argument, let’s say you have 50 items. To follow Scrum, you need to order all 50 items according to priority. This is typically done several times during the project, maybe as often as once per sprint. Let’s say that you deliver 5 items per sprint on average. So in reality, for any given sprint it adds little (if any) value to order anything below the top 5 items in the backlog. In Kanban we prefer to ask our product owner “What are the most important tasks for you right now?”. Answering that question is usually easier (and faster) than ordering 50 items.

“In Scrum we have a Scrum Master. Is there such a thing as a Kanban Master?”
In Kanban we recognize the fact that change is hard. People naturally resist anything that threatens their identity. Asking people to change their professional roles would do just that. When implementing Kanban, everyone involved keeps their current roles and titles. There are no formal roles. A basic principle is “Start with what you do now”.

“In Kanban you apparently have a board to visualize tasks, like we usually have in Scrum?”
Yes, but with one crucial difference. Scrum has a lot more formal “rules” than Kanban, but it is missing one, and it’s an important one: limit work in progress (WIP). In Kanban we define a workflow, and we limit the work allowed in each state. As a result, we implement a pull process. Whenever there is room for more work in a board column, more work can be pulled from an upstream state (typically to the left on the board). Limiting WIP and pulling work prevents overburdening, reduces multitasking, decreases lead time (the time from when we start something until it is finished) and increases flow through the system. It will also quickly uncover flow problems. If an item is blocked for some reason, it will quickly stop or limit the flow of the entire system.

(illustration below stolen from Henrik Kniberg’s excellent post “One day in Kanban land”)

onedayinkanbanland
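
For the programmers in the audience, the pull mechanism can be sketched in a few lines of code. This is a toy model, not a tool: the class, the column names and the limits are all invented for the example.

```python
class KanbanBoard:
    """Toy board: ordered columns, each with an optional WIP limit."""

    def __init__(self, columns):
        # columns: list of (name, wip_limit) pairs; a limit of None means unlimited
        self.order = [name for name, _ in columns]
        self.limits = dict(columns)
        self.items = {name: [] for name in self.order}

    def add(self, item):
        """Put new work in the first (leftmost) column."""
        self.items[self.order[0]].append(item)

    def has_room(self, column):
        limit = self.limits[column]
        return limit is None or len(self.items[column]) < limit

    def pull(self, column):
        """Pull the oldest item from the column to the left, if there is room."""
        index = self.order.index(column)
        if index == 0:
            raise ValueError("nothing upstream of the first column")
        upstream = self.order[index - 1]
        if not self.has_room(column) or not self.items[upstream]:
            return None  # WIP limit reached, or nothing waiting upstream
        item = self.items[upstream].pop(0)
        self.items[column].append(item)
        return item


board = KanbanBoard([("Todo", None), ("Developing", 2), ("Reviewing", 1)])
for task in ("A", "B", "C"):
    board.add(task)

print(board.pull("Developing"))  # A
print(board.pull("Developing"))  # B
print(board.pull("Developing"))  # None: Developing is at its limit of 2
print(board.pull("Reviewing"))   # A
print(board.pull("Developing"))  # C: moving A downstream made room
```

Note that nobody pushes work onto a column. Task C stays in Todo until something leaves Developing, which is exactly what makes a stuck item visible so quickly.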

“That answered some of my questions, but I’m still a bit confused. What are the RULES of Kanban?”
I’m not sure there is an answer to that. Kanban is a management and change management technique that requires some time to fully understand. There are few explicit rules as to exactly what to do, but Kanban will help you figure out what works for you. There’s no one right way; the context of your situation is always unique. Does it make sense to always have sprints with a committed list of deliverables? Probably not. Does it make sense to have stand up meetings? Probably, but maybe not every day in any context. Does it make sense to have a retrospective after each delivery? Who knows what’s right for you? Only you.

“So are there any rules at all?”

We have what we call principles:

  1. Start with what you do now (No set rules or processes. Start from where you are, and improve from there)
  2. Agree to pursue incremental, evolutionary change (Make a small change, see if it is an improvement, then try another)
  3. Respect the current process, roles, responsibilities and titles (Don’t manage by fear and force, build trust and understanding. Respect persons and identities)
  4. Leadership at all levels (Delegate, build trust, encourage acts of leadership at all levels of your organization)

And six core practices:

  1. Visualise the workflow (Without understanding what happens, it is hard to implement change)
  2. Limit work in progress (Implement a pull system, typically a kanban system)
  3. Manage flow (monitor and measure how work flows through the system, in order to detect problems and opportunities for improvement)
  4. Make policies explicit (Document how the system works, thereby creating a common basis for understanding and improvement)
  5. Implement feedback loops (Regular collaboration and review at both team and company level is important to facilitate continuous improvement)
  6. Improve collaboratively, evolve experimentally (Involve and inspire everyone to suggest improvements, and experiment to validate theories)

If you are ready to dig deeper into how Kanban may help you and your organization, I suggest you start by reading: “Kanban” by David J. Anderson

Now you may proceed with your day, I wouldn’t want you to miss your daily Scrum!

@TSigberg

How to make sure projects run late

The main problem with sharing information, is that most of the time people just don’t listen very well.

“What do you mean the project is late?!”

“Boss, It was always going to be late. It was doomed from the start.”

“So why the hell didn’t you say so?”

“Actually, I did.”

Being a software developer isn’t easy. Working on projects that don’t deliver as expected is common. However, most software projects don’t end up late unexpectedly. They were bound to be late before they even started.

Let me share a few typical reasons:

  • The project manager allocates 8 hours of your time to the project each day, even though he knows all too well that you spend 2 and sometimes even 3 hours per day handling support, production issues and answering questions from management and sales. Some days you even have lunch, even though it’s against company policy. Now you’re 25% to almost 40% behind schedule already. Swell!
  • Someone set a deadline before anyone had a clue what to solve, much less how. In reality, most deadlines don’t even signify an important event for the project or product. It’s just an artificial date set because the customer decided “It has to be done by December 1st!”. No, it really doesn’t.
  • You’re only allowed to spend an absolute minimum amount of time analyzing customer needs. And by all means, don’t attempt to figure out the best way to solve the problem. “The deadline is set already, we don’t have time to sit around thinking and wiggling our toes! Start DOING something, for heaven’s sake!”
  • Management ignores any concern or warning voiced by the development team, and the team accepts any assignment, however unreasonable. “But boss, this is never going to be done on time!” “Well, it has to be! You’ll just have to find a way!” “Oh, okay. I guess we’ll have to figure out a way to deliver everything you’ve promised on our behalf without reducing quality or functionality. I mean after all, we’re basically wizards over here.” No, I’m afraid you’re not.

Bosses and sales people insist that you try to deliver, however unrealistic the goal may be. “We can’t let the customer down!” In the long run, giving in to unrealistic expectations actually hurts the customer. Each time the development team takes on more than it can handle, it compromises either quality or time. More time spent on one customer, means less time spent on another. Delivering bad quality and bugs, means even less predictable availability later on.

In order to satisfy all our customers in the long run, we need to be as effective as we possibly can, and deliver consistent quality at a pace that the team can handle over time. That may mean some difficult discussions and difficult decisions. However, it is better to adjust the expectations of the customer now, than to disappoint him with an inferior product or a missed deadline later.

 

@TSigberg