I first wrote the title of this post in the form “how small should the stories be?”, leading to the apparently obvious answer: “as small as possible”. It is in fact slightly more complex.
Let’s assume here that all your stories are sized in story points. I personally prefer “story points” to “ideal engineering time” for estimating features, but that is a debate for another time. Let’s just start with story points.
If all goes according to plan, you should have a whole range of stories, with sizes going from 1 to 21 (or 40 and 100 when you want to make clear that some stories are really big and need to be detailed further later on). Let’s further assume that you know your current velocity (if this is your first sprint, Mike Cohn recommends either doing rough estimates in hours, or just letting the team loose and seeing the results at the end of the sprint; see his book Agile Estimating and Planning for more on this).
Simply put, you want to be in a position to fit “many” stories into a single iteration.
Having “many” stories is important, because the more you have, the less risk you carry. Suppose that you have only 3. If things go just a little bit wrong, then you’ll miss your Sprint goals by a third (remember, you do not demonstrate or release partially implemented features, right? And testing is part of the implementation effort, right?). Missing objectives by 33% is a lot in the agile world. You appear to have done badly, even though you may well have completed a significant portion of the work on the unfinished feature. Even if you have done 95% of the tasks, you still have only 66% of the features (supposing that you tackled the three features in order of priority). Yes, you will probably complete the remaining 33% in the following iteration, and more. But still, your client only got 66% of the features at the date you agreed on. Plus, your velocity will vary greatly between iterations, giving an impression of unevenness and unreliability.
So the strategy is simply to split your features into smaller functional chunks to get closer to “95% technical tasks implemented = 95% features implemented”. My rule of thumb is to aim for 10 features. Some will be bigger than others, but all will be rather small.
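To make the arithmetic concrete, here is a minimal sketch in Python. It assumes a simplified model that is mine, not a standard one: stories are worked strictly in priority order, effort is proportional to story points, and a story counts only once it is fully done. The function name `delivered` and the example story sizes are made up for the illustration.

```python
from fractions import Fraction


def delivered(story_sizes, percent_done):
    """Return (stories_done, points_done) for one iteration.

    story_sizes  -- story points per story, in priority order
    percent_done -- integer percentage of the total planned effort completed
    Assumption: stories are worked one after another, and a partially
    implemented story delivers nothing.
    """
    # Exact arithmetic avoids floating-point surprises at the boundaries.
    budget = Fraction(sum(story_sizes) * percent_done, 100)
    stories_done = 0
    points_done = 0
    for size in story_sizes:
        if size > budget:
            break  # this story is only partially done, so it does not count
        budget -= size
        stories_done += 1
        points_done += size
    return stories_done, points_done


# Same 21 points of work, same 95% of the effort completed.
three_big = [7, 7, 7]
ten_small = [1, 2, 2, 2, 2, 2, 2, 3, 3, 2]

print(delivered(three_big, 95))  # (2, 14): 2 of 3 features delivered
print(delivered(ten_small, 95))  # (9, 19): 9 of 10 features delivered
```

With three big stories, 95% of the effort yields two thirds of the features; with ten small ones, the same effort yields nine out of ten, which is much closer to the "95% = 95%" target.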
In practice, this probably means that you have a mix of features of size 1, 2 and 3. I might accept one size-5 story occasionally, but I’d keep an eye on it while the iteration is going.
Interestingly, this means that velocity for most of the teams using that strategy will be between 10 and 50 story points per iteration. Reaching 50 points or implementing more than 20 distinct features might be a sign that you are ready to try shorter iterations.
Finally, note that this is, as always, only a step in the never-ending quest for Agile nirvana. Once you’re comfortable with having many small features, the next level, besides shorter iterations, could be to force all features in an iteration to be of size 1. That would be a nice improvement. “Our velocity is 12” would always mean “we are implementing 12 features”. Also, it’d be much easier to select features during the Sprint Planning meeting. I wonder if someone has already gone that far.