Monday, August 27, 2007

Is 30 minutes too long?

I am slowly adjusting to a new place of abode. It's a long story and not something relevant to this posting - perhaps later. However, as anyone who has moved to a new location knows, getting set up as fast as you can becomes a necessity, and minimizing disruptions to daily routines, a priority. For me, the urgent needs all centered around getting our audio / video equipment sorted out, and this meant going to the mall. Pretty routine, I thought - nothing much could go wrong with a task this trivial!

Well, I experienced a crashed system firsthand this weekend. My wife and I were at Best Buy purchasing DVDs when, after a wait in the check-out line, the cash register simply crashed. I knew immediately it wasn't good when the cash register screen went black and I could follow the flow of NetBIOS data wrapping around on the screen. Eventually, the ubiquitous Windows icon appeared. But too late - the check-out clerk left his post for guidance and then took us to another line to stand and wait.

Our transaction was pretty routine, but did have a few twists. We had a discount coupon and we were paying by credit card. The coupon had been accepted, and the transaction amount approved. But at the time the cash register crashed - the transaction had completed its last step - printing the receipt. Where were we, process wise?

After standing in another line - this one moving a lot slower, of course, as the activity being performed wasn't routine for any of the participants - we were finally in front of another service assistant. We went through the story, physically pointed to the cash register and the check-out clerk where the crash had happened, and began the process again. We had to get the coupon reinstated and check their server logs (which they did provide us with) to make sure there were no duplicate credit card entries before we were comfortable enough to proceed. By this time, 30 minutes had passed.

While the service assistant took an additional amount off the transaction, at no time did management approach us to say sorry for the inconvenience. Best Buy is a very successful operation these days - putting a lot of pressure on other operators like Circuit City - but to have customer-facing systems that are not reliable, and operators who simply drag you off to another line to start over, isn't good for business.

Later that same day, we stepped into a branch of the Wells Fargo bank to update our profile information, add online access, and get a local ATM card so I wouldn't have to keep paying other parties' ATM fees. All pretty straightforward. But as soon as we walked through the doors, we could see it was busy and that all the service staff were tied up. This time, however, the local manager saw us and came over to greet us. She offered us coffee and made the other staff aware of our presence. Yes, we had to wait - but we were nowhere near as agitated as we had been just moments before.

When we sat down with a bank representative we learnt that all hadn't been well for the bank earlier in the week. They had taken a huge outage - with some elements of the network out of action for more than a day. I was stunned. Yes, I had been traveling during that time, and had missed most of the business news. He informed me that the mainframe had failed and that it had taken a while to sort out. He then continued to walk me through all the steps I had to take, and was open and informative throughout the process. As we walked out of the bank, I checked and yes, the whole process took 30 minutes. But what a different 30 minutes.

In this day and age, I have little patience for any retailer or financial institution that skimps on its infrastructure investments. And into my broad definition of infrastructure I include all the staff and management working with it - and all the folks involved with their education. Systems are failing a lot more than I had previously considered. Availability, and the need to be available through planned and unplanned outages, isn't just the mantra of a select group of architects. The impact from a failure becomes obvious very quickly - whether it's a simple PC-based cash register or a whole mainframe-supported network.

But what a visible difference in professionalism between two corporations, separated by only a few feet. It boils down to good old-fashioned customer service. Be 24x7 or invest more heavily in your human infrastructure.

A short while back I met with Tom Moylan - Tom manages the Americas sales organization within HP that is responsible for selling NonStop servers. Right now Tom is as excited to be selling NonStop servers as he has ever been - and for good reason. A number of retailers and card processors had been considering alternatives to the NonStop server product line - but over the past few months, a number of the very biggest have come down decidedly in support of NonStop.

There have been projects to move off of NonStop that were well under way; when the end users were polled, companies reversed course and reinvested in NonStop. There were internal competitions and "cook offs" between different cluster-based systems and when the finished product was presented, the winner was NonStop. Even when management had been cautiously supportive of NonStop but had let projects drag, there is now renewed enthusiasm for the platform.

If I were working up in one of the border states that loves ice hockey, then I would know who the big retailer was. If I knew much about card processors - credit or debit, it doesn't matter - then I wouldn't have to think too long or hard to figure it out. They have gone big-time with NonStop.

In the coming months I am going to revisit the topic of services, and of the impact they are having on the way we interface with our applications. In particular, I am very interested in Service-Oriented Architecture (SOA) and the impact it is having on the roll-out of new applications. The deployment of this architecture will play a very important role in the continued viability of NonStop, and I have already begun to see inroads being made into the NonStop customer base. This is an extremely encouraging development and bodes well for the continuing relevance of NonStop!

The number of corporations now very strongly recommitting to NonStop is just great to see, and those that are not had better invest in customer service. Waiting 30 minutes is not that unpleasant if you are treated right - but being down and having to placate customers with lots of human interaction is a costly gamble on the part of any corporation.

Next time I see Tom Moylan I am pretty sure it will be my turn to buy lunch.


OZFooty said...

You don't have to wait to buy Tom lunch. You can always buy lunch for me :-)

Interesting experiences, and unfortunately I think that the first is becoming more common as shortcuts are being taken to get new systems into the market quickly.

These are stories that need to get put out into the marketplace. However, corporations for obvious reasons don't want to publish these incidents, though they could benefit many corporations in knowing what to avoid and how to provide good service.

Sami said...

Richard, if you make your calendar publicly available, I'll make sure to avoid the places that you shop at -- I think you have bad infrastructure karma :)

I think making systems continuously available is one side of the coin, but making systems available with some intelligence behind them is equally important. The Home Depot that's a mile from my home has self-checkout (sometimes the only option). The other day, the SKU for the item I was purchasing wasn't in the system and I spent 30 minutes trying to buy replacement parts for my Weber grill (I could cook you and Tom steaks next time you're in the Palo Alto area) and I was really close to vandalizing the .... self-checkout equipment.

Real-time infrastructure requires real-time availability (and continuous access) but without meaningful data and intelligence behind it, it is real-time frustration.

Aviator said...

30 minutes is far too long, in my opinion. I would not have waited, and I have never waited more than 10 minutes when faced with a failed front-end system. I would have departed the store minus the merchandise and considered other options for purchase. This only serves outlets like in terms of reinforcing a customer base. Why endure the hassle when you can do it online and most likely entrust a portion of your transaction to a Tandem? Yes yes yes I said Tandem. Better to wait at home than stand in line with other grumpy people only getting redder in the face. -K

RT Writer said...
I have had an email exchange with Bill Highleyman and I thought it would be good to post here one of his comments:

Richard, I like your comments on "Is 30 minutes enough?" My series of Never Again articles focuses on just this issue - massive downtime due to stupid errors.
You might be interested in my August article in which I described eight major outages experienced by large enterprises in just the first six months of this year (
Also, the September article talks about a major hosting service that went down for days, taking with it hundreds of online stores (

Bill has now been publishing a newsletter "The Availability Digest" for some time - and it is well worth reading as he tracks some of these outages in more detail than I can cover in this blog. Bill can be reached at - Bill Highleyman []

September 17, 2007 3:58 PM