Thursday, August 6, 2015

Redundancy! How many NonStop systems needed to bring merriment?

HP NonStop systems have embraced industry standards and yet, for many users, their fault-tolerant properties don’t eliminate the need for redundant systems – a lesson learnt from years of experience, as nothing can eliminate disasters and just plain bad luck!

In client newsletters and in posts to other blogs I have bemoaned the fact that even with additional cars in the garage there can be times when you simply don’t have a car to drive. Such was the case in July when even the company command center, the ever reliable RV, was checked in for a routine service. In the same week, the Corvette was diagnosed with having no brakes and had to wait for new pads and rotors, even as punctures took out both of the remaining cars. Yes, the “Trackhawk” Jeep SRT8 pictured above (with ultra-low-profile tires a leak can sometimes be hard to detect but trust me, it was flat) and our Grand Tourer, where a front tire simply rolled right off the rim, were both victims of flat tires - "when it rains, it pours" as good friend Robert Rosen told me! Tires needed to be shipped in, and in both cases their unusual sizes and construction demands meant a week of waiting.

The Holen – Buckle family was reduced to asking immediate family for help and they graciously responded. Much of this was covered in the post to our social blog of August 1, 2015, With places to go and plans in place, the wheels fell off … where you can read more about this predicament. This wasn’t the only instance where the issue of redundancy came up. Among the NonStop vendor community there are vendors with a sizable inventory of NonStop systems. In some cases, there are systems (vital to these vendors’ customer support programs) that date back to the NonStop Himalaya K-Series. However, as with any business model, due consideration continues to be given to just how many systems need supporting, and at how many locations.

For the NonStop community the absolute minimum of everything is two, but in recent times, when it comes to systems and indeed locations, this number has steadily risen to where three or more sites with many more systems is not uncommon. Indeed, even as the NonStop community acknowledges a consolidation among the NonStop user base following many years of aggressive M&A activity, having two sites, each fully replicated (making at least four sites in all), is not rare; it’s more common than we may think. The rise in popularity of Disaster / Recovery solutions in the last decade and the number of product offerings to choose from have certainly contributed to the increase in system numbers and sites.

It is well known throughout the NonStop community that one German luxury auto manufacturer has a pair of NonStop systems deployed at every manufacturing site on the planet and that this duplication of systems has been a cornerstone of the services their IT group provides the company. It’s clear that, even in the highly connected world we live in today, redundancy on this scale is appropriate. An extremely large configuration buried deep underground may be the image depicted in movies, but should the site fail for any reason, there’s always the back-up somewhere. Even here, though, relying on just a single back-up makes today’s CIOs extremely nervous.

Distributing pairs “everywhere”, and making sure there’s distance between each pair with separate power and communications infrastructure, may still see one location going offline. Disasters, whether natural or man-made, continue to occur with monotonous regularity. However, when your business relies on dozens of locations, the granularity provided through such redundancy keeps critical production lines operating. Furthermore, when consideration is given to what is a modern system and, perhaps even more importantly, what is a modern data center, then redundancy is a major check list item CIOs would be inclined to check off with a big positive tick!

In a recent interview, comForte IT Director Patrick Eyrich talked about how sustained organic growth based on the partnership with HP, together with the latest inorganic growth following M&A activity, has made it very important for comForte to treat their systems as a whole, rather than as just a collection of isolated servers. Keeping this “whole” operational, even when individual systems and components may be offline, without affecting the support comForte provides, is a critical concern of their senior management. For more on this interview, check the recent post to the comForte blog, Following the sun ...

In an upcoming post to the comForte blog, Eyrich talks about having, “a pair of NonStop systems in Neuruppin, Germany, and a further pair in Berwyn, U.S, together with yet another pair of NonStop systems in Sydney, Australia.” Eyrich then adds that this is going to grow even bigger as “we have a new entry-level NonStop X system on order for delivery this year and in total, this will allow us to support NonStop OS versions from G, H, J and now L.” Redundancy is just that important and while it may add to the overall operational complexity, the upside certainly makes it worthwhile.

With the new NonStop X system on order, did comForte really need three NonStop Himalaya S-Series systems? “We were considering retiring one of our NonStop S-Series systems but then again, what would happen should our primary data center in Neuruppin totally fail due to some catastrophe? Disaster – Recovery (DR) is just as important as security,” acknowledged Eyrich. “Knowing that we have replicated NonStop systems running elsewhere at two locations outside of Germany, as we have with our data centers in Sydney and Berwyn, greatly reduces the fears of senior management, so yes, we will keep these three S-Series systems for some time to come.”

The systems included may span several generations but the issue of redundancy has more modern overtones than we may first think. In former times, buried deep in the back office, was the mainframe. For nearly two decades I sold software into the mainframe environment and very few sites had anything other than that single mainframe. Walking into an installation with two mainframes was a rarity, so for the majority of corporations, relying on tape back-ups was the sole recourse should disaster strike. But today, while we cover a lot of territory when it comes to availability, underpinning recovery is a fabric of redundancy my former colleagues could only have dreamed about – adding 32K of real memory to a pair of IBM 360/30 mainframes in 1970 represented an investment of $750K each!

No solutions or service provider would think of going into business today without redundancy, especially when they are supporting mission critical applications. In a brief exchange with OmniPayments CEO, Yash Kapadia, he told me how “Redundancy anchors the manner in which we build out our data centers for development as well as support for those customers we support directly using our own systems.” As has received considerable publicity of late in blog posts and articles, Yash is pushing into cloud computing utilizing NonStop systems and in so doing, “we simply have to have more than one NonStop system to execute and with the introduction of the entry level NonStop X systems, we can progress by taking small, baby steps.”

As for being a recognizable attribute of a modern system, Yash also noted that for OmniPayments, “presenting an image to prospects of having modern systems at the core of our operations mandates we have redundancies almost everywhere we turn and being able to accommodate the addition of new customers, the introduction of new products and features, all while changing the operating system and the full stack that goes with it, simply isn’t possible without redundant systems and is clearly a highly visible hallmark of what a modern system today looks like.”

In the July 2015 issue of the eNewsletter Tandemworld, Yash wrote of how “OmniPayments easily expands to provide additional functionality when needed and supplies complete security functions for every financial transaction handled. It will survive any single fault, requires no downtime for maintenance or upgrades, and supports a range of disaster recovery solutions”. And yes, OmniPayments is “now available on NonStop X”.

Before leaving the topic of redundancy, it’s worth noting that within HP there are some interesting projects. CloudLine is to provide bare-bones, no-label Intel servers to those very large operations that buy servers by the thousands and don’t need any of the vendor support infrastructure most enterprises depend upon. The attitude is “get out of my way”; think Google, Amazon, Facebook, Yahoo! There’s redundancy present on a scale unimaginable, with many servers down (to be thrown away) at any given time.

Then there’s the Converged Data Center Infrastructure that looks to separate processing, storage and networking resources as it throws in layers of virtualization to suit rapid reconfiguration as it provisions resources to meet the needs of the day. In both situations, the redundancy involved suggests that at some point someone inside HP will have the bright idea to completely overhaul what we see today in NonStop and bury it deep beneath the OSs to produce a far more competitive solution – yes, NonStop lives but someday soon we may no longer recognize it!

Taking a car to the shop for repair can see a car lie idle for a day or so and for many of us, while it might be a nuisance, it’s not a circumstance that would see us rushing out to buy a second car, just in case! For those of us in family situations that necessitate two cars, seeing them both sidelined can prove extremely inconvenient and isn’t something we would expect to see happen. And yet it does happen! When it comes to our systems, as every NonStop user can attest, redundancy simply is the way we think and without it, the availability story loses considerable credibility. How many systems do we truly need? As my father would state as the family gathered – the more the merrier!
