Tuesday, January 2, 2018

The folly that was Tandem Computers and the path that led me to NonStop ...

With the arrival of 2018 I am celebrating thirty years of association with NonStop and before that, Tandem Computers. And yes, a lot has changed but the fundamentals are still very much intact!

The arrival of 2018 has a lot of meaning for me, but perhaps nothing is more significant than the fact that my journey with Tandem, and later NonStop, can be traced all the way back to 1988 – yes, some thirty years ago. But I am getting a little ahead of myself, as there is much to tell about what was happening well before that eventful year came around.

For nearly ten years I had really enjoyed working with Nixdorf Computers and, before that, with The Computer Software Company (TCSC) out of Richmond, Virginia. It was back in 1979 that I first heard about Nixdorf’s interest in acquiring TCSC, which it eventually did, thrusting me headlong into a turbulent period where I was barely at home – flying to meeting after meeting in Europe and the US.

All those years ago there was a widely read publication called Datamation – a publication I swear was written, with some subterfuge, by IBM folks who wanted to communicate details about their projects to the rest of IBM. However, it was the advertisement on the very back page of that magazine that caught my eye – a picture of a Tandem computer, complete with a Mackie diagram showing what a truly fault-tolerant system looked like. Datamation was my go-to reading material while I flew, so each time I stuffed the magazine back into the pocket of the seat in front of me, there was that Tandem advertisement.

So, imagine my surprise when I went to the Hannover Fair for the first time (before technology split off from the primary event) only to see, being constructed across the aisle from Nixdorf, a rather large exhibition booth housing a team from Tandem Computers. This was back in either 1983 or 1984 – I cannot recall exactly. But at the time, fault-tolerant computers were getting a lot of attention – IBM was reselling Stratus as its own System/88, and even Olivetti, a fierce rival of Nixdorf, was reselling Stratus.

A short time later, after that Hannover Fair wrapped up, Nixdorf revealed it was working with a New Jersey company building fault-tolerant computers – Auragen Computers. The plan was for Nixdorf to rebadge the Auragen system and sell it right alongside its other platforms, which at the time included a range of IBM plug-compatible mainframes (PCMs) called the 8890. However, it didn’t go well; at one point it was reported that Nixdorf founder Heinz Nixdorf, totally frustrated by the slow progress being made toward a fault-tolerant system utilizing Unix, told an audience that the only thing fault tolerant about the system was the sales team!

As I recall those days, I am quite sure thoughts of Tandem were gradually entering my subconscious. But having worked with mainframes for such a long time, my colleagues thought it foolish of me to change course; to them, getting involved with Tandem was sheer folly! Which brings me back to the photo at the top of this post – a double magnum of the 1998 vintage of Lake’s Folly, bought for me just as it was released!

It was a wedding present given to me by my wife of only a few weeks, Margo Holen. According to Max Lake, owner of the vineyard, the wine would need another twenty to twenty-five years before it could be consumed. You see, Max had pursued a highly successful career in medicine but reached a point where he, too, wanted to change course. When he announced his plans to open a winery, his friends called it foolish, so of course he named his new endeavor Lake’s Folly!

It was in 1987, while living in Raleigh, North Carolina, and working for an SNA networking company, Netlink, Inc., that I ran into folks from Tandem – people like Jerry Held, Andy Hall, Jeff Tonkel and Suri Harish. At the time, Tandem was interested in the SNA_Hub product that eventually found its way into the Tandem price book and was featured in the Tandem Update publication. It struck me then just how innovative Tandem was and how well it was doing pursuing a completely different approach to solving the business issue of maintaining uptime.

These folks I came into contact with were Tandem evangelists of the highest order, and over an adult beverage or two, Suri convinced me that it was time to consider joining Tandem. I couldn’t really do that in 1987, as there were visa issues, so after returning to Sydney around this time in 1988, and after almost three months of interviews with the local Tandem operation in Australia, I was given an opportunity to work for Tandem out of the Sydney offices.

The journey to Tandem had been circuitous – with stops in Richmond, VA, and Raleigh, NC, together with weeks spent in Germany as well as in Cupertino, CA. But the camaraderie and openness across the entire Tandem organization was unlike anything I had previously experienced.

It wasn’t simply the infamous beer busts or the swimming pool. It wasn’t the creativity behind the Tandem Television Network and the filming of First Friday. It wasn’t even the sight of such a sprawling campus, since ploughed under to make way for Apple. What it really came down to was a group of people who simply said that there was a better way to compute – that you could build systems in which no single point of failure could disrupt processing. Cool! The world could indeed be changed and yes, for a very long time, Tandem changed the world!

Long before Tandem became NonStop, the stars aligned in a way that was unique to the late 1970s – every financial institution wanted to own a network of ATMs, but none of them wanted their ATMs directly connected to their mainframes. Traditional front-end processors weren’t capable of running the software needed for payments processing, so essentially Tandem got its big start in life as an intelligent front-end – and it redefined the world of commercial front-end processing. I saw the impact that made on IT when I was at the Hannover Fair, and I read about it so many times in the pages of Datamation.

Looking back at thirty years of association with Tandem and now, NonStop, that little spark of magic that saw the earliest Tandems carve out a niche as an intelligent front-end processor is about to be revisited. And in a way that could prove to have an even bigger impact on the marketplace than simply connecting to ATMs – blockchain. We hear the term “disruptive” so often nowadays that we no longer really react to it. Everything that makes it past the critical reviews of venture capitalists is disruptive in one way or another.

However, NonStop and blockchain are almost a marriage made in heaven – if the distributed immutable ledger is to anchor commerce in the future, you don’t want to see copies of that ledger offline for any period of time. True, the architecture has an element of consensus built in, but even so, when the underlying blockchain supports just a small group of entities – say, two or three nodes – trust may indeed be jeopardized if it comes down to just two, or even one, working copy of the ledger.

NonStop with NonStop SQL underpinning blockchain has so much upside the likes of which I haven’t seen in a very long time – but is this just another foolish observation or wish? Is it a return to the days when jumping ship and joining Tandem was considered Buckle’s Folly? The jury will be out on this topic, debating the issue for quite some time, and yet I see all the signs that HPE may just have had good fortune fall right into its lap.

The year has only just begun but somehow, someway, after being around Tandem for thirty years, I have a sense that a very positive chain reaction is likely to occur – just as the first use-case scenario featuring blockchain on NonStop is being developed even as I blog (and yes, back down in Sydney, as I understand it). And while I may not be able to track how this all pans out over the next thirty years, at least I will know that sometimes folly can be a very good thing, not just for me but for anyone in the NonStop community!

Happy New Year!    


capnpamma said...

HPE NonStop is still missing the education component, which is critical to stepping out of niche markets. What they currently offer are "mechanic" courses, and they have allowed the concepts to fall by the wayside. As a result, monolithic mainframe concepts are often applied to HPE NonStop, resulting in poor designs and applications that are impossible to tune or scale. Systems support is left chasing "ghosts" within applications that present constantly changing, non-linear performance behavior from day to day.

The "old guard," like me (Tandem 1984 - 1994, Tandem world until 2015), is leaving the workforce, and I don't see a plethora of young, skilled Tandemites in the user market to replace them; nor, do I see a large interest within that same community to change their perceptions.

Look at what one sees at any xTUG, a group of already evangelized Tandemites listening to the same vendors and futures over and over again. On the other side of the coin, executives don't care what systems they are using or how many hours IT people must work to keep things moving smoothly. As long as there is a plethora of talent available for whatever systems they have, the rest is just noise. Selling TCO takes a lot more than numbers on a chart.

To me, the saddest thing is that the world is still catching up to NonStop fundamentals, but other vendors are doing a better job on the evangelical front. Until HPE solves this problem, it will continue to be an uphill climb and remain a niche player.

Richard Buckle said...

What I would like to emphasize here is that there are younger folks embracing NonStop – there is now an Under 40 SIG that is very well attended, and if you happened to be at Boot Camp you will have seen a local university send a group of students. I think we need to be careful about bracketing all we have seen over the past decade and a half with what is now happening. Remember, too, that HPE didn't lump NonStop in with other non-core software offerings, preferring instead to keep NonStop and continue to invest in it.

This is not to say I am ignoring your main point here, as it is a good one. However, I think the end-game for NonStop is to make it far simpler (i.e. no need for hand-holding from experienced folks like you and me) to deploy any modern app in C/C++, Java and yes, Node.js (SSJS) and have it run, NonStop. You see, to succeed going forward, we cannot have it be "labor intensive" to run NonStop ... what do you think?

Kev Collins (Retired) said...

Well written, Richard. I enjoyed the stroll down memory lane and recall meeting and working with you in the Sydney Tandem office in 1988. On the subject of the education component and real understanding of the underlying principles, I have to agree with capnpamma!!

Restrictions in the physical implementation of the architecture (ladder configurations) introduced in the CLX and Cyclone/R systems meant that Tandem only sold systems in multiples of TWO processors – i.e. 2, 4, 6, 8 and so on, up to 16 in a "node". This concept was continued with the introduction of the S-Series, again due to restrictions based on the physical layout of the components (X ServerNet switch on the "even" PMF, Y switch on the "ODD" PMF). Then, when HP inherited the architecture, it continued to ONLY sell systems in EVEN numbers, although there were NO physical (or software) constraints preventing support for ODD CPU configurations.

Why do I raise this? Well, it seems that when a "non-believer" is presented with a "NonStop architecture" built from pairs of processors, MOST jump to the conclusion that one is simply backing up its mate – that's why they are sold in pairs – much like the well-known (in HPE circles) Serviceguard on their UNIX systems. I spent some time researching this with a number of well-known, distinguished technologists in the Palo Alto region, and all agreed that there was NO reason for not supporting ODD-numbered processor configurations; it just seemed to be the way it was marketed.

I raise this now because I had a VERY interesting experience with a Tandem system at a computer show in Wellington (NZ) in late 1982. Rather than bring a two-processor system across with spares etc., we had the spares installed in the system as the third processor and left them there for the show. The effect of a THREE-processor system was AMAZING! Attendees would be strolling past, then suddenly stop, take a look and come over to ask WHY there were THREE processors. Obviously, it upset their preconceived idea that "one was backing up the other and therefore they must be in pairs". We had a great time explaining the TRUE architecture and the concept of NonStop process pairs and balance across the system with ANY number of processors.

ODD processor support certainly causes a LOT of questions to be asked and breaks people OUT of their preconceived notion of how they think it works. I have now left HPE after 38 years on the architecture, but I recall MANY instances of people misunderstanding the architecture and believing it was just "Serviceguard on steroids". One of the BEST ways HPE can get some REAL discussion and understanding of this wonderful architecture is to start supporting ODD CPU configurations again. It would certainly restart discussion and open the way to a far better understanding of the architecture. ODD-CPU configurations would also help with granularity, especially at the smaller end.

capnpamma said...

Guess I wasn't really clear on my point. You'd be surprised at how many times I've heard of companies trying to figure out how to get off NonStop, but they are too entrenched to do it easily.

I was a fan of OSS and SQL/MX before they worked very well. Providing all these facilities is certainly important to NonStop, but I'm far more concerned about what I've seen in the user community from an application-architecture standpoint, especially when trying to solve unavoidable batch applications on NonStop.

Unfortunately, NonStop architecture is not transparent to application architecture. I've seen applications that are "requester/server" in some form or another, but the requester hands out the work in a not-so-random manner, resulting in high disk access as all the servers hit the same partition and "walk" across the disk. Due to the inherent application design, it was not an easy fix to rectify with some sort of randomized pseudo-transaction. However, if the application architects had had a true understanding of NonStop, each case would have been easy to avoid.

Anyway, on several different occasions I worked with a couple of open-minded developers, and through some clever (not brilliant) changes and a lot of work on the input driver(s), we achieved significant randomization and saw first-cut performance increases of 400% - 700%. Even with working examples, others didn't grasp the basic principles or why they were important.
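To make the randomization idea concrete, here is a minimal sketch in Python (purely illustrative – the function name, bucket count and hashing scheme are my own invention for this comment, not the production code, where the real work was in the input drivers): derive a bucket number from a hash of the business key, so that consecutive keys scatter across partitions instead of walking one partition in key order.

```python
import hashlib

NUM_PARTITIONS = 8  # hypothetical: assume the table is split into 8 partitions


def randomized_key(business_key: str) -> str:
    """Prefix the natural key with a hash-derived bucket number so that
    consecutive business keys land on different partitions instead of
    all hitting the same key range on one disk."""
    digest = hashlib.md5(business_key.encode()).hexdigest()
    bucket = int(digest, 16) % NUM_PARTITIONS
    return f"{bucket:02d}|{business_key}"


# Sequential account numbers would normally cluster in a single key range;
# with the hashed prefix they spread across every bucket.
keys = [randomized_key(f"ACCT{n:08d}") for n in range(1000)]
buckets = sorted({k.split("|")[0] for k in keys})
print(buckets)
```

The trade-off is that range scans over the natural key now touch every partition, which is exactly why application architects need to understand the NonStop data layout rather than treat it like a monolithic mainframe.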

No matter what technical facilities HP provides, NonStop is architecturally different, and expanding user understanding of this difference is critical to NonStop's long-term success.

Alan Smith said...

Well written, Richard.

I joined the UK branch in 1981 (employee #3000) – NSII in those days!!
I am still working on the systems as a freelance educator today.

I do have to agree that HPE is not pushing the systems and their philosophy enough.
We probably need better sales and education materials to spread the word about just how good a system it is.

It is an industry leader.

Mirrored disks before anyone had heard of RAID.
A relational database when that was still at the university stage and there was no SQL.
Requester/server decades before client/server arrived.

Back in the early '70s HP dismissed the architecture as "something no one wants".

I fear they still do not understand, as capnpamma says.

I admire the way HPE has taken the architecture and used both in-house components (CLIMs) and industry-standard products (InfiniBand) to advance it, but it needs to market the system more aggressively, exploiting its advantages and capabilities.

Application design also has to take advantage of the architecture and its capabilities. A straightforward "port" is not going to get the best from this exceptional system.

Richard Buckle said...

HPE continues to do well in the finance and telco markets, where the applications have been available for some time – where do you next see potential for a major new and exciting application, and which industry / market will it serve?

Clearly, being able to run NonStop applications as just another virtualized workload (but with levels of availability way beyond any other workload) is a step in the right direction – but will it be enough to entice something new to be developed?

capnpamma said...

HPE is living off its laurels in finance and telco. I was on the GTE team back in the '80s and '90s, when Tandem was pretty much the only game in town. As time went on, they were constantly trying to get rid of NonStop but were trapped by portability issues and a lack of alternatives. I also worked on the finance side, in particular at Raymond James, and a similar situation was prevalent. They want off but can't figure out how to get there. In each case, however, they are slowly stripping more and more applications away from NonStop, leaving only legacy applications behind.

I've seen upgrades to NonStop systems to increase capacity/performance which simply wouldn't have been needed had the applications been developed correctly. Good for the HP salesman's pocket, but not good for NonStop's image.

I've heard the most ridiculous claims by technicians on fault tolerance and fault persistence because they really don't grasp all the issues that NonStop addresses. This lack of understanding, and HP's failure to present NonStop properly, allows the user base to accept "good enough" rather than demand better.