Friday, July 28, 2017

Moving forward - transformation and virtualization make testing of the business logic even more critical

When we think of virtualization and the coming of clouds, and as we consider all that may be involved in transforming to these hybrid combinations incorporating the traditional with the very new, how often does the testing of our applications come to mind?

There have been times these past few days when events reminded me of practices and disciplines that dominated our discussions in former times. I had the misfortune of breaking things and working with insurance companies, and I was left without access to more modern methods of communication to the point where I was asked if I could possibly find a fax machine so I could receive a fax.

It was in the early 1980s when the vendor who employed me back in Sydney, Australia, installed a fax machine in my office and I no longer had to take the long walk over to the telex machine, where I would then spend hours preparing a paper tape for transmission back to my head office in Richmond, Virginia. In many ways it was a sad occasion as I had really mastered the telex machine and yet it was progress, given how easy it became not only to transmit the written word, but pictures, charts and graphs as well!

Fast forward to today and the power of the mobile phone is undeniable. We can communicate with anyone we want to, at any time, about anything at all. In a couple of recent conversations the talk has turned to whether the mobile phone is about to fade from the scene, to be replaced by even more spectacular technology, and whether we are entering, essentially, an era of magic. How else can you explain away the knowledge so many businesses have about everything we do? And yet, even with the most advanced forms of communication there will still be a need for apps to support inquiries as well as the many different models used for purchases and other financial transactions.

Point is – we still write code, and as much as AI continues to advance there remains a need for humans to stay very much involved in stringing together the logic that drives decisions for success. When we talk about clouds we talk about the elasticity of provisioning that addresses both the needs we have for data storage and for business logic. But here’s the rub – we are working diligently to be able to store vast amounts of data even as we continue to write logic practically unchanged from how we did it in the past, albeit a lot more quickly, of course.

Let me take you to an earlier time, decades ago. In fact, many decades ago, to when we first started coding the machines that marked the beginning of our adventure with computers. I was recruited by IBM on the campus of Sydney University at a time when I was becoming very bored with academic life. At the time I wasn’t really aware of the implications of my decision to participate in a series of tests the University sponsored, but it was only a matter of months before I found myself on another campus; this time, it was the operations center for a steelworks in Wollongong, Australia.

I was recruited in the southern hemisphere’s summer of 1969 and my first day on the job was in 1970, so effectively I have been looking at code for almost five decades. And the fundamentals haven’t changed, just the timeframes. Ambitions? Well, my first job was to develop applications in support of a new steelworks that was being built, but along the way I was tinkering with the operating system as, for a period of time, the IBM mainframes the steelworks purchased didn’t have enough main memory to run any IBM operating system, so we pretty much came up with our own – just a couple of lines of IBM 360 assembler code together with a bunch of macros.

Timeframes? Well, this is where the dramatic changes can be seen, perhaps more so than when it comes to chip power and Moore’s Law. I was writing just one application a year – perhaps a little bit more. I grabbed a coding pad and wrote assembler instructions for the logic I was pulling together to solve a business problem. Pages and pages of assembler code were then submitted to the data entry folks, who oftentimes took a week or more before they returned to me the coding pages along with a box of punched cards. I kept running these decks through the assembler until I got a clean assembly, at which time I took the object deck and began to test.

As a matter of practice, we always left an addressable piece of storage (of about 100 to 250 bytes) so that, if my logic went awry, I could branch to it, throw in a couple of correcting statements, and return to the mainline code. Ouch – yes, almost every production application was supported by a series of supplementary corrective cards that steered the logic back to where it needed to be without having to reassemble the whole application or, worse, send the coding pages back to the data entry team.

Testing? For my applications, which supported what we called the “online application,” I would often resort to booking solo time on the mainframe and dialing in “single cycle” so I could manually step through each instruction and watch the results via the console display lights that changed with the execution of each instruction. Productivity? Wow – I could debug my programs more quickly than others working with me who preferred to go home at the end of the day. The company had enough programmers to complete the implementation of the new application for the steelworks about to be commissioned, so it seemed reasonable to work this way. Looking back at what we did all those years ago, I am not surprised that applications often stopped but rather that any of them ran successfully at all!

Now let me fast forward to the practices of today. Attempting to develop, test and then maintain applications the same way we did all those decades ago is not only impossible but runs contrary to the always-on, always-connected 24 x 7 world we live in, as we remain tethered to our mobile devices plugging away at the latest app. Languages and development frameworks have changed. We don’t simply write code; we pull code from multiple sources and practically assemble a program that in turn is just a part of an application designed to address a specific business need.

Providing defect-free applications at a fair cost, particularly when these applications have to accommodate today’s multi-vendor and hybrid environments even as they have to be aware of the many regulatory and compliance mandates for each industry, needs something a whole lot more sophisticated than simple access to a system that can be set to single cycle! And I was reminded of this only a few days ago when I had a conversation with the folks at Paragon Application Systems. These are the folks who have developed the premier testing solution for the payments industry.

“It’s all about continuous integration, continuous delivery and yes, continuous testing,” I was told by Paragon CEO, Jim Perry. Integration, delivery and testing form a never-ending cycle, for the life of the program and application, performed in a seamless manner whereby the state of the program or application is always current and correct. “The growth of our global economy has created payment systems that have grown too intricate and change too quickly for any organization to risk deployments without frequent, comprehensive regression testing. No company can hire enough people to manually perform the testing necessary in the time available within a release cycle. Automation of the software build and delivery cycle, as well as test execution and verification is required.”
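To make that idea of continuous, automated testing a little more concrete, here is a minimal sketch in Python of the kind of regression check that might run on every build of a payments application. Everything in it is hypothetical and of my own invention for illustration: the authorize() stub, the message fields and the response codes are not Paragon's product, nor any real payments switch API.

```python
# Minimal sketch of an automated regression suite for a payments flow.
# Everything here is illustrative: authorize() is a stand-in for the
# system under test, and the message fields and response codes are
# hypothetical, not any real payments switch API.

def authorize(request: dict) -> dict:
    """Stand-in for the service being tested (e.g. an authorization engine)."""
    if request["amount"] <= 0:
        return {"code": "12", "reason": "invalid amount"}
    if request["amount"] > request["available_balance"]:
        return {"code": "51", "reason": "insufficient funds"}
    return {"code": "00", "reason": "approved"}

# Each case pairs an input message with the response code we expect,
# so the same suite can be replayed automatically on every build.
REGRESSION_CASES = [
    ({"amount": 25.00, "available_balance": 100.00}, "00"),
    ({"amount": 250.00, "available_balance": 100.00}, "51"),
    ({"amount": -5.00, "available_balance": 100.00}, "12"),
]

def run_regression() -> bool:
    """Run every case, report failures, and return True only if all pass."""
    failures = []
    for request, expected in REGRESSION_CASES:
        actual = authorize(request)["code"]
        if actual != expected:
            failures.append((request, expected, actual))
    for request, expected, actual in failures:
        print(f"FAIL {request}: expected {expected}, got {actual}")
    print(f"{len(REGRESSION_CASES) - len(failures)} of {len(REGRESSION_CASES)} cases passed")
    return not failures

if __name__ == "__main__":
    raise SystemExit(0 if run_regression() else 1)
```

Hooked into a build pipeline, a non-zero exit status from a suite like this would block the release, which is the essence of the continuous testing Perry describes.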

Manually perform testing? Grown too intricate? For the NonStop community there have always been concerns about the business logic bringing a NonStop system to a halt. And for good reason! Fault tolerant systems have been designed to keep processing even when facing single points of failure, but oftentimes poorly implemented and tested business logic can get in the way! Unfortunately, it’s about to get a whole lot worse, as testing not only has to ensure the application is defect free but also that the underlying platform, now being virtualized, is configured in a way that lets NonStop applications continue being NonStop.

We have virtualized networks and we have virtualized end points, and this has helped considerably with automating our test processes, but now the platform itself is being virtualized and this is a whole new ball game for many enterprise IT shops. And this makes the need to have something like Paragon on hand even more important – we stopped manually checking everything long ago and we cannot start again now. In the coming months, as we continue to look at the transformation to hybrid IT, to virtualization and to software-defined everything, I am planning on devoting more column inches to testing, as all too soon our inability to thoroughly test what we are turning on in production could bring many a data center crashing down.
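As a sketch of what testing the platform itself might involve, here is a small, entirely hypothetical Python example of a pre-deployment gate that refuses to promote a build unless the virtualized environment it lands on meets some basic fault-tolerance expectations. The field names and thresholds are my own inventions for illustration; a real check would query the actual virtualization or orchestration layer rather than a hand-filled description.

```python
# Hypothetical pre-deployment gate: before promoting a build, verify that a
# virtualized environment is configured the way a fault-tolerant workload
# expects. The fields and thresholds below are illustrative only.

from dataclasses import dataclass

@dataclass
class EnvironmentDescription:
    logical_cpus: int        # logical CPUs presented to the workload
    fault_zones: int         # independent hosts/racks backing those CPUs
    redundant_fabrics: bool  # is the interconnect duplicated?
    storage_mirrored: bool   # are the volumes mirrored?

def platform_problems(env: EnvironmentDescription) -> list:
    """Return human-readable reasons this environment fails the gate."""
    problems = []
    if env.logical_cpus < 2:
        problems.append("need at least two logical CPUs for takeover")
    if env.fault_zones < 2:
        problems.append("all CPUs share a single fault zone")
    if not env.redundant_fabrics:
        problems.append("interconnect fabric is not redundant")
    if not env.storage_mirrored:
        problems.append("storage volumes are not mirrored")
    return problems

if __name__ == "__main__":
    env = EnvironmentDescription(logical_cpus=4, fault_zones=1,
                                 redundant_fabrics=True, storage_mirrored=True)
    issues = platform_problems(env)
    for issue in issues:
        print("BLOCKED:", issue)
    raise SystemExit(1 if issues else 0)
```

The point is not the specific checks but that, once the platform is software defined, its configuration becomes something a pipeline can, and should, test automatically alongside the application.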

If as yet you haven’t looked at Paragon then you may want to visit the web site and download a couple of papers. I have to believe that for those of you in the NonStop community who are only vaguely familiar with how testing has changed, particularly when it comes to testing for payments solutions, it may very well be an opportunity to rethink just how comfortable we are with the processes we have in place today. And to wonder, too, how anything worked at all back in the days when it was all performed manually!

Tuesday, July 18, 2017

When things go horribly wrong …

How a few cents’ worth of wire lying unnoticed on the road can cripple a vehicle as large as an RV; we continue to value availability and it’s time to double down on the benefits of NonStop!

The most essential attribute of NonStop today is its fault tolerance. Availability is as highly valued as it has always been and yet there are many parties advocating that it really isn’t an issue any longer. Push apps and data into the cloud – public or private, it matters little at this point – and the infrastructure on offer from cloud providers ensures your apps, and indeed your data, are protected and available 24 x 7. But is this really the situation, and should CIOs contemplating a future for their IT centered on cloud computing assume they are immune to the many ways apps and data can be taken offline?

Unintended consequences! We read a lot about such outcomes these days and it is a further reflection on just how complex our interdependencies have become. Push a button over here and suddenly way over there, something just stops working. They weren’t even on the same network, or were they? Throw malware onto a Windows server looking after building infrastructure and suddenly, the data on a mainframe is compromised – who knew that they shared a common LAN? Ouch – but it happened as we all know oh so well.

For the past two months, Margo and I have been fulltime RVers. That is, we are without a permanent address and have been living out of our company command center. We have driven to numerous events, all of which have been covered in previous posts to this blog. Our travels have continued and this past week we headed down to Southern California to meet with a client, a trip that took us through Las Vegas. In the heat of summer in the deserts of Nevada we hit temps exceeding 110F. Overnighting at our regular RV site, we found a collection of fluids pooling underneath the RV and sheer panic set in. After all, this is our home; what had happened?

It turned out that, unknowingly, we had run over wire mesh that was completely invisible to the naked eye. But those strands of very thin wire managed to wrap themselves around the drive shaft of the RV where they became an efficient “weed whacker” – you know, those appliances we often see being used to trim hedges and lawn borders. In a matter of seconds our own drive shaft powered these thin wires such that the result was multiple shredded hydraulic lines and air hoses – who could have imagined that such innocent strands of wire could be so disruptive, or that they could completely cripple a 15-plus-ton coach in a matter of seconds. Yes, unintended consequences are everywhere and for the most part lie outside any of our plans and procedures, where detection of the event comes too late.

It is exactly the same with all platforms and infrastructure, on premises or in the cloud, or even hybrid combinations of both! If you don’t design for failure – even the most far-fetched – then you are destined for failure. It is as simple as that. In my time at Tandem Computers we often referred to an incident that led to Tandem systems always being side-vented and never top-vented. The reason for this was that, at an early demo of a NonStop system, coffee was accidentally spilt on top of the machine, effectively stopping the NonStop. Now, I am not sure of the authenticity of this event and would welcome anyone’s input as to the truth behind it, but it does illustrate the value of experience. Few designers would have caught on to the possibility that coffee might be spilt on a system the day it was being demoed, but for Tandem engineers, it led to changes that exist to this day.

Experience has led to more observations which in turn have generated more actions, and this is all part of the heritage of NonStop and, in many respects, part of the reason why there aren’t any competitors to NonStop today. You simply cannot imagine all of the unintended consequences and then document them in their entirety within the space of a two-page business plan. But design for them you must, and as I look at how the platforms and infrastructure being hawked by vendors selling cloud computing today depend solely on the value proposition that comes with redundancy (which is all they ever point to), my head hits the table along with a not-too-subtle sigh of disbelief. Redundancy plays a part, of course, but only one part in negating potential outages; availability needs so much more. And at what cost?

The whole argument for cloud computing today revolves around greatly reduced IT costs – there is an elasticity of provisioning unlike anything we have experienced before and, more importantly, given the virtualization that is happening behind the scenes, we can run many more clients on a cloud than was ever conceived as possible back when service bureaus and time-sharing options were being promoted to CIOs as the answer to keeping costs under control. With the greatly reduced costs came the equally important consideration of greatly reduced staff. And this is where the issue of unintended consequences really shows its face. Experience? Observations? Even plans and procedures? Who will be taking responsibility for ensuring the resultant implementations are fully prepared to accommodate elements that fail?

There is a very good reason why pilots run through checklists prior to takeoff, landing, changes of altitude, etc. Any time an action is to be taken there are procedures that must be followed. When I turn on the ignition of the RV, there is a checklist that appears on the digital display, and for the same reason as pilots have checklists – too many bad things can happen if you miss something, and I have managed to inflict considerable damage on our RV through the years when I forgot to follow all the items on the checklist. And there are best practices in place today at every data center that have been developed over time based, yet again, on experience – so when next we talk about availability as we head to clouds, who is preparing the next generation of checklists?

It is pleasing to me to see the efforts that OmniPayments is putting into providing cloud computing based on NonStop. For the moment it is solely providing payments solutions to select financial institutions, but even now the number of clients opting to run OmniPayments on a SaaS basis, rather than investing in platforms and infrastructure themselves, sends a very powerful message to the community. Don’t discount the value of NonStop that has been demonstrated through the ages – get to virtualized NonStop (vNS) as quickly as you can and go champion within your enterprise that yes, you now have the best possible solution, one that can survive even the strangest of unintended consequences. It’s just what NonStop was designed to do and it keeps on doing it.

You run on NonStop X, so you will run on vNS. There is much that can go wrong with traditional physical systems just as there is much that can go wrong with clouds. Simply going for more clouds and leaving it to redundant banks of servers isn’t the safety net any enterprise should rely upon, so take it to the next level. Let everyone you know hear how NonStop is taking its most prized attribute, availability, high and wide into the clouds! After all, these clouds are every bit as vulnerable to failure as any primitive hardware built in the past, and NonStop knows failures when it encounters them and just doesn’t stop!

Sunday, July 9, 2017

Growth is not optional; it is a must!

NonStop keeps on going no matter what system failures may arise – but is this enough? What follows here is purely speculative on my part but is worth raising with the NonStop community. And yes, any and every comment is more than welcome …

Travelling around Colorado these past few weeks, it’s so clear just how much growth has occurred. Lakes and reservoirs are full to overflowing – more than one state park we have visited had pathways closed due to local flooding – grasslands are standing tall and trees and bushes are a brilliant green everywhere you turn. Spring rains have continued into the summer, with afternoons subject to intense thunderstorms most days. I can recall that in the past such storms would form at this time of the year but rarely did the rain reach the ground; this year, though, there have been more late-afternoon downpours than I can remember.

Living in a motor coach makes us a little susceptible to inclement weather but so far we haven’t suffered anything more than a fright from an unexpected thunderclap. The rainfall that continues well into summer is something we are pleased to see, of course, and the growth these rains have helped produce has turned the Colorado Front Range greener than I have seen for a very long time. It may all be problematic later in summer if it dries out, as we have seen more than our fair share of wildfires toward summer’s end, but until then this extended period of growth does a lot of good for the state. Any reader who has also seen photos posted to my Facebook and Twitter accounts may have already seen what I am talking about but, just as a reminder, I have included one of the photos above.

For the past week I have been working with vendors on the next issue of NonStop Insider, which should appear later this week. What has really struck me is the number of references to growth. Where will it come from? Does the business value proposition of NonStop remain as strong as it once was, or will NonStop struggle to sustain double-digit growth year over year? The theme of this issue of NonStop Insider was transformation – you will see numerous references to transformation in the articles that were submitted – but does transformation lead to more sales? Questions like these have come up more than just a couple of times this week, and they made me rethink some of the answers I had previously provided to my clients when asked this question.

The business value proposition is as real today as it ever has been – it’s all about availability after all. Out-of-the-box, supported by middleware and utilities that are all part of an integrated stack, from the metal to the user interface! From the perspective of any user developing an application, there is always concern about what will happen if something breaks and knowing that your application will continue to function even as all around it may fail is not something that can be lightly discounted. It’s really a very valuable attribute with an almost “holy grail” consideration about it – just talk to those now building their very first application and watch their reaction when you say you work with a platform that survives failure and just keeps on running. Like the famous “Energizer Bunny!”

However, for most of us, we have heard all this before. We know the value of NonStop, but it’s a strange development environment with legacy tools and some very strange ways of doing things – what’s this about checkpointing? What’s this about redundant storage? Isn’t it all very expensive, and don’t you have processors that simply don’t do anything until they are needed? Recently, I have heard just about everything being addressed except for the most important aspect of all – out-of-the-box, it just works! No, you don’t write NonStop programs; you simply let NonStop run the programs you write. You have a rich choice of languages and development environments – NonStop supports it all, but with the addition of fault tolerance. It not only just works, it keeps on working. The Energizer Bunny will eventually stop – its battery will run down. It may last a lot longer than other batteries, but as a power source it will eventually fail. Not so with NonStop!
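For readers who have never met checkpointing, here is a toy sketch in Python of the general idea behind a process pair: a primary does the work and ships its state to a backup after each unit of work, so the backup can resume from the last checkpoint if the primary dies. This is purely a conceptual illustration of my own; it neither uses nor resembles NonStop's actual process-pair interfaces.

```python
# Toy illustration of the process-pair / checkpoint idea. The primary
# checkpoints its state to a backup after each unit of work; if the primary
# fails, the backup resumes from the last checkpoint. Conceptual only, and
# not how NonStop actually implements process pairs.

import copy

class Backup:
    def __init__(self):
        self.state = {"processed": 0, "total": 0.0}

    def checkpoint(self, state: dict) -> None:
        # Keep a private copy of the latest consistent state.
        self.state = copy.deepcopy(state)

    def take_over(self) -> None:
        print(f"backup taking over at {self.state}")

def run(transactions, backup: Backup, fail_after=None) -> dict:
    """Process transactions, checkpointing after each one; optionally 'fail'."""
    state = copy.deepcopy(backup.state)          # resume from last checkpoint
    for count, amount in enumerate(transactions[state["processed"]:], start=1):
        state["processed"] += 1
        state["total"] = round(state["total"] + amount, 2)
        backup.checkpoint(state)                 # ship state to the backup
        if fail_after is not None and count >= fail_after:
            raise RuntimeError("primary failed")
    return state

if __name__ == "__main__":
    work = [12.50, 3.99, 40.00, 7.25, 19.95, 60.10]
    backup = Backup()
    try:
        run(work, backup, fail_after=3)          # primary dies partway through
    except RuntimeError:
        backup.take_over()
        final = run(work, backup)                # resumes at transaction four
        print(f"completed: {final}")
```

Scale that idea down to a library, or up to an operating system, and you get a sense of why the application programmer doesn't have to write "NonStop programs" to get the benefit.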

So, yes, we have the susceptibility to failure covered. But growth? To paraphrase the Apollo space mission, for NonStop growth is not optional. In some respects we have to be very thankful that HPE has given NonStop every chance to build a larger population of users. There has never been serious consideration given to discontinuing the NonStop program, despite what rumors you may have heard – there are just too many blue-chip customers for HPE to turn them out onto the streets. As witnessed last year at HPE Discover, from the CEO on down there is a strong appreciation for the value proposition NonStop brings for even the most fastidious of users. However, today’s HPE looks nothing like the company that existed just a few short years ago. Now HPE is looking to all of its products to start producing the type of growth any new company demands.

But here’s the rub; there is opportunity for growth with NonStop for sure, but not likely in its present form. Surprised? Well, you shouldn’t be. It’s been coming for a very long time – NonStop is going to wash over every product, and every HPE system will contain some elements of NonStop as HPE looks to differentiate itself based on availability. A stretch? Dreaming? Perhaps this is taking it a little too far – but then again, is it? Imagine for a moment that every distribution of software HPE builds has a little NonStop flowing through it, so that applications running on HPE just keep on running. Would that of itself be the source of future growth for NonStop?

Stepping back for a moment, you will find nothing of this in any NonStop roadmap presentation. For now, the NonStop development team has so much on its plate and, as fast as it is moving, there is still so much more to do. However, with the judicious placement of a couple of knowledgeable developers within other projects, this could all change in a heartbeat. Yes, NonStop still contains a certain amount of special sauce, but it is NonStop’s special sauce and it is NonStop development that has the recipe. Let a couple of chefs loose in other kitchens and stand back – NonStop is no longer just a product but a philosophy, and that’s not diluting the business value proposition; on the contrary, it certainly would create growth.

You just have to look at NonStop in an entirely different light. It’s not best practices, although best practices have always been a factor in having NonStop applications be as available as they are. Furthermore, it’s not rocket science, as much as there are those who think you need a team of specialists to keep NonStop running non-stop. This fear of a graying population of lab-coat-wearing engineers is just way overblown. Our graying population is retiring but, guess what, there is a developing talent pool of much younger folks whom I am not prepared to discount or to suggest won’t cut it!

Earlier I used the phrase “NonStop is going to wash over every product” and it wasn’t by accident, as this phrase too came up in discussions this week. Think of the incoming tide pushing further up the beach and spilling onto rock formations until the tide eventually covers everything. This is exactly one vision I have of NonStop and, while I may be the only one predicting such a possibility, HPE has everything to gain in letting the NonStop tide roll in – indeed, let’s go one big step further. Let’s make NonStop open source! Let’s integrate NonStop with OpenStack. Let’s shake it all up – and let’s just see who follows NonStop. I know that this is highly problematic as well, but why not?

Enterprises will still want a managed software distribution as they continue to abhor the current model of fixes and patches arriving by the hour. Stability and predictability – a new release every summer is something they can handle, but not every hour. So, NonStop becomes a special distribution of OpenStack built to meet these requirements of enterprise IT execs. Think SUSE, Red Hat, even Debian – supported distributions are important and have found markets. Put this down as another potential benefit that NonStop brings to the party – availability, scalability and, yes, predictability!

In today’s transforming world of IT, there is no such thing as staying within the lines and keeping inside the box. It’s a cliché but it’s also very true – to succeed, think differently. While much of what I have written above will probably not come to pass, and it’s a stretch to ever think HPE would make NonStop open source, in order to grow and become the best software platform on the planet HPE has to think about doing the unexpected! The dramatic! And I think it can do just that, and it may be coming very soon. Move over, Energizer Bunny; not only will NonStop keep on going, it will do so long after your bunny’s battery has died!