Monday, February 29, 2016

Files, logs, and backups … just keep it all rolling!

It was a wild ride into New Orleans but we had to continue driving – there wasn’t any alternative as commitments had been made. How often do we make plans even today without fully exploring back-up options?

I have covered a lot of distance over the past couple of weeks and, with February already drawing to a close, these trips have included visits to Las Vegas and New Orleans. What we observed and how we handled the travel have been covered in previous posts, but it’s fair to say both towns have a couple of things in common. They are entertainment capitals, certainly, but they are also living on the edge – gambling is in evidence everywhere you turn.

A history steeped in riverboat gamblers is hard to hide in New Orleans, and the crime bosses’ inventiveness still seems to permeate the streets of Las Vegas; yet there’s a constant stream of visitors to both locations who seem focused on doing nothing other than gambling. With risk come rewards, but only for a select few, and even as we forked over a few dollars there was never any real expectation of winning big at the tables.

Recently I have spent a lot of time digesting much of what constitutes The Machine. While the implications for the NonStop community are still a fair way off, its influence can already be seen. Just the acknowledgment that NonStop is a software offering, and that in the labs NonStop is running in a virtual machine, gives credence to the fact that moves are being made to give enterprises the opportunity to extend the timeline for NonStop well into the future – a future we all know is beginning to rotate around The Machine. The catchphrase was hard to miss for anyone who attended 2014 HP Discover in Las Vegas, where Martin Fink, EVP and HPE CTO, declared, “all you need to know is that ‘electrons compute, photons communicate, and ions store.’”

Paging through the HPE web site, I came across the blog Behind the Scenes, HPE Labs, and in it a post published shortly after the announcement of The Machine, The Machine – HP Labs launches a bold new research initiative to transform the future of computing. A couple of sentences from that post caught my attention and, without reposting too much, here’s what I liked: “The Machine is a multi-year, multi-faceted program to fundamentally redesign computing to handle the enormous data flows of the future. It aims to reinvent computer architecture from the ground up, enabling a quantum leap in performance and efficiency while lowering costs and improving security.”

This was followed by quotes from Fink’s announcement, “In our photonics research, we’re using light to connect hundreds of racks in a low-latency, 3D fabric. And our work in Memristors points to the development of universal memory – memory that collapses the memory/storage hierarchy by fusing the two functions in one hyper-efficient package.”

The inclusion of Memristors in the announcement of The Machine shouldn’t have come as a surprise in hindsight, as HPE had been working with Memristors for some time albeit without too much success. But the promise of Memristors nevertheless is tantalizing if for no other reason than it changes the game when it comes to memory/storage. In short, there’s no hierarchy any more – all memory is the same and indeed there’s only one type of storage. Memory, as supported by Memristors. And the memory expands almost to infinity – you can now access almost unlimited amounts of memory without any of the latency overhead associated with lower layers of memory in the former storage hierarchy.

As much as The Machine is a game changer and a complete break from computer architectures of the past, should HPE pull off this enormous financial gamble it will likely start appearing in data centers as soon as late 2019, with partial implementations in the form of Machine-ready operating systems and even processes. Moonshot, anyone? Recall how Fink said, “With Moonshot we’re creating system-on-a-chip packages that combine processors, memory, and connectivity,” so will some of The Machine make its first appearance as options for Moonshot servers? For the NonStop community, which typically takes a long time to make changes to its systems, 2019 isn’t all that far away, so it’s the right time to ask questions about how it all will work for them.

How does this affect our understanding of computing? What about files and databases? What about the movement of files and indeed, moving files offsite? Surely, with The Machine there’s no expectation that we will simply rely on one instance of The Machine – good business governance surely dictates we have a second, and indeed potentially more than two sites running The Machine. Furthermore, if you look at the impact Big Data and Data Analytics are already having on transaction processing systems, including NonStop systems, we are still going to be pulling data and files in from other locations, many of which will not be residing on The Machine. Finally, regulatory authorities are still going to need the files even as other institutions mandate key business files are stored off site for many years.

One example of The Machine’s impact is the consideration given to simply replicating to a second instance of The Machine. Modern replication products, following the Change Data Capture (CDC) model, depend on reading database log files – certainly it will run fast, but will the log files really be in memory? Would a log file created solely in memory meet all of our log-file requirements? And can we elect to connect The Machine to other storage offerings – does The Machine even support such connectivity? With the powerful transformation-to-hybrid-infrastructure messages coming from HPE we can expect some relief on this front, so hopefully the commercialization of The Machine takes this into account.

And what of moving data between The Machine and other systems performing Data Analytics – in theory, columnar databases that we associate with Big Data should have no problem utilizing the massive amounts of memory on The Machine but does all that extraneous and oftentimes unnecessary data need to be on The Machine?

“Rest assured, in the foreseeable future, it’s a safe bet that pulling select information from a volatile database is best handled by looking at the log files created at the time updates are made to the database. CDC has been a process well understood for a long time and has been the vehicle vendors, in the file and database replication business, have relied upon for years,” said Sami Akbay, Cofounder at WebAction, Inc.

“In recent times, this same process is being used not just to replicate files and databases, but as a valuable source to feed data stream analytics processes, something Striim is now doing,” said Akbay. “When you think of all that’s involved in turning petabytes into something meaningful and of use to online transaction processing, then connecting with the source via CDC is the only viable way to view data as it is being born. But will we be looking to do this to logs residing in memory on The Machine? I guess we will just have to wait and see, but I am anticipating that finding the data we need will be off platform for reasons apart from speed and reduced costs.”
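
The CDC model Akbay describes can be illustrated with a minimal sketch: a replicator tails the database’s transaction log and turns each committed record into a change event that a downstream consumer (a replica, or a stream-analytics feed) applies. Everything here – the record layout, field names, and toy log – is hypothetical and only meant to show the idea of reading data “as it is being born.”

```python
import json

def parse_log_line(line):
    """Parse one log record; in this sketch each record is a JSON object
    like {"lsn": 2, "op": "UPDATE", "table": "accounts", "row": {...}}."""
    return json.loads(line)

def capture_changes(log_lines, last_lsn=0):
    """Yield change events newer than last_lsn, in commit order,
    without ever querying the (volatile) database itself."""
    for line in log_lines:
        record = parse_log_line(line)
        if record["lsn"] > last_lsn:
            yield record

# A toy in-memory "log" standing in for a real database log file.
log = [
    '{"lsn": 1, "op": "INSERT", "table": "accounts", "row": {"id": 1, "bal": 100}}',
    '{"lsn": 2, "op": "UPDATE", "table": "accounts", "row": {"id": 1, "bal": 90}}',
]

replica = {}
for event in capture_changes(log, last_lsn=0):
    # Apply each event to a downstream replica keyed by row id.
    replica[event["row"]["id"]] = event["row"]

print(replica)  # {1: {'id': 1, 'bal': 90}}
```

The open question the post raises survives the sketch unchanged: if those log records live only in The Machine’s memory, where does the replicator read them from?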

However, Memristors still represent a gamble on the part of HPE, and early iterations of The Machine may be introduced without them. As the author of the blog post already referenced noted, The Machine represents “a multi-year, multi-faceted program” where a complete system will likely take time to be introduced – certainly, a version of The Machine that is itself a hybrid cannot be ruled out. There are many in the NonStop community who will watch patiently from the sidelines for a while, and I am expecting few within the NonStop community to be early adopters.

While the full capabilities of a complete Machine may still be a pipe dream for many, as Shawn Sabanayagam, Chairman and CEO, Tributary Systems, Inc., suggested recently, HPE faces challenges when it comes to presenting options for logs, files and databases and the best approach for taking backups. “Memory, unlimited or not, is volatile; barring memory infinite in size and non-volatile (still very hard to envision), off-system storage and backup will always have a need and a place,” Shawn said. “One cannot archive data in memory – not practical and not possible.” And Shawn could have as easily added, it doesn’t even make sense.

For instance, explained Shawn, “Technology and methodology may change on how backup storage is created and managed, but backup storage will not go away.” Among the reasons he listed for its continued presence in the data center are the many issues with an all-memory approach: “Volatility, security, longevity, susceptibility to corruption, leakage and data loss.” And this was just for starters!

“What we really anticipate with the arrival of The Machine is just one more phase, or step, needing to be addressed by products, including our Storage Director. We have passed from just moving files to tape; then disk to tape; then disk to multiple targets (all policy-based, by pools of data); then flash to disk, and flash to tape to cloud; and now, going forward, to a full virtual instance (as in Amazon Elastic Compute Cloud – EC2, a web service) running in the cloud with a software-defined data center model. So the NonStop community can rest assured that this is all just a continuation of the need to satisfy user requirements in a manner best suited to the technology of the day.”

Gambling has been with us through the ages and irrespective of your take on the nature of gambling and whether it’s what we associate with technology corporations, the element of risk definitely applies. Looking out of our hotel window at the mighty Mississippi rolling by, as it has for eons, I couldn't help but observe the timelessness and persistence of nature and contrast it with the often dramatic changes that take place across the IT landscape.  

HPE has placed a lot of chips on The Machine, even if it isn’t the only square on the table. It’s also a solution that will be introduced over time, with HPE taking incremental baby steps along the way, and where subtle shifts in the priorities of NonStop projects will see NonStop contributing to The Machine.

When it comes to configurations of The Machine capable of replacing current NonStop systems, there will be options, of course, but will they be all that different from what we have today? If I truly were a betting man I would have to say: the more things change, the more NonStop may simply look the same!
