The market just crashed! Get CAST on the phone!

A few weeks ago, we wrote a blog post about the Securities and Exchange Commission’s Regulation SCI, aimed at more closely regulating the IT systems running the majority of the US stock exchanges and injecting investor confidence back into the market.
As mentioned in that previous post, we believe the proposed regulation does not go far enough to ensure that the integrity and structural quality of these trading platforms are properly managed — and the instability of these systems has been all too evident over the past week or so.
As leaders in software quality, the experts at CAST were called on by a number of leading news outlets for our comments on what we can learn from these outages, and how they could have been prevented.

Foretelling Facebook’s IPO Failure

I’m not one who believes in fortune tellers or those who claim to be able to predict the future. Heck, I don’t even read my horoscope and cringe whenever someone attempts to force it upon me. Only when my wife has attempted to read me my horoscope have I offered even as much as a polite “hmm.” Nevertheless there are many out there who swear by those who claim to be able to predict the future, especially in the financial industry.
And while there were those who predicted a rocky road for Facebook’s IPO, it is doubtful that anybody could have foreseen NASDAQ’s technical meltdown surrounding the Facebook IPO. While the stock price predictions for Facebook may be coming true, surely the technical issues that NASDAQ experienced on Facebook’s IPO day could not have been predicted…or could they?
Not in the Cards
As Scott Sellers points out over on Computerworld, it seemed like NASDAQ understood the kind of volume it would be facing and had taken the necessary precautions. He notes, “Exchange officials claimed that they had tested for all circumstances and for thousands of hours. I believe them.”
I believe them, too, but like we’ve said here many times before, and it’s a point to which Sellers alludes in his post, testing isn’t enough. As Sellers puts it, there needs to be “a resilient underlying infrastructure.” Functionality does not always mean structural quality, yet functionality is all that is needed to ensure that applications pass muster when tested. The functionality issues that might be found in an application are merely the tip of the proverbial iceberg that can potentially sink an application after it sails.
This is what, in all likelihood, happened to the NASDAQ on Facebook IPO day and will probably happen again. Why? Because application failures have happened before on numerous occasions and yet NASDAQ did not take heed from those who had gone (down) before them. Last year alone the London Stock Exchange, Euronext, Borsa Italiana (bought by the LSE in 2007) and the Australian Stock Exchange all suffered outages due to technical flaws.
Obviously there’s a lot more to keeping an exchange running than functional testing can detect, and on this point Sellers adeptly points to the CRASH study on application software health released in December. He notes that:
Exchanges are complex, high-performance systems that can be difficult to build, upgrade and debug. According to CAST Software, “an average mission critical application has just under 400,000 lines of code, 5,000 components, 1000 database tables and just under 1000 stored procedures.”
He later adds that, “Having a robust – and well-reviewed architecture nearly always results in a clear competitive advantage.”
Applying the Crystal Ball
Truth is, software failures like the ones experienced by NASDAQ and the other exchanges have become all too commonplace in all industries. Unless it affects a company’s finances directly – as the NASDAQ failure may have done by holding up trading of the Facebook IPO – we treat news of software failures as though they were inevitable and almost expected. In NASDAQ’s case, however, there are now calls for investigations and answers about what happened.
In my book, that’s a good thing. After all, when exactly did we decide that software failure was an unavoidable part of business and an acceptable excuse to leave us hanging and waiting?
NASDAQ, the London Stock Exchange, Euronext…in fact, all exchanges and financial companies need to do a better job of assessing the structural quality of software before it is deployed rather than merely depending on functional or load testing once it is deployment-ready. There’s no crystal ball needed here, just automated analysis and measurement, which is now readily available in the marketplace on a SaaS basis. Not doing structural analysis throughout the build is like waiting for an application to fall on its face, and fall it will…faster than the share price of Facebook stock.

Load Testing Fails Facebook IPO for NASDAQ

Are you load testing? Great. Are you counting on load testing to protect your organization from a catastrophic system failure? Well, that’s not so great.
I bet if you surveyed enterprise application development teams around the world and asked them how they would ensure the elasticity and scalability of an enterprise application, they’d answer (in unison, I’d imagine): load testing. Do it early and often.
Well, that was the world view at NASDAQ before the Facebook (NASDAQ: FB) IPO — perhaps the highest profile IPO of the past five years. And we saw how well that worked out for them. NASDAQ has reserved some $40 million to repay its clients after it mishandled Facebook’s market debut. But it doesn’t look like it’ll end there. Investors and financial firms are claiming that NASDAQ’s mistake cost them upwards of $500 million. And with impending legal action on the way, who knows what the final tally will be.
If ever there was a case for the religious faithful (like me) of application quality assessment to evangelize that failure to test your entire system holistically has a high potential cost, here you go. NASDAQ has hard numbers on the table.
NASDAQ readily admitted to extensive load testing before the iconic Facebook IPO, and it was confident, based on load test results, that its system would hold up. It was wrong, because its test results were wrong. And the final result was an epic failure that will go down in IPO history (not to mention the ledger books of a few ticked-off investors).
Now, I’m not saying that you shouldn’t be load testing your applications. You should. But load testing happens at the later stages of your application development process, and before significant events you anticipate will cause a linear or even exponential increase in capacity demands on your website or application.
However, load testing gets hung up on its prescriptive approach to user patterns. You create tests that assume prescribed scenarios the application will have to handle. Under this umbrella of testing, the unanticipated is never anticipated. You’re never testing the entire system; you are always testing a sequence of events.
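The prescriptive nature of load testing is easy to see in code. Below is a minimal, hypothetical sketch in Python (the order handler, order counts and worker counts are all invented for illustration): the test replays exactly one scripted traffic pattern, so a pattern nobody scripted (say, a flood of order cancellations) is never exercised.

```python
import concurrent.futures
import time

def handle_order(order_id):
    """Hypothetical stand-in for an exchange's order handler."""
    time.sleep(0.001)  # simulate a fixed per-order processing cost
    return f"ack:{order_id}"

def load_test(n_orders, n_workers):
    """Replay one prescribed scenario: n_orders submitted by n_workers at once."""
    start = time.perf_counter()
    with concurrent.futures.ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(handle_order, range(n_orders)))
    return results, time.perf_counter() - start

# The one scenario we happened to script; nothing outside it ever gets tested.
results, elapsed = load_test(n_orders=500, n_workers=50)
print(f"{len(results)} orders acknowledged in {elapsed:.2f}s")
```

A real suite would script many such scenarios, but each one is still a sequence of anticipated events; structural analysis is what catches the failure modes nobody thought to script.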
But there is a way to test for the unexpected, and that’s (wait for it) application quality assessment and performance measurement. If NASDAQ had analyzed its entire system holistically in addition to traditional load testing, it would have (1) dramatically reduced the number of scenarios it needed to load test, and (2) figured out which type of user interaction could have broken the application when it was deployed.
Application quality assessment goes beyond the simple virtualization of large user loads. It drills into an application, down to the source code, to look for its weak points — usually where the data is flying about — and measures its adherence to architectural and coding standards. You wouldn’t test the structural integrity of a building by jamming it with the maximum number of occupants; you’d assess its foundations and its engineering.
NASDAQ could have drilled down into the source code of its system and identified and eliminated dangerous defects early on. This would have led to a resilient application that wouldn’t bomb when the IPO went live, and would have saved the company the $40 to $500 million we’re estimating it’s exposed to now for its defective application.
At the end of the day, quality assessment can succeed where load testing can, and has, failed. Had NASDAQ considered software quality analysis before Facebook went public, there’s a good chance it would still have $40 million burning a hole in its pocket. However, our friends at NASDAQ load tested how they thought users would be accessing their systems, then sat back in anticipation of arguably the most anticipated IPO in recent history. Little did they know they would be the ones making headlines the next day.

Did NASDAQ’s App Glitch Cause FB’s IPO Hitch?

Isn’t it ironic?
Facebook, the galactically popular social networking site that for so long has weathered friction regarding weaknesses in its software – particularly around security and privacy issues – may have seen its own IPO effort submarined by a software glitch in the NASDAQ stock exchange.
In reporting on NASDAQ’s response to the technical difficulties it encountered on Facebook’s IPO day, Bloomberg’s Nina Mehta writes:
Computer systems used to establish the opening price were overwhelmed by order cancellations and updates during the “biggest IPO cross in the history of mankind,” Nasdaq Chief Executive Officer Robert Greifeld, 54, said yesterday in a conference call with reporters. Nasdaq’s systems fell into a ‘loop’ that kept the second-largest U.S. stock venue operator from opening the shares on time following the $16 billion deal.
According to Mehta, the reason Greifeld gave for the issues with the IPO was “poor design in the software it uses for driving auctions in initial public offerings.”
One would think that if any exchange out there were to be free of poorly designed software, it would be the tech-heavy NASDAQ. Apparently listing the top tech companies in the industry, though, does not necessarily mean you run the best software the tech industry has to offer.
Profile of a Failure
Truth is, software failures like the one experienced by NASDAQ have become quite commonplace lately; so much so that they’re practically met with a shoulder shrug and an “oh well.” We treat news of software failures as though they were inevitable and almost expected. Only when it affects finance – particularly the financial status of a marquee brand name like Facebook – do we step back and even offer so much as a “tsk, tsk, tsk” for the failure.

But why? When exactly did we decide that software failure was an unavoidable part of business, an acceptable excuse for possibly undermining the value of a highly touted IPO?
Facebook reached a high of $45 per share before it dropped back below its initial offering price of $38 per share. Whether the glitches at NASDAQ caused the poor performance or whether you agree with Henry Blodget at Business Insider that they were just a convenient excuse for a poor showing, there is still no excuse for application software failure, especially since we know what causes it:

Business Blindspot: Regardless of the industry, most developers are not experts in their particular domain when they begin working for a company. It takes time to learn about the business, but most of the learning, unfortunately, comes only by correcting mistakes after the software has malfunctioned.
Inexperience with Technology: Mission critical business applications are a complex array of multiple computer languages and software platforms. Rather than being built on a single platform or in a single language, they tend to be mash-ups of platforms, interfaces, business logic and data management that interact through middleware with enterprise resource systems and legacy applications. Additionally, in the case of some long-standing systems, developers often find themselves programming on top of archaic languages. It is rare to find a developer who knows “everything” when it comes to programming languages, and those knowledge gaps can lead to assumptions that result in software errors, which in turn lead to system outages, data corruption and security breaches.
Speed Kills: The pace of business over the past decade has increased exponentially. Things move so fast that software is practically obsolete by the time it’s installed. The break-neck speeds at which developers are asked to ply their craft often means software quality becomes a sacrificial lamb.
Old Code Complexities: A significant majority of software development builds upon existing code. Studies show that developers spend half their time or more trying to figure out what the “old code” did and how it can be modified for use in the current project. The more complex the code, the more time spent trying to unravel it…or not. In the interest of time (see “Speed Kills” above), complexity can also lead to “workarounds,” leaving a high potential for mistakes.
Buyer Beware: Mergers and acquisitions are a fact of life in today’s business climate and most large applications from large “acquirers” are built using code from acquired companies. Unfortunately, the acquiring organization can’t control the quality of the software they are receiving and poor structural quality is not immediately visible to them.

A Comment on Facebook’s Status
NASDAQ may need to pay back $13 million to investors who should have received transaction executions but did not because of its software failures. Meanwhile, brokers around the world may lose $100 million repaying investors for mishandled orders. A quick, pre-deployment application of automated analysis and measurement to diagnose the structural quality and health issues within the application software used by NASDAQ or any company would have been a much better investment of time and money.
I guess this is one more reason to lobby for a “DISLIKE” button.

Hackers Aren’t Playing Around

The two Sony PlayStation security breaches that affected more than 100 million account-holders over the past couple of weeks (77 million in the first, with another 26 million last week) and exposed their personal information to hackers are just the latest example of how software code vulnerabilities can lead to the failure of mission-critical applications.
But what is being done about it? The New York Times’ Nick Bilton suggests that people are just expecting the Feds to step in and regulate things and that even Congress thinks this is where things are heading. He quotes Connecticut Sen. Richard Blumenthal as saying, “There needs to be new legislation and new laws need to be adopted [to protect the public]. Companies need to be held accountable and need to pay significantly when private and confidential information is imperiled.”
To Bilton’s credit, his reaction to this statement is, “But how?” He also goes on to note, “Technology also has a way of advancing far ahead of the law.” So far ahead, in fact, that he relates a story told by privacy and copyright attorney Christina Gagnier of a case heard before the U.S. Supreme Court last year in which Chief Justice John G. Roberts, Jr., the highest judicial officer in the country, “asked how text messaging works. If two messages are sent simultaneously, he asked, does one get a ‘busy signal’?”
So obviously the Feds aren’t up to the challenge…but who is?
Under the Microscope
Whether it’s the Sony Playstation hack or the system outages at financial organizations over the last few months, including the ones at the London Stock Exchange and NASDAQ, it seems many of these breaches begin with some point of vulnerability within the software code. Some of these vulnerabilities exist in newly created code while others extend from existing code on top of which newer applications are built.
While companies should be doing more during the build process to locate areas of potential risk, most do little or nothing.
Studies have shown that 0.025% (that’s one-fourth of one-tenth of a percent) of the lines of code in an average enterprise application contain vulnerabilities. A subset that minuscule doesn’t seem worth worrying about. Trying to find 0.025% is worse odds than trying to find a needle in a haystack. But when you consider that the average business application contains over 400,000 lines of code, that still leaves roughly ONE HUNDRED points of infiltration for potential hackers!
Still, there’s no way a company could find the right 100 lines of code by hand and you can’t fix what you can’t find.
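The back-of-the-envelope math is simple enough to check:

```python
lines_of_code = 400_000   # average mission-critical application, per the CRASH study
defect_rate = 0.00025     # 0.025% of lines contain a vulnerability

vulnerable_lines = int(lines_of_code * defect_rate)
print(vulnerable_lines)   # → 100
```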
Security is a Kind of Death
In his 1947 essay, “The Catastrophe of Success,” Tennessee Williams writes, “Security is a kind of death.” This is true for application software.
As applications become more sophisticated, there is no way you can stop hacks with traditional security software. If points of vulnerability within the structure are not addressed during the build process, even the best security system will only tell you when someone or something has breached your structure; it won’t keep them out. The problem must be solved by examining the code before the application is deployed.
Automated software analysis provides the means to see the whole application and go beyond one developer’s view of things like input validation (the lack of which provides an easy entry point for a hacker) or any business transaction that might fail on its own. Automated measurement of that analysis gives management the means to track, incentivize and ensure that security, stability and efficiency traps are not introduced, either inadvertently or maliciously, into the enterprise software. In this way, if you can see the potential threat, you can eliminate it before it becomes a future security problem.
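By way of illustration only (a commercial analysis platform applies thousands of rules across languages and architectural layers), a single rule-based check of this kind can be sketched in a few lines of Python. The rule, the regular expression and the sample code lines below are all hypothetical:

```python
import re

# Toy rule: flag SQL statements built by string concatenation with user input,
# a classic missing-input-validation pattern that invites SQL injection.
SQL_CONCAT = re.compile(r'(execute|query)\s*\(\s*["\'].*["\']\s*[+%]', re.IGNORECASE)

def scan(source_lines):
    """Return (line_number, line) pairs that violate the toy rule."""
    return [(i, line.strip())
            for i, line in enumerate(source_lines, start=1)
            if SQL_CONCAT.search(line)]

sample = [
    'cursor.execute("SELECT * FROM users WHERE id = " + request.args["id"])',
    'cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))',  # parameterized: OK
]
for lineno, line in scan(sample):
    print(f"line {lineno}: possible unvalidated input -> {line}")
```

The point of the sketch is the principle, not the regex: an automated pass can inspect every line a human reviewer would never get to, and flag the handful worth a closer look.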
The ultimate baseline for security, therefore, should be assessing the structural quality of the application software before it is deployed to find and then fix potential breach points. If companies do not take it upon themselves to do this, their application software will continue to be a playground for hackers.