About Jayesh Golatkar

I am a Product Manager at CAST and passionate about software systems. I have spent most of my career designing, developing, and implementing different types of software systems. I like to experiment with new ideas, technologies, and tools as they are introduced to the market. I am certified as a Scrum Master, Lean Six Sigma Expert, and Project Manager.

Software Risk Driven Application Development

Understanding Software Risks Created by Poor Application Development and Release Practices
While the decisions that drive software project managers, development teams, and their leadership are often made in the best interest of the company, these teams sometimes fail to recognize the software risks those decisions or behaviors introduce to the business. A review of the latest software risks affecting businesses illustrates that development organizations are notoriously poor at managing software development processes such as releases and evolutions.

Emerging Trends and Software Quality Assurance

The future challenges for Software Quality Assurance (SQA) follow a few software trends, including:

Complex and large software packages
Integration with external components and interfaces
The need to deliver quickly
The need to deliver bug-free software

The standard software quality activities defined by IEEE, such as verification and validation, are integrated into the software development cycle. We see dedicated SQA roles and resources in major organizations. Also, many multi-national companies are pushing to have a central team drive and manage the quality processes, methodologies, and tools across all their sites and teams.

6 Root Causes for Software Security Failures and How to Fix Them

Whether you are moving from an on-premises platform to a mobile device or a virtual cloud environment, security has always been the biggest concern. It is no longer shocking to hear about big banks, financial institutions, and large organizations shutting down their business or coming to a standstill due to an unexpected system crash, a security breach, or a virus attack.
Security outages are observed on all platforms, and it is becoming more and more challenging to detect and prevent malicious intruders from getting into our complex multi-tier systems.

How to: tackle database performance issues

IT companies spend millions of dollars trying to recover losses incurred due to poor application performance. I am sure each one of us has complained about a machine or application being slow or even dead, and then spent time at the coffee machine waiting for the results of a long-running query. How can we fix that?
Most business applications and systems are designed to retrieve information from, and/or write information to, a local hard disk or a database system.
Consider a typical multi-tier architecture. It contains the client tier, web tier, application tier, and data tier.

The data tier represents the database and mainly acts as the storage/manager for business data. Usually, when an end-user/client requests some information or executes a query on the client tier, he or she expects a response as soon as possible. However, the client tier has to talk to the data tier in order to return the appropriate information to the client. This might take a few microseconds or sometimes even a few hours, depending on several parameters.
Common parameters responsible for such delays include:

Architecture of the system
Algorithm
Code complexity
Unoptimized SQL queries
Hardware (CPUs, RAM)
Number of users
Network traffic
Database size
Etc.

Out of all these parameters, unoptimized SQL queries contribute to the majority (around 60-70%) of database performance issues.
DATABASE OPTIMIZATION APPROACH
To avoid these delays, let’s look at some common database optimization approaches. There are three main approaches to go about optimizing databases:

Optimize the database server hardware and network usage. This involves changing the hardware to a specific configuration to speed up the read/write onto the database.
For example, use RAID 10 if there are equal read/write activities, or RAID 5 if there are more read operations. This task is often performed as part of deployment or infrastructure planning in the requirement analysis phase of the Software Development Lifecycle (SDLC). This exercise is also referred to as hardware sizing.
Optimize the database design. This involves normalizing the database. For example, you can normalize up to the third normal form (3NF), which helps make the database more efficient (see the brief sketch after this list). Usually this task is carried out during the design phase of the SDLC.
Optimize the database queries. This involves studying the query plan, analyzing the queries for use of indexes, and joining and simplifying the queries for better performance. It is the most critical and effective approach for optimizing the database. Query optimization can start in the implementation phase and continue through the testing, evolution, and maintenance phases.
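To illustrate the normalization step above, here is a minimal sketch using hypothetical tables (not part of the examples later in this post), showing repeated customer details being factored out into their own table:

-- Before: customer details are repeated on every order row
CREATE TABLE OrdersDenormalized (
    OrderID      INT PRIMARY KEY,
    CustomerName VARCHAR(100),
    CustomerCity VARCHAR(100),
    OrderDate    DATE
);

-- After: customer attributes live in their own table (a step toward 3NF)
CREATE TABLE Customers (
    CustomerID   INT PRIMARY KEY,
    CustomerName VARCHAR(100),
    CustomerCity VARCHAR(100)
);

CREATE TABLE Orders (
    OrderID    INT PRIMARY KEY,
    CustomerID INT REFERENCES Customers (CustomerID),
    OrderDate  DATE
);

Each customer's details are now stored once, which reduces redundancy and keeps updates consistent.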


In this post, I will focus on the database/SQL query optimization techniques/guidelines alone. The idea is to help tackle some of the critical database performance issues.
OPTIMIZE DATABASE QUERIES
Many databases come with a built-in optimizer, which performs optimization and helps improve performance to a certain extent. However, the results are not always promising. There are also database monitoring tools, but they only capture information on the resources consumed by the database servers; this can help address 20% of the performance issues. The best way to go about query optimization is to review the functional areas that take a long time to respond, mainly because of the underlying SQL queries.
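Before tuning, it helps to measure. On SQL Server, for example, you can surface timing and I/O statistics for a suspect query (a sketch; the query is taken from the examples below):

SET STATISTICS TIME ON;
SET STATISTICS IO ON;

SELECT TransactionID, Quantity
FROM Production.TransactionHistory
WHERE ProductID = 712;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;

The reported elapsed time and logical reads give a baseline against which each optimization can be compared.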
Below I have tried to list a few SQL rules with examples, based on experience and best practices, which can help optimize the database to a great extent.

Not using a “WHERE” clause to filter the data returns all records and therefore makes the system very slow.
Example 1:
Original Query 1: select * from Production.TransactionHistory
Returns 113443 rows
Optimized Query 2: select * from Production.TransactionHistory where ProductID=712
Returns 2348 rows

As fewer records are retrieved (thanks to the “WHERE” clause), the query executes much faster.
 
Not selecting only the required column names in the “SELECT” clause (i.e., using “select *”) takes more time to return the same number of rows.
Example 2:
Original Query 1: select * from Production.TransactionHistory where ProductID=712
Returns 2348 rows
Optimized Query 2: select TransactionID, Quantity from Production.TransactionHistory where ProductID=712
Returns 2348 rows

Examples 1 & 2 might look quite obvious, but the idea is to think in a filtering mode and fetch the optimal set of data required for your application.
 
Using Cartesian joins that lead to Cartesian products kills performance, especially when large data sets are involved. A Cartesian join is a multiple-table query that does not explicitly state a join condition among the tables and results in a Cartesian product.
Example 3:
Query 1: select count(*) from Production.Product
Returns 504 rows
Query 2: select count(*) from Production.TransactionHistory
Returns 113443 rows
Query 3: select count(*) from Production.Product, Production.TransactionHistory
Returns 57175272 rows (= 504 x 113443) -> Cartesian Product
————————————————————————–
Original Query 4: select P.ProductID,TH.Quantity from Production.Product P, Production.TransactionHistory TH
Returns 57175272 rows
Optimized Query 5: select P.ProductID,TH.Quantity from Production.Product P, Production.TransactionHistory TH where P.ProductID = TH.ProductID
Returns 113443 rows

 
Use joins on indexed columns as much as possible.
Example 4:
Query 1: select P.ProductID,TH.Quantity from Production.Product P, Production.TransactionHistory TH where P.ProductID = TH.ProductID

Execute the query without any index for the first time. Re-run the same query after adding an index on column ProductID.
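For reference, the index described above might be created as follows (a sketch assuming SQL Server syntax; the index name is hypothetical):

CREATE INDEX IX_TransactionHistory_ProductID
    ON Production.TransactionHistory (ProductID);

With the index in place, the join on ProductID can be resolved with an index seek or scan instead of scanning the entire table, and the execution plans before and after can be compared.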
 
Avoid full table scans when dealing with larger tables. To prevent full table scans, we can add clustered indexes on the key columns with distinct values.
Example 5:
Query 1: select P.ProductID,PIn.LocationID from Production.Product P,Production.ProductInventory PIn where P.ProductID = PIn.ProductID
 

A. Execute the query without any indexes for the first time. It will perform table scans by default.
(Execution plan showing table scans for the Product and ProductInventory tables below.)

B. Re-run the same query after adding a clustered index on one of the columns (LocationID).
(Execution plan showing a clustered index scan on the ProductInventory table and a table scan on the Product table below.)

C. Re-run the same query after adding indexes on both columns, ProductID and LocationID, to avoid table scans.
(Execution plan using index scans for both the Product and ProductInventory tables below.)

In some cases, where a column does not have many unique values, a table scan can be more efficient and the indexes will not be used.
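For reference, indexes like those described in steps B and C might be created as follows (a sketch assuming SQL Server syntax; the index names are hypothetical, and note that a table can have only one clustered index):

CREATE CLUSTERED INDEX IX_ProductInventory_LocationID
    ON Production.ProductInventory (LocationID);

CREATE INDEX IX_Product_ProductID
    ON Production.Product (ProductID);

Comparing the execution plans before and after creating the indexes shows the table scans being replaced by index scans.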
In general, subqueries tend to degrade database performance. In many cases, the alternative is to use joins.
Example 6:
Non-correlated sub-query
Original Query 1: SELECT Name,ProductID FROM Production.Product WHERE ProductID NOT IN (SELECT ProductID FROM Production.TransactionHistory)
Correlated sub-query
Original Query 2: SELECT Name,ProductID FROM Production.Product P WHERE NOT EXISTS (SELECT ProductID FROM Production.TransactionHistory where P.ProductID=ProductID)
Replace sub-query by join
Optimized Query 3: SELECT Name,P.ProductID FROM Production.Product P LEFT OUTER JOIN Production.TransactionHistory TH On (P.ProductID = TH.ProductID) where TH.ProductID is NULL

 
Too many indexes on a single table are costly. Every index increases the time it takes to perform INSERTs, UPDATEs, and DELETEs, so the number of indexes should not be too high.
There is an additional disk and memory cost involved with each index. The best solution is to monitor your system performance; if there is a need to improve query performance, you can add more indexes.
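As an illustration, on SQL Server one way to spot rarely used indexes is to compare reads and writes in the index usage statistics (a sketch based on the standard dynamic management views):

SELECT OBJECT_NAME(s.object_id) AS TableName,
       i.name AS IndexName,
       s.user_seeks + s.user_scans + s.user_lookups AS Reads,
       s.user_updates AS Writes
FROM sys.dm_db_index_usage_stats AS s
JOIN sys.indexes AS i
    ON s.object_id = i.object_id AND s.index_id = i.index_id
WHERE s.database_id = DB_ID();

Indexes with many writes but few reads are candidates for removal.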
 
When a SQL query is executed in a loop, it requires several round trips between the client and the database server. This consumes network resources/bandwidth and hurts performance, so SQL queries inside loops should be avoided. The recommended workaround is to rewrite the loop as a single query, using a temporary table if needed, so that only one network round trip is required; that single query can then be optimized further.
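As a minimal sketch (assuming SQL Server syntax and the sample tables used in the earlier examples), a per-product loop can be replaced by one set-based query that loads its results into a temporary table:

-- Instead of looping over products and issuing one query per ProductID, e.g.
--   select Quantity from Production.TransactionHistory where ProductID = @id
-- aggregate everything in a single round trip:
SELECT TH.ProductID, SUM(TH.Quantity) AS TotalQuantity
INTO #ProductTotals
FROM Production.TransactionHistory TH
GROUP BY TH.ProductID;

SELECT P.ProductID, P.Name, T.TotalQuantity
FROM Production.Product P
JOIN #ProductTotals T ON P.ProductID = T.ProductID;

The temporary table is filled once, and subsequent processing happens on the server instead of across the network.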
 
There are a few database analyzers on the market that check SQL code against such rules and help identify weak SQL queries.
I will continue to blog on this subject to cover advanced optimization guidelines linked to Stored Procedures, Cursors, Views, and Dynamic SQL. I hope this post gives you a few tips to identify and resolve database performance issues.
Please feel free to share your feedback/questions on this blog, or experiences with any tools you have tried for database optimization.

Moving your application to the cloud: Getting ready!

When we start talking about cloud, several common questions come to mind:
What do you mean by “cloud”?
What standard requirements need to be fulfilled before moving to the cloud?
Is my data secure on the cloud?
What about application quality?
Is it easy to push my application on the cloud?
I will be examining these questions and their answers in a series of posts around cloud.
The original goal for the cloud was to reduce the cost of IT infrastructure by allowing customers to utilize an infrastructure managed by a third party that contains physical and virtual machines, disk space for storage, and other resources remotely. This type of service model is termed Infrastructure as a Service (IaaS).
The cloud advanced to the next level by offering more than just the hardware. Cloud vendors started to offer the complete environment (for development and production) including the operating system, programming platform, databases, and web servers for hosting ‘N’ applications. This offering is called Platform as a Service (PaaS).
Today, we hear almost every article or blog referring to Software as a Service (SaaS) as the most popular cloud offering. In SaaS, the cloud vendor installs, hosts, and manages your software application on the cloud, and the end-users (referred to as cloud users) gain access to the application using specific cloud clients.
Moving to the cloud has its own risks and benefits, like every new technology or innovation on the market. However, it is interesting to understand what it takes for an application to be considered cloud-ready.
Based on my experience, I have put together the seven most crucial requirements for a cloud-compatible application.

Multi-tenant architecture. This refers to a principle in software architecture where a single instance of the software runs on a server and serves multiple client organizations (tenants). It is achieved by ensuring a unique key for referring to or accessing any record in the database. Everything is linked to this key, which lets each cloud user have their own view of the intended application or service (see the sketch after this list).
Sign in/sign out. Apart from the legal bindings, there has to be a well-defined workflow with an appropriate level of authentication for every user to securely sign in to and sign out of the cloud environment or application, since we are uploading the user's data into the cloud environment. The same holds true for signing out, so that we do not retain the user's active data once they sign off.
Logging. Another important requirement is the ability to constantly monitor and log every action, transaction, and task performed by the user in the cloud environment (mainly the databases and application servers). There is also a need to have a load-balancing and fail-over system in place when we are dealing with crucial applications.
Easy maintenance. The application must allow an easy and quick way to back up and restore the data in case of a crash or corruption. That backup plan, in turn, needs a good maintenance routine to perform cleaning tasks (including log and disk space cleanup) for our application.
Hosting your application. If we plan to host our application on our own private cloud, we need a well-designed cloud landscape with the relevant servers, load balancers, firewalls, and proxy servers in place. If, on the other hand, we decide to host our application on a public cloud, like Amazon, we need not worry about these infrastructure design details.
Data privacy and security. Assuming that the user will enter his or her data into our cloud environment, we need to ensure data privacy by fulfilling major security requirements around SQL injection, cross-site scripting, architecture, logins, and passwords so that the user's data is secured.
Data transfers. Whenever we upload or download data from the cloud, in the form of reports or other flat file exchanges, we must ensure the use of standard encoding or encryption techniques and perform these operations via secured FTP or HTTPS connections.
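To make the multi-tenant point above concrete, here is a minimal sketch with hypothetical table, column, and parameter names (assuming SQL Server syntax), where every record carries a tenant key and every query is scoped to the signed-in tenant:

CREATE TABLE Invoices (
    TenantID  INT NOT NULL,            -- identifies the client organization (tenant)
    InvoiceID INT NOT NULL,
    Amount    DECIMAL(10, 2),
    PRIMARY KEY (TenantID, InvoiceID)
);

-- Each tenant sees only its own rows because every query filters on the tenant key
DECLARE @CurrentTenantID INT = 42;     -- hypothetical signed-in tenant
SELECT InvoiceID, Amount
FROM Invoices
WHERE TenantID = @CurrentTenantID;

A single deployed instance of the application can then serve many organizations while keeping their data logically separated.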

The standard requirements around usability, scalability, and performance also apply depending on the business needs or type of application.
If you plan to push your application or product from your local environment to a cloud infrastructure, you will need to fulfill the above requirements for your respective application. Please remember that these requirements will also have an impact on the quality of your application.
Do you think your application has what it takes to move to the cloud? Share your thoughts in a comment below.

Agile has replaced waterfall, but have quality outcomes changed?

The software industry is moving very quickly from the traditional waterfall model to the agile methodology. We’re certainly producing software more quickly, but is the software we’re producing any better? Before we get into that though, let’s look at the reasons for this shift in mindset from waterfall to agile.
Firstly, there are a few concerns with the waterfall approach which are re-emphasized time and again. They include:

The inability to accommodate changes during the development phase because of the initial scope freeze. (If the design phase has gone wrong, things can get very complicated in the implementation phase.)
Key decisions are taken with little knowledge of the project and product.
Resource planning is not accurate, as the full scope is not clear early in the cycle.
Critical performance and integration issues are identified only at the end of the release cycle (and the cost of fixing a problem at the end is very high).
Working software is available only when testing is completed at the end of the release cycle.
The feedback from stakeholders and customers is received very late, resulting in features not meeting their expectations. (Feedback is available only during UAT, which is too late and expensive to implement.)
Deployment is possible only when all work is finished.
Usually quality is addressed very late in the cycle, resulting in poor delivery.

The agile methodology facilitates an easy way to receive recurring feedback from customers early in the release cycle, and thus has a positive impact on the overall quality of the product.
The feedback comes from intermediate releases or quality checks before going to production. It also comes from more tests, build cycles, and early dialogue with customers.

Figure A: Acceptance of agile and waterfall methodologies based on success rate
The above results are based on the analysis of functional quality more than the structural quality.
However, structural quality is an integral part of the software product or project. Using static analysis tools — which carry out the testing and validation of the software’s inner structure, source code, and design — we can detect major architectural issues or design flaws in time.
Based on my experience as a Scrum Master, I have seen that in enterprise ADM (application development and maintenance), it is not always easy to reconcile the agile method with the architectural constraints placed on legacy system components.
Therefore, introducing static analysis checks can make a big difference. The true value lies in the ability to track the evolving architecture of an agile project and how that fits with the overall application landscape.
As a Scrum Master, I was always looking for a solution that ensured complete quality assurance by:

Performing code and architectural reviews on the application or product being tested based on a defined set of rules.
Prioritizing issues based on their impact on business areas, functional features, modules, and code.
Giving a clear view on the key quality indicators such as security, performance, architecture, robustness, maintainability, transferability, and much more.

I couldn’t find anything close to what CAST offered. Therefore, I knew CAST was the right fit for me.