Jacob J. Walker's Blog

Scholarly Thoughts, Research, and Journalism for Informal Peer Review

My Next Response to a Job Application Question, this time about Diagnosing System Performance Issues

without comments

As I recently posted, I’m applying for a job with CSIS to potentially be in the role of a Programmer/Analyst.  And I think several of the answers I’m giving to questions that they have may be valuable to others in understanding real life work in computer science.

Here is my response to the question:

Often when an application appears slow, a database programmer/analyst is tasked to analyze the underlying problem and make a determination if it is being caused by the database code or something else. Describe the tools, techniques and experience you have used from your own work history to diagnose a similar problem in the past.

Solving performance issues of a system starts first with gaining an understanding of the symptoms of the problem.   This first starts with talking with those who are being affected by the issue to determine what they are doing when they are experiencing the application going slow; for example, is it always on the same screen, or just when they run a certain query, or is it at a certain time of the day, or just seems random?   Is it only this one user, or many users? Based upon these initial answers, one can get an initial idea of what to investigate first.  For example, if many users are experiencing the issue on the same screen, then it is clearly something that is likely due to the database code on that screen.  If the problem seems to be related to the time of day that the database is accessed, with many users having the slowdown at that time of day, then it is more likely to be partly based upon the amount of traffic the database is receiving. But, if only one user is having the problem, then it may be something on their side, either with their machine or with what they are specifically doing.

Let’s look at the three most common scenarios:

Scenario 1 – Many Users Having the Same Slowdown on the Same Screen

In this scenario, the problem is most likely to be occurring somewhere in the code, and the goal is to narrow down where in the system the slowdown is coming from. Using an MVC model, the slowdown may be coming from the Model (database structure), View (such as the programming on the website), or Controller (queries, database views, or other programming that transforms data).  To determine which one of these areas is the most likely culprit, logging should occur in the different systems to try and find a pattern.  But, one must be careful with logging data, as this in and of itself can lower the performance of the database, and minor tests can be done on the live system to replicate the issue. At this point it will be important to duplicate the database into a sandboxed testing environment, so that further analysis and testing solutions can be done, without endangering the live system.

As an example of diagnostics I have done in this area, with the Twin Rivers Adult School SIS system, there was an issue with slow performance when teachers attempted to do their attendance.  Based upon the basic procedure from above, I was able to determine that the ASP programming done on the web interface was using an inefficient process of querying large amounts of data, and then using a loop to extract the specific data needed.   I worked with our vendor to have them use a narrower server-based query initially, to extract the data, thus not needing to use an inefficient loop. Another similar example came from our MS Access client also having slow data extraction, and it had a similar issue where the client-based Access query was much slower than using a native server-based T-SQL query. The proper use of indexes can also often improve performance.

Scenario 2 – Many Users Having the Same Slowdown at the Same Time of Day

In this scenario the most likely issue is one of traffic, with the most likely bottlenecks being either the bandwidth to the server or the storage devices on the server.  While the obvious method of solution is to increase the hardware capabilities, sometimes a more cost effective solution can be found by working to lower the amount of data that would flow through the network or accessed from the storage device(s).  For example, often using a server-side query instead of a client (or webserver) based query will reduce bandwidth usage, although at the expense of higher processing on the database server.  Although, when considering which solution is better (hardware or software), the total cost of each solution should be considered, including the wages and time required for either solution.

When testing for these bottlenecks, often a sandboxed testing environment will be helpful, and it is important to try and make this system as similar to the live system as possible, and be able to have the methods to duplicate or simulate the traffic patterns that are occurring on the live system.

Scenario 3 – A Single User Having Slowdowns with the System

In this scenario the most likely cause is on the client side, and may be beyond the scope of the programmer/analysts job to solve, or involve working with another help desk technician to get the problem solved.  A variety of issues could be causing the problem, from client bandwidth issues, user error, client software issues, and/or client hardware issues.  Of most concern in this scenario is the possibility that the end user has been infected with a virus or other malware, which could compromise the security of the database, and an analysis should be done to determine if the client computer, unbeknownst to the end user, has been downloading large quantities of data.  If this was the case, immediate discussion with my supervisor would be in order about the data breach.

Post Revisions:

Written by Jacob Walker

July 10th, 2013 at 8:10 am

Leave a Reply

%d bloggers like this: