Thursday, 19 June 2014

Intersystems Cache Indexing Shocker - How to Fix Sluggish Queries

Photo: Markus Spiske / www.temporausch.com / CC-BY

When going through a performance review of a grid on the website, I was struggling to see why the query was taking so long. It was a single table query with a single where clause which was on an indexed field. This indexed field could only contain 5 different values and the query was return fewer than 100 entries but was taking over 1 second.

On other lower powered servers the query was taking less than a third of a second and often returning more data albeit on a smaller total data set.

Checking the query plan revealed something interesting, the query was not using the index but was instead performing a whole table scan. Further testing revealed that only "=" queries use indexes. "!=" ignore indexes as do any queries with "OR" or "IN".

After further investigation I found that if you only return the ID then NOT, OR , IN queries do use indexes, however, if you return any other field including the field you are performing your where clause on you activate a full table scan

Uses index

SELECT ID FROM TABLE WHERE STATUS != 'Complete'
SELECT count(*) FROM TABLE WHERE STATUS != 'Complete'
SELECT * FROM TABLE WHERE STATUS = 'Complete'

Full table scan

SELECT STATUS FROM TABLE WHERE STATUS != 'Complete'
SELECT * FROM TABLE WHERE STATUS != 'Complete'
SELECT * FROM TABLE WHERE STATUS = 'Active' OR STATUS = 'Error'  OR STATUS = 'Cancelled'

As far as I am concerned this is unbelievable behaviour, which is still in the product as of the latest 2014 version of Intersystems Cache database!

At least it had revealed a 3 different ways of substantially improving query performance

1. Union Query 

If you know all of the data that can be contained in the field then you can perform a UNION query

SELECT * FROM TABLE WHERE STATUS = 'Active'
UNION
SELECT * FROM TABLE WHERE STATUS = 'Error'
UNION
SELECT * FROM TABLE WHERE STATUS = 'Cancelled'

This of course means more maintenance when adding a new status.

2. ID Only Select 

If you do not know the data then you can just return the ID and loop through the returned IDs to grab the data.
e.g.
SELECT ID FROM TABLE WHERE STATUS != 'Complete'
Loop...
SELECT * FROM TABLE WHERE ID=...

3. Create New Property

e.g.
SELECT * FROM TABLE WHERE Active = '1'

This requires a new property which is probably easiest to create in the OnUpdate method, again maintenance of this would be an issue.

I have not seen any documentation talking about these issues with queries. It does strike me this should be an easy thing for Intersystems to fix at the database level, but if they haven't considered it by now then I won't be holding my breath for it.

Sunday, 8 June 2014

MVC Frameworks - CakePHP a quick Evaluation

A real project always throws up more issues than following the tutorials, I find that you not only tend to learn faster, but to a deeper level.

To try and evaluate CakePHP I made up a simple application. This was a simple car database to allow direct comparison of different makes and models comparing the following characteristics:

Price, Seats, 0-60 MPH time, MPG, Age, Road tax.

Within an hour of downloading CakePHP I had a working create, read, update, delete (CRUD) application. After 2 hours the filtering of the available fields to restrict the list to various required maximums or minimums and 20 cars details were manually entered. The application was essentially built working and tested to a basic level in 2 hours with 200 lines of code.

The real advantage though was those 2 hours were not hard pressed programming, it was a relaxed pace, and I am unsure without a framework I could get a CRUD application to the same level without a reasonable amount of planning time.

The framework does force you to work in a specific way, but brings in a number of advantages.

Less Code Required

Yes the framework overall will almost certainly have more than a basic CRUD application, but I only wrote 200 lines, which included all display code, I doubt that anyone could achieve similar functionality in the same amount of code (obviously unless they "cheat" and have multi-statement lines, single line loops etc...)

This brings the sub-benefits of:

  • Higher quality code - The less you need to write the easier it is to maintain concentration on higher quality easier to understand code.
  • Fewer bugs - Every extra line invites the possibility of error, either logical or even simple spelling mistakes.
  • Reduced maintenance - Fewer lines to read means more time spent fixing an issue and less time finding it.

More code reuse

The framework will contain methods and helpers for common coding issues, rather than re-writing these simple methods, or possibly worse writing lots of semi-duplicate methods throughout your code because you lack the appropriate planning / refactoring time.

Pattern Guidance

When you approach a new project you can tackle in a multitude of ways, a framework can help take a consistent and proven methodology to solving the issue and significantly reduce planning requirements.

Performance

While raw performance of applications may be considered an issue CakePHP has a number of built in performance indicators and actively helps you spot performance problems before they have impacted on your users. My application was perhaps too basic to demonstrate any potential scaling issues, but there is certainly plenty of control over the code to prevent performance being a likely blocker to the implementation of CakePHP in a project.

Sunday, 1 June 2014

Zend Framework VS CodeIgniter VS CakePHP VS Symfony Part 1

The annoying thing about programming is that the "right" way is not always obvious until you have done a lot of work on a project, sometimes design flaws only become obvious much later in production environments, when the real world starts to attack your lovely code.

The best way to avoid the wrong way is to try to arm yourself with as much knowledge about what is available as possible to reduce the chances of incorrect decisions. As such I have been looking into various PHP frameworks to try to establish which will be optimal for a project I am working on.

All four frameworks are designed to try and help you produce software as quickly and efficiently as possible. The goal is to assist you making the correct design decisions from the outset so that code is more reusable and easy to understand.

All the frameworks essentially rely on the Model View Controller (MVC) design pattern and the main learning curve to each is understanding how the Model, View and Controllers are created and plug into the frameworks.

In addition to providing the general "glue" for the MVC mechanism they also provide libraries primarily of helper functions, but in the case of Zend framework there are quite substantial areas of functionality such as Lucene search.

Prior to starting this investigation I had previously used both Zend Framework and CakePHP on some old projects so they perhaps have a bit of an unfair advantage as I am likely to find them easier to use and am more aware of the useful functionality which can take a while to uncover.

In a Nutshell

Zend Framework has the largest library of available functions, but at the same time the MVC layer is the least well defined, while it is common practice to use Doctrine as your model layer it is not explicitly required, in fact the library can easily be plugged into an exist project without using all of the general framework functionality.

CodeIgniter is certainly the lightest of the frameworks, is even has the flexibility available to only have a 2 layer pattern and not use MVC, although this would obviously not be advised. Its light weight nature leads to more code being written, and a lack of struts to quickly start up an application, but the opposite side of this is that the imposed overhead is minimal.

CakePHP appears to have the strongest struts in place, you can have a proto-type application up and running very quickly, but it does appear to have the least flexible structure and the helper libraries appear to be well defined, but do not provide the same range as Zend does.

Symfony seems to sit halfway between CakePHP and CodeIgniter in terms of available structure and functions.

What's in store for part two

Part two is going to be the most interesting part, sadly it will take a couple of weeks to put together. I aim to evaluate the raw performance of each framework with a basic application. Additionally I am going to try and establish which is the fastest to work in and which is the most code efficient.