Microsoft Access: Queries

Provided by Allen Browne,  Created: June 2008.  Last update: March 2010.


Contents:

Is Null, not IsNull()

IIf(), not Nz()

domain aggregate functions

Criteria on calculated fields

Sorting on concatenated fields

Where versus Having

First versus Group By

Other optimizations

Common query hurdles

This article addresses mistakes people often make that yield poor query performance.

We assume you have set up relational tables, with primary keys, foreign keys, and indexes on the fields you search and sort on.

 

Use SQL rather than VBA

JET/ACE (the query engine in Access) uses Structured Query Language (SQL), as many databases do. JET can also call Visual Basic for Applications code (VBA.) This radically extends the power of JET, but it makes no sense to call VBA if SQL can do the job.

Is Null, not IsNull()

cross WHERE IsNull(Table1.Field1)
check WHERE (Table1.Field1 Is Null)

Is Null is native SQL.

IsNull() is a VBA function call.

There is never a valid reason to call IsNull() in a query, when SQL can evaluate it natively.

 

IIf(), not Nz()

cross SELECT Nz(Table1.Field1,0) AS Amount
check SELECT IIf(Table1.Field1 Is Null, 0, Table1.Field1) AS Amount

The Nz() function replaces Null with another value (usually a zero for numbers, or a zero-length string for text.) The new value is a Variant data type, and VBA tags it with a subtype: String, Long, Double, Date, or whatever.

This is great in VBA: a function can return different subtypes at different times. But in a query, a column can be only be ONE data type. JET therefore treats Variants as Text, since anything (numbers, dates, characters, ...) is valid in a Text column.

The visual clue that JET is treating the column as Text is the way it left-aligns. Numbers and dates display right-aligned.

If you expected a numeric or date column, you now have serious problems. Text fields are evaluated character-by-character. So 2 is greater than 19, because the first character (the 2) is greater than the first character of the other text (the 1 in 19.) Similarly, 4/1/2009 comes after 1/1/2010 in a Text column, because 4 comes after 1.

Alarm bells should ring as soon as you see a column left-aligned as Text, when you expected it handled numerically. Wrong records will be selected, and the sorting will be  nonsense.

You could use typecast the expression with another VBA function call, but a better solution would be to let JET do the work instead of calling VBA at all.

Instead of:
    Nz(MyField,0)
use:
    IIf(MyField Is Null, 0, MyField)

Yes: it's a little more typing, but the benefits are:

This principle applies not just to Nz(), but to any VBA function that returns a Variant. It's just that Nz() is the most common instance we see.

(Note: JET's IIf() is much more efficient than the similarly named function in VBA. The VBA one wastes time calculating both the True and False parts, and generates errors if either part does not work out (even if that part is not needed.) The JET IIf() does not have these problems.)

 

Domain aggregate functions

DLookup(), DSum(), etc are slow to execute. They involve VBA calls, Expression Service calls, and they waste resources (opening additional connections to the data file.) Particularly if JET must perform the operation on each row of a query, this really bogs things down.

A subquery will be considerably faster than a domain aggregate function. In most cases, a stacked query will be faster yet (i.e. another saved query that you include as a "table" in this query.)

There are times when a domain aggregate function is still the best solution you have (e.g. where you need editable results.) For those cases, it might help to use ELookup() instead of the built-in functions.

 


Craft your expressions to use indexes

The query will be much faster if the database can use an index to select records or sort them. Here are two examples.

cross WHERE Year(Table1.MyDate) = 2008
check WHERE (Table1.MyDate >= #1/1/2008#)
AND (Table1.MyDate < #1/1/2009#)

Criteria on calculated fields

In the example at right, the Year() function looks easier, but this will execute much slower.

For every record, JET makes a VBA function call, gets the result, and then scans the entire table to eliminate the records from other years. Without the function call, JET could use the index to instantaneously select the records for 2008. This will execute orders of magnitude faster.

(You could use WHERE Table1.MyDate Between #1/1/2008# And #12/31/2008#, but this misses any dates on the final day that have a time component.)

Particularly in criteria or sorting, avoid VBA calls so JET can use the index.

 

cross SELECT ClientID, Surname & ", " + FirstName AS FullName
FROM tblClient
ORDER BY Surname & ", " & FirstName;
check SELECT ClientID, Surname & ", " + FirstName AS FullName
FROM tblClient
ORDER BY Surname, FirstName, ClientID;

Sorting on concatenated fields

Picture a combo box for selecting people by name. The ClientID is hidden, and Surname and FirstName are concatenated into one column so the full name is displayed even when the combo is not dropped down.

Do not sort by the concatenated field! Sort by the two fields, so JET can use the indexes on the fields to perform the sorting.


Optimize Totals queries

The JET query optimizer is very good, so you may find that simple queries are fast without the tips in this section. It is still worth the effort to create the best queries you can, so they don't suddenly slow down when you modify them.

cross SELECT ClientID, Count(InvoiceID) AS HowMany
FROM tblInvoice
GROUP BY ClientID
HAVING ClientID = 99;
check SELECT ClientID, Count(InvoiceID) AS HowMany
FROM tblInvoice
WHERE ClientID = 99
GROUP BY ClientID;

WHERE versus HAVING

Totals queries (those with a GROUP BY clause) can have both a WHERE clause and a HAVING clause. The WHERE is executed first - before aggregation; the HAVING is executed afterwards - when the totals have been calculated. It makes sense, then, to put your criteria in the WHERE clause, and use the HAVING clause only when you must apply criteria on the aggregated totals.

This is not obvious in the Access query designer. When you add a field to the design grid, Access sets the Total row to Group By, and the temptation is type your criteria under that. If you do, the criteria end up in the HAVING clause. To use the WHERE clause, add the field to the grid a second time, and choose Where in the Total row.

 

FIRST versus GROUP BY

cross SELECT EmployeeID, LastName, Notes
FROM Employees
GROUP BY EmployeeID, LastName, Notes;
check SELECT EmployeeID, First(LastName) AS FirstOfLastName,
First(Notes) AS FirstOfNotes
FROM Employees
GROUP BY EmployeeID;

When you add a field to a Totals query, Access offers Group By in the Total row. The default behavior, therefore, is that Access must group on all these fields.

A primary key is unique. So, if you group by the primary key field, there is no need to group by other fields in that table. You can optimize the query by choosing First instead of Group By in the Total row under the other fields. First allows JET to return the value from the first matching record, without needing to group by the field.

This makes a major difference with Memo fields. If you GROUP BY a memo (Notes in the example), Access compares only the first 255 characters, and the rest are truncated! By choosing First instead of Group By, JET is free to return the entire memo field from the first match. So not only is it more efficient; it actually solves the the problem of memo fields being chopped off.

(A downside of using First is that the fields are aliased, e.g. FirstOfNotes.)


Other optimizations

Other broad suggestions for optimizing queries in JET:

Query optimization is a huge topic. For further reading, see SQL Queries for Mere Mortals by Michael Hernandez and John Viescas.


HomeIndex of tipsTop