Warning sign in a street

Don't use functions around columns in your WHEREs: The worst T-SQL mistake

There’s one thing you could do to write faster queries in SQL Server: don’t use functions around columns in your WHERE clauses. I learned it the hard way. Let me share this lesson with you.

Don’t use user-defined or built-in functions around columns in the WHERE clause of queries. It prevents SQL Server from estimating the right amount of rows out of the function. Write queries with operators and comparisons to make SQL Server better use the indexes it has.

With functions around columns in WHEREs

To prove this point, let’s query StackOverflow database. Yes, the StackOverflow we all know and use.

StackOverflow has a Users table that contains, well…, all registered users and their profiles. Among other things, every user has a display name, location, and reputation.

Let’s find the first 50 users by reputation in Colombia.

To make things faster, let’s create an index on the Location field. It’s an NVARCHAR(100) column.

CREATE INDEX Location ON dbo.Users(Location);

This is the query we often write,

DECLARE @Location NVARCHAR(20) = N'Colombia';

SELECT TOP 50 DisplayName, Location, CreationDate, Reputation
FROM dbo.Users
/* Often, we put LOWER on both sides of the comparison */
WHERE LOWER(Location) = LOWER(@Location)
ORDER BY Reputation DESC;
GO

Did you notice the LOWER function on both sides of the equal sign?

We all have written queries like that one. I declared myself guilty too. Often, we use LOWER and UPPER or wrap the column around RTRIM and LTRIM.

But, let’s see what happened in the Execution Plan.

Execution plan of finding the first 50 StackOverflow users in Colombia
First 50 StackOverflow users in Colombia by reputation

We read execution plans from right to left and top to bottom.

Here, SQL Server chose to scan the index Location first. And notice the width of the arrow coming out of the first operator. When we place the cursor on it, it shows “Number of Rows Read.”

Rows read by Index Scan when finding StackOveflow users
Rows read by Index Scan on Location index

In this copy of the StackOverflow database, there are 2,465,713 users, only 463 living in Colombia. SQL Server has to read the whole content of the index to execute our query.

It means that to find all users in Colombia, SQL Server had to go through all users in the index. We could use that index in a better way.

Write queries with comparisons, functions, and operators around parameters. This way SQL Server could properly use indexes and have better estimates of the contents of tables. But, don’t write functions around columns in the WHERE clauses.

The same is true when joining tables. Don’t put functions around the foreign keys in your JOINs.

Rewrite your queries to avoid functions around columns in WHERE

Let’s go back and rewrite our query without any functions wrapping columns. This way,

DECLARE @Location NVARCHAR(20) = N'Colombia';

SELECT TOP 50 DisplayName, Location, CreationDate, Reputation
FROM dbo.Users
/* We remove LOWER on both sides */
WHERE Location = @Location
ORDER BY Reputation DESC;
GO

And, let’s check the execution plan again.

First 50 StackOverflow users in Colombia
Again, first 50 StackOverflow users in Colombia by reputation

This time, SQL Server used an Index Seek. It means SQL Server knew the starting point when reading the index on Location. And the execution plan didn’t go parallel. We don’t have the black arrows in yellow circles in the operators.

Notice, this time we have a thinner arrow on the first operator. Let’s see how many rows SQL Server read this time.

Rows read by Index Seek on Location index
Rows read by Index Seek on Location index

After that change, SQL Server only read 463 records. That was way better than reading the whole index.

Voilà! If you want to write faster queries, stop using functions around columns in your WHEREs. That screws SQL Server estimates. For example, don’t use LOWER or UPPER around your columns. By the way, SQL Server string searches are case insensitive by default. You don’t need those functions at all.

To read more SQL Server content, check how to and not to write Dynamic queries and six performance tuning tips from Pinal Dave.

Happy SQL time!