TIL: How to do a case-sensitive search in SQL Server21 Feb 2022 #todayilearned #sql
Do you use LOWER or UPPER to do case-sensitive searches? Let’s see how to write a case-sensitive search in SQL Server.
To write a case-sensitive search in SQL Server, don’t use UPPER or LOWER functions around the column to filter. Instead, use the COLLATE keyword with a case-sensitive collation followed by the comparison.
Naive case sensitive search
Often by mistake, to do a case-sensitive search, we wrap a column around
LOWER(). Something like this,
SELECT TOP 50 DisplayName, Location, CreationDate, Reputation FROM dbo.Users WHERE LOWER(DisplayName) LIKE 'john%' ORDER BY Reputation DESC; GO
We tried to find the first 50 StackOverflow users with DisplayName containing lowercase ‘john’.
But, interestingly enough. Some results don’t match our intended filter. They include both lowercase and uppercase ‘john’.
Don’t try to use
UPPER() either. It would be the same. In general, don’t use functions around columns in your WHEREs. That’s a common bad practice.
Collation and case sensitivity
In SQL Server, collations provide sorting rules and case and accent sensitivity for our data. For example, when we use an ORDER BY, the sort order of our results depends on the database collation.
Find the database collation on the Database Properties option under the Maintenance menu.
The default collation for English is SQL_Latin1_General_CP1_CI_AS. This is a case-insensitive collation.
In collation names, CI means case insensitive. And, AS means accent sensitive.
That’s why when we wrote
LOWER(DisplayName), SQL Server showed uppercase and lowercase results. SQL Server used the default collation, which was a case-insensitive one.
For more details about collations, check Microsoft docs on collations and Unicode support.
Case sensitive searches in WHERE clause
For case-sensitive searches in SQL Server, use COLLATE keyword with the case sensitive collation “SQL_Latin1_General_CP1_CS_AS” followed by the LIKE or = comparisons as usual.
When using the SQL_Latin1_General_CP1_CS_AS collation, a is different from A, and à is different from á. It’s both case and accent-sensitive.
Let’s rewrite our query with COLLATE,
SELECT TOP 50 DisplayName, Location, CreationDate, Reputation FROM dbo.Users WHERE DisplayName COLLATE SQL_Latin1_General_CP1_CS_AS LIKE '%john' ORDER BY Reputation DESC; GO
This time, we have the results we were expecting. Only users with lowercase ‘john’ in their display name.
Voilà! That’s how we can write case-sensitive searches in SQL Server. Remember, don’t use LOWER or UPPER. They won’t work for case-sensitive searches. Use a different collation instead.
For more content on SQL Server, check what are implicit conversions and why you should care and six performance tuning tips from Pinal Dave.