TIL: How to compare DateTime without the time part in SQL Server

If you use DATEDIFF() or CAST() to filter a table by a DATETIME column using only the date part, there’s a better way. Let’s find it out.

To compare dates without the time part, don’t use the DATEDIFF() or any other function on both sides of the comparison in a WHERE clause. Instead, put CAST() on the parameter and compare using >= and < operators.

Let’s use a local copy of the StackOverflow database to find all user profiles created on a particular date.

Inside StackOverflow database, there’s dbo.Users table with a CreationDate column. Let’s use that column to find all users who created their profiles today.

Before we get started, let’s create an index on CreationDate to make things faster.

CREATE INDEX CreationDate ON dbo.Users(CreationDate);

Probably, we would write a query like this one,

SELECT * FROM dbo.Users
WHERE DATEDIFF(DAY, CreationDate, GETDATE()) = 0;

But, SQL Server has to scan the entire index. Notice the “Index Scan” operator in the execution plan. SQL Server doesn’t have any statistics on CreationDate wrapped in a DATEDIFF() function.

Execution plan filtering a DateTime column with DATEDIFF
Execution plan filtering a DateTime column with DATEDIFF

An Index Scan by itself in an execution plan isn’t good or bad. It depends on the number of rows read.

In this case, SQL Server read all the records on the dbo.Users table. When we hover over the row next to the “Index Scan” operator, we notice the number of rows read. It scanned the whole index, more than 2 millions of records.

Execution plan showing the rows read
Execution plan showing the rows read

Let’s stop using DATEDIFF() or CAST() to filter a table on a DATETIME column.

To filter a table on a DATETIME column comparing only the date part, use CAST() only around the parameter, and >= and < with the desired date and the day after.

SELECT * FROM dbo.Users
 WHERE CreationDate >= CAST(GETDATE() AS DATE)
   AND CreationDate < DATEADD(day, 1, CAST(GETDATE() AS DATE));

Voilà! That’s how to compare dates on the WHERE clauses. Don’t use DATEDIFF() or CAST() on both sides of the comparison. In general, don’t put functions around columns in the WHERE clause.

For more content about SQL Server, check the difference between EXISTS SELECT 1 and EXISTS SELECT *, T-SQL doesn’t have constants and variables aren’t a good idea, and how LIKE handle NULL values.

Source: Optimized date compare in WHERE clause

Show your work. Takeaways

Show your work is a New York Times bestseller by Austin Kleon. He describes his book as “a book for people who hate the very idea of self-promotion”. This book tells you how and why you should show your work online. These are my takeaways.

“Show your work” teaches that your work has to be out there. And, if your work isn’t online, it doesn’t exist. Good work isn’t enough.

Kid walking in a museum
Natural History Museum, London, United Kingdom. Photo by Michał Parzuchowski on Unsplash

“Be so good they can’t ignore you” summarizes the purpose of the book. To not be ignored, you have to be findable. Build a routine of sharing. Take advantage of your network. Imagine you don’t need a resume because your next boss already reads your blog.

“Be so good they can’t ignore you.”

-Steve Martin

You don’t have to be a genius

Creativity is about collaboration, too. Find a place where you can share your ideas and flourish your creativity. Find your creativity circle or “scenius”. Famous musicians and artists were surrounded by other artists to share, copy, acknowledge and sparkle their ideas.

Ask yourself what you want to learn. And make a commitment to learning in front of others. Share what you love, and the people who love the same thing will find you.

The minute you learn something, teach it
My favorite quote from Show Your Work

Share something small every day

There is no such a thing as overnight success. At the end of the day, see what you can share: inspiration, progress, or learning. Share what you think it’s helpful or entertaining.

Don’t be afraid of sharing your work. 90% of everything is crap. But don’t turn into human spam.

“Stop worrying, start sharing”

-From Show your work official trailer

Open up your cabinet of curiosities

Have a cabinet of curiosities. Before museums, people had a place to put what they loved, usually rare and weird things. Think of yourself as a collector.

Before you can share your work, you can share your taste. You can share what you read, who you follow, what inspires you. Credit your sources. Don’t share things you can’t credit.

Voilà! These are my takeaways from Show Your Work. This isn’t a programming book, but it has inspired me to continue writing, even when I think nobody is reading.

If you want to read other takeaways, check Clean Coder and Pragmatic Thinking and Learning.

#showyourwork

TIL: How to add gzip compression to ASP.NET Core API responses

Today, I got the report that one API endpoint took minutes to respond. It turned out that it returned hundreds of large complex objects. Those objects contained branding colors, copy text, and hotel configurations in a reservation system. This is how to add response compression in ASP.NET Core 6.0.

To compress responses with ASP.NET Core, register the default compression providers into the dependencies container with the UseResponseCompression() method.

Something like this,

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddControllers();
builder.Services.AddResponseCompression();
//               ^^^^^

var app = builder.Build();
app.UseResponseCompression();
//  ^^^^^
app.MapControllers();
app.Run();

If we don’t specify any compression provider, ASP.NET Core uses a default one.

If we want gzip compression, then let’s register the GzipCompressionProvider inside AddResponseCompression() and set its compression level by configuring the GzipCompressionProviderOptions,

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddControllers();

builder.Services.AddResponseCompression(options =>
{
    options.Providers.Add<GzipCompressionProvider>();
    //                ^^^^^
});

builder.services.Configure<GzipCompressionProviderOptions>(options => 
{
    options.Level = CompressionLevel.Fastest;
    //      ^^^^^
});

var app = builder.Build();
app.UseResponseCompression();
// ^^^^^
app.MapControllers();
app.Run();

For my slow endpoint, the easiest solution to speed it up was mapping my huge complex object to a new view model that only contained the properties the client side needed. I rolled a simple extension method MapToMyNewSimplifiedViewModel() for that.

Voilà! That’s how to add gzip compression to responses with ASP.NET Core 6.0. That’s what I learned today.

UPDATE (Oct 2023): In previous versions of ASP.NET Core, we needed the Microsoft.AspNetCore.ResponseCompression NuGet package. It’s deprecated. ASP.NET Core has response compression built in now. We don’t need NuGet packages for this.

For more ASP.NET Core content, check how to read configuration values, how to create a caching layer, and how to use background services with Hangfire.

Source: Response compression in ASP.NET Core

Two free tools to format SQL queries

Do you need to format your SQL queries? Are you doing it by hand? Stop! There is a better way!

Instead of formatting SQL queries to follow code conventions by hand, we can use online tools or extensions inside Visual Studio, SQL Server Management Studio, or any other text editor.

These are two free tools to format SQL queries and store procedures. Inside Notepad++, use Poor Man’s T-SQL Formatter. And, ApexSQL Refactor for Visual Studio and SQL Server Management Studio.

Before

Before using Poor Man’s T-SQL Formatter and ApexSQL Refactor, I spent too much time formatting queries by hand. I mean making keywords uppercase, aligning columns, and arranging spaces.

I tried to use the “Find and Replace” option inside a text editor. But it only worked for making keywords uppercase. Sometimes, I ended up messing with variables, parameters, and other things inside my queries.

Macro typewriter ribbon
Photo by Kelly Sikkema on Unsplash

Things were worse with long store procedures. I changed two lines and ended up formatting thousand of lines. “Once you touch it, you’re the owner.”

Let’s format a sample query from StackOverflow

Let’s format the query to find StackOverflow posts with many “thank you” answers.

select
   ParentId as [Post Link],
   count(id)
from posts
where posttypeid = 2 and len(body) <= 200
  and (body like '%hank%')
group by parentid
having count(id) > 1
order by count(id) desc;

After formatting the query to follow Simon Holywell SQL Style Guide, it should look like this,

SELECT ParentId AS [Post Link]
     , COUNT(id)
  FROM posts
 WHERE posttypeid = 2
   AND LEN(body) <= 200
   AND (body LIKE '%hank%')
 GROUP BY parentid
HAVING COUNT(id) > 1
 ORDER BY COUNT(id) DESC;

Let’s see how these two tools format our sample query.

1. Poor Man’s T-SQL Formatter

Poor Man’s T-SQL Formatter is a free and open-source .NET and JavaScript library to format your SQL queries. It’s available for Notepad++, Visual Studio, SQL Server Management Studio, and others. We can try it online too.

This is how Poor Man’s T-SQL formatted our sample query in Notepad++.

Sample query formatted by Poor Man's T-SQL inside Notepad++
Sample query formatted by Poor Man's T-SQL inside Notepad++

It doesn’t make function names uppercase. Notice the functions len and count.

Also, it indents AND clauses in the WHERE clause. I want them right-aligned to the previous WHERE. But it’s a good starting point.

Sometimes, it needs a bit of help if the query has single-line comments in it with --.

By the way, it’s better to use /* */ for single-line comments inside our queries and store procedures. This makes formatting easier when we copy queries or statements from our database’s plan cache.

2. ApexSQL Refactor

ApexSQL Refactor is a free query formatter for Visual Studio and SQL Server Management Studio. It has over 160 formatting options. We can create our own formatting profiles and preview them. It comes with four built-in profiles. Also, we can try it online.

UPDATE (Sept 2023): ApexSQL Refactor isn’t freely available online anymore.

This is how ApexSQL Refactor formatted our sample query in Visual Studio 2019.

Sample query formatted by ApexSQL Refactor inside Visual Studio
Sample query formatted by ApexSQL Refactor inside Visual Studio

It isn’t perfect, either. But, it makes functions uppercase. Point for ApexSQL Refactor.

Also, it indents AND clauses in the WHERE too. I couldn’t find an option to change it. But, there is an option to indent ON in SELECT statements with JOIN. It affects ON for index creation too. We can live with that.

Voilà! Please let’s save some time formatting our SQL queries with any of these two free tools.

We can take a step further and call a formatter inside a Git hook to automatically format SQL files. I did it with Poor Man’s T-SQL formatter.

For more content, check my Visual Studio setup for C# and six tips to performance tune our SQL Server.

Happy SQL time!

Six SQL Server performance tuning tips from Pinal Dave

Recently, I’ve needed to optimize some SQL Server queries. I decided to look out there what to do to tune SQL Server and SQL queries. This is what I found.

At the database level, turn on automatic update of statistics, increase the file size autogrowth and update the compatibility level. At the table level, delete unused indexes and create the missing ones, keeping around 5 indexes per table. And, at the query level, find and fix implicit conversions.

While looking up what I could do to tune my queries, I found Pinal Dave from SQLAuthority. Chances are you have already found one of his blog posts when searching for SQL Server tuning tips. He’s been blogging about the subject for years.

These are six tips from Pinal’s blog and online presentations I’ve applied recently. Please, let’s test these changes in a development or staging environment before making anything on our production servers.

1. Enable automatic update of statistics

Let’s turn on automatic update of statistics. We should turn it off if we’re updating a really long table during your work-hours.

This is how to enable automatic update of statistic update, [Source]

USE <YourDatabase>;
GO

-- Enable Auto Create of Statistics
ALTER DATABASE <YourDatabase>
SET AUTO_CREATE_STATISTICS ON WITH NO_WAIT;

-- Enable Auto Update of Statistics
ALTER DATABASE <YourDatabase>
SET AUTO_UPDATE_STATISTICS ON WITH NO_WAIT;
GO

-- Update Statistics for whole database
EXEC sp_updatestats
GO

2. Fix File Autogrowth

Let’s add size and file growth to our database. Let’s use our weekly file growth. Otherwise, let’s change it to 200 or 250MB.

From SQL Server Management Studio, to change the file autogrowth:

  1. Let’s go to our database properties and then to Files, then
  2. Click on the three dots in the Autogrowth column, and
  3. Change the file growth.
Files page from Database properties in SQL Server Management Studio
Files page from Database properties in SQL Server Management Studio

3. Find and Fix Implicit conversions

Implicit conversions happen when SQL Server needs to convert between two data types in a WHERE or in JOIN.

For example, if we compare a OrderNumber column being VARCHAR(20) to an INT parameter, SQL Server warns about an implicit conversion.

DECLARE @OrderNumber INT = 123;

SELECT *
FROM dbo.Orders
WHERE OrderNumber = @OrderNumber;
GO

To run this query, SQL Server has to go through all the rows in the dbo.Orders table to convert the OrderNumber from VARCHAR(20) to INT.

To decide when implicit conversions happen, SQL Server follows a precedence rule between data types. For example, SQL Server always converts VARCHAR to INT and NVARCHAR.

This script indentifies queries with implicit conversions, [Source]

SELECT TOP(50) DB_NAME(t.[dbid]) AS [Database Name], 
t.text AS [Query Text],
qs.total_worker_time AS [Total Worker Time], 
qs.total_worker_time/qs.execution_count AS [Avg Worker Time], 
qs.max_worker_time AS [Max Worker Time], 
qs.total_elapsed_time/qs.execution_count AS [Avg Elapsed Time], 
qs.max_elapsed_time AS [Max Elapsed Time],
qs.total_logical_reads/qs.execution_count AS [Avg Logical Reads],
qs.max_logical_reads AS [Max Logical Reads], 
qs.execution_count AS [Execution Count], 
qs.creation_time AS [Creation Time],
qp.query_plan AS [Query Plan]
FROM sys.dm_exec_query_stats AS qs WITH (NOLOCK)
CROSS APPLY sys.dm_exec_sql_text(plan_handle) AS t 
CROSS APPLY sys.dm_exec_query_plan(plan_handle) AS qp 
WHERE CAST(query_plan AS NVARCHAR(MAX)) LIKE ('%CONVERT_IMPLICIT%')
 AND t.[dbid] = DB_ID()
ORDER BY qs.total_worker_time DESC OPTION (RECOMPILE);

Let’s be aware, not all implicit convesions are bad. Often implicit conversions lead to scans or seeks. That’s why we should care about implicit conversions.

4. Change compatibility level

After updating our SQL Server, let’s make sure to update the compatibility level of our database to the highest level supported by the current version of our SQL Server.

We can change your SQL Server compatibility level using SQL Server Management Studio or with a query. [Source]

ALTER DATABASE <YourDatabase>
SET COMPATIBILITY_LEVEL = { 150 | 140 | 130 | 120 | 110 | 100 | 90 }

5. Find and Create missing indexes

Let’s create our missing indexes. But, let’s not create them all. Let’s create the first 10 missing indexes in our database and stick to around 5 indexes per table.

We can use the next script to find the missing indexes in our database. [Source] Let’s check the indexes we already have and the estimated impact of the missing indexes. Let’s not blindly follow index recommendations.

SELECT TOP 25
dm_mid.database_id AS DatabaseID,
dm_migs.avg_user_impact*(dm_migs.user_seeks+dm_migs.user_scans) Avg_Estimated_Impact,
dm_migs.last_user_seek AS Last_User_Seek,
OBJECT_NAME(dm_mid.OBJECT_ID,dm_mid.database_id) AS [TableName],
'CREATE INDEX [IX_' + OBJECT_NAME(dm_mid.OBJECT_ID,dm_mid.database_id) + '_'
+ REPLACE(REPLACE(REPLACE(ISNULL(dm_mid.equality_columns,''),', ','_'),'[',''),']','') 
+ CASE
WHEN dm_mid.equality_columns IS NOT NULL
AND dm_mid.inequality_columns IS NOT NULL THEN '_'
ELSE ''
END
+ REPLACE(REPLACE(REPLACE(ISNULL(dm_mid.inequality_columns,''),', ','_'),'[',''),']','')
+ ']'
+ ' ON ' + dm_mid.statement
+ ' (' + ISNULL (dm_mid.equality_columns,'')
+ CASE WHEN dm_mid.equality_columns IS NOT NULL AND dm_mid.inequality_columns 
IS NOT NULL THEN ',' ELSE
'' END
+ ISNULL (dm_mid.inequality_columns, '')
+ ')'
+ ISNULL (' INCLUDE (' + dm_mid.included_columns + ')', '') AS Create_Statement
FROM sys.dm_db_missing_index_groups dm_mig
INNER JOIN sys.dm_db_missing_index_group_stats dm_migs
ON dm_migs.group_handle = dm_mig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details dm_mid
ON dm_mig.index_handle = dm_mid.index_handle
WHERE dm_mid.database_ID = DB_ID()
ORDER BY Avg_Estimated_Impact DESC
GO

6. Delete unused indexes

Indexes reduce perfomance all the time. They reduce performance of inserts, updates, deletes and selects. Even if a query isn’t using an index, it reduces performance of the query.

Let’s delete most our indexes. Let’s identify our “main” table and check if it has more than 5 indexes.

Also, let’s keep in mind if we rebuild an index for a table, SQL Server will remove all plans cached for that table.

Rebuilding indexes is the most expensive way of updating statistics.

We can find our unused indexes with the next script. [Source] Let’s look for indexes with zero seeks or scans and lots of updates. They’re good candidates to drop.

SELECT TOP 25
o.name AS ObjectName
, i.name AS IndexName
, i.index_id AS IndexID
, dm_ius.user_seeks AS UserSeek
, dm_ius.user_scans AS UserScans
, dm_ius.user_lookups AS UserLookups
, dm_ius.user_updates AS UserUpdates
, p.TableRows
, 'DROP INDEX ' + QUOTENAME(i.name)
+ ' ON ' + QUOTENAME(s.name) + '.'
+ QUOTENAME(OBJECT_NAME(dm_ius.OBJECT_ID)) AS 'drop statement'
FROM sys.dm_db_index_usage_stats dm_ius
INNER JOIN sys.indexes i ON i.index_id = dm_ius.index_id 
AND dm_ius.OBJECT_ID = i.OBJECT_ID
INNER JOIN sys.objects o ON dm_ius.OBJECT_ID = o.OBJECT_ID
INNER JOIN sys.schemas s ON o.schema_id = s.schema_id
INNER JOIN (SELECT SUM(p.rows) TableRows, p.index_id, p.OBJECT_ID
FROM sys.partitions p GROUP BY p.index_id, p.OBJECT_ID) p
ON p.index_id = dm_ius.index_id AND dm_ius.OBJECT_ID = p.OBJECT_ID
WHERE OBJECTPROPERTY(dm_ius.OBJECT_ID,'IsUserTable') = 1
AND dm_ius.database_id = DB_ID()
AND i.type_desc = 'nonclustered'
AND i.is_primary_key = 0
AND i.is_unique_constraint = 0
ORDER BY (dm_ius.user_seeks + dm_ius.user_scans + dm_ius.user_lookups) ASC
GO

Voilà! These are six tips I learned from Pinal Dave to start tuning your SQL Server. Let’s pay attention to your implicit conversions. You can get a surprise.

I gained a lot of improvement only by fixing implicit conversions. In a store procedure, we had a NVARCHAR parameter to compare it with a VARCHAR column. Yes, implicit conversions happen between VARCHAR and NVARCHAR.

After taking Brent Ozar’s Mastering courses, I learned to look at the overall SQL Server server health, focusing on the top wait type and the most expensive queries. Also, I started to use some of the stored procedures from the First Responder Kit repository on GitHub instead.

For more SQL and performance tuning content, check don’t use functions around columns in your WHEREs, what implicit conversions are and why you should care and just listen to index recommendations.

Happy SQL time!