Show Your Work: Takeaways

Show Your Work is a New York Times bestseller by Austin Kleon. He describes his book as “a book for people who hate the very idea of self-promotion”. The book tells you how and why you should show your work online. These are my takeaways.

Show Your Work teaches that your work has to be out there: if your work isn’t online, it doesn’t exist. Good work isn’t enough.

Natural History Museum, London, United Kingdom. Photo by Michał Parzuchowski on Unsplash

“Be so good they can’t ignore you” summarizes the purpose of the book. To not be ignored, you have to be findable. Build a routine of sharing. Take advantage of your network. Imagine you don’t need a resume because your next boss already reads your blog.

“Be so good they can’t ignore you.”

-Steve Martin

You don’t have to be a genius

Creativity is about collaboration, too. Find a place where you can share your ideas and let your creativity flourish. Find your creativity circle or “scenius”. Famous musicians and artists were surrounded by other artists who shared, copied, acknowledged, and sparked each other’s ideas.

Ask yourself what you want to learn. And make a commitment to learning in front of others. Share what you love, and the people who love the same thing will find you.

“The minute you learn something, teach it.”

-My favorite quote from Show Your Work

Share something small every day

There is no such thing as overnight success. At the end of the day, see what you can share: inspiration, progress, or something you learned. Share what you think is helpful or entertaining.

Don’t be afraid of sharing your work. 90% of everything is crap. But don’t turn into human spam.

“Stop worrying, start sharing”

-From the Show Your Work official trailer

Open up your cabinet of curiosities

Have a cabinet of curiosities. Before museums existed, people kept a place for the things they loved, usually rare and weird ones. Think of yourself as a collector.

Before you can share your work, you can share your taste. You can share what you read, who you follow, what inspires you. Credit your sources. Don’t share things you can’t credit.

Voilà! These are my takeaways from Show Your Work. This isn’t a programming book, but it has inspired me to continue writing, even when I think nobody is reading.

If you want to read other takeaways, check Clean Coder and Pragmatic Thinking and Learning.

#showyourwork

TIL: How to add gzip compression to ASP.NET Core API responses

Today, I got a report that one API endpoint took minutes to respond. It turned out that it returned hundreds of large complex objects. Those objects contained branding colors, copy text, and hotel configurations in a reservation system. This is how to add response compression in ASP.NET Core 6.0.

To compress responses with ASP.NET Core, register the compression services into the dependencies container with the AddResponseCompression() method and add the compression middleware with UseResponseCompression().

Something like this,

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddControllers();
builder.Services.AddResponseCompression();
//               ^^^^^

var app = builder.Build();
app.UseResponseCompression();
//  ^^^^^
app.MapControllers();
app.Run();

If we don’t specify any compression provider, ASP.NET Core uses the Brotli and Gzip providers by default.

If we want gzip compression, then let’s register the GzipCompressionProvider inside AddResponseCompression() and set its compression level by configuring the GzipCompressionProviderOptions,

using Microsoft.AspNetCore.ResponseCompression;
using System.IO.Compression;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddControllers();

builder.Services.AddResponseCompression(options =>
{
    options.Providers.Add<GzipCompressionProvider>();
    //                ^^^^^
});

builder.Services.Configure<GzipCompressionProviderOptions>(options =>
{
    options.Level = CompressionLevel.Fastest;
    //      ^^^^^
});

var app = builder.Build();
app.UseResponseCompression();
// ^^^^^
app.MapControllers();
app.Run();
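
To check that the compression is working, we can send a request with an Accept-Encoding header and inspect the Content-Encoding header in the response. This is a minimal sketch using HttpClient, assuming our API runs at a hypothetical localhost address,

using System.Net.Http.Headers;

using var client = new HttpClient();
client.DefaultRequestHeaders.AcceptEncoding.Add(new StringWithQualityHeaderValue("gzip"));

// With compression on, the response should show "gzip" in its Content-Encoding header
var response = await client.GetAsync("https://localhost:5001/api/something");
Console.WriteLine(string.Join(", ", response.Content.Headers.ContentEncoding));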

For my slow endpoint, the easiest solution to speed it up was mapping my huge complex object to a new view model that only contained the properties the client side needed. I rolled a simple extension method MapToMyNewSimplifiedViewModel() for that.
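
The extension method was something like this sketch. The property names here are made up for illustration, the real view model kept the branding and configuration values the client needed,

// Hypothetical types for illustration: the real object had many more properties
public record HotelConfiguration(string Name, string PrimaryColor, string CopyText);
public record SimplifiedViewModel(string Name, string PrimaryColor);

public static class HotelConfigurationExtensions
{
    // Keep only the properties the client side needs
    public static SimplifiedViewModel MapToMyNewSimplifiedViewModel(this HotelConfiguration source)
        => new(source.Name, source.PrimaryColor);
}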

Voilà! That’s how to add gzip compression to responses with ASP.NET Core 6.0. That’s what I learned today.

UPDATE (Oct 2023): In previous versions of ASP.NET Core, we needed the Microsoft.AspNetCore.ResponseCompression NuGet package. It’s deprecated. ASP.NET Core has response compression built in now. We don’t need NuGet packages for this.

For more ASP.NET Core content, check how to read configuration values, how to create a caching layer, and how to use background services with Hangfire.

Source: Response compression in ASP.NET Core

Two free tools to format SQL queries

Do you need to format your SQL queries? Are you doing it by hand? Stop! There is a better way!

Instead of formatting SQL queries to follow code conventions by hand, we can use online tools or extensions inside Visual Studio, SQL Server Management Studio, or any other text editor.

These are two free tools to format SQL queries and stored procedures: Poor Man’s T-SQL Formatter for Notepad++, and ApexSQL Refactor for Visual Studio and SQL Server Management Studio.

Before

Before using Poor Man’s T-SQL Formatter and ApexSQL Refactor, I spent too much time formatting queries by hand. I mean making keywords uppercase, aligning columns, and arranging spaces.

I tried to use the “Find and Replace” option inside a text editor. But it only worked for making keywords uppercase. Sometimes, I ended up messing with variables, parameters, and other things inside my queries.

Photo by Kelly Sikkema on Unsplash

Things were worse with long stored procedures. I changed two lines and ended up formatting thousands of lines. “Once you touch it, you’re the owner.”

Let’s format a sample query from StackOverflow

Let’s format the query to find StackOverflow posts with many “thank you” answers.

select
   ParentId as [Post Link],
   count(id)
from posts
where posttypeid = 2 and len(body) <= 200
  and (body like '%hank%')
group by parentid
having count(id) > 1
order by count(id) desc;

After formatting the query to follow Simon Holywell’s SQL Style Guide, it should look like this,

SELECT ParentId AS [Post Link]
     , COUNT(id)
  FROM posts
 WHERE posttypeid = 2
   AND LEN(body) <= 200
   AND (body LIKE '%hank%')
 GROUP BY parentid
HAVING COUNT(id) > 1
 ORDER BY COUNT(id) DESC;

Let’s see how these two tools format our sample query.

1. Poor Man’s T-SQL Formatter

Poor Man’s T-SQL Formatter is a free and open-source .NET and JavaScript library to format your SQL queries. It’s available for Notepad++, Visual Studio, SQL Server Management Studio, and others. We can try it online too.

This is how Poor Man’s T-SQL formatted our sample query in Notepad++.

Sample query formatted by Poor Man's T-SQL inside Notepad++

It doesn’t make function names uppercase. Notice the functions len and count.

Also, it indents AND clauses in the WHERE clause. I want them right-aligned to the previous WHERE. But it’s a good starting point.

Sometimes, it needs a bit of help if the query has single-line comments in it with --.

By the way, it’s better to use /* */ instead of -- for single-line comments inside our queries and stored procedures. It makes formatting easier when we copy queries or statements from our database’s plan cache.
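
For example, if a query gets collapsed into a single line, which happens when copying statements from the plan cache, a -- comment swallows the rest of the statement, but a /* */ comment doesn’t,

-- Collapsed into one line, everything after this comment gets commented out
SELECT ParentId FROM posts;

/* Collapsed into one line, this delimited comment does no harm */
SELECT ParentId FROM posts;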

2. ApexSQL Refactor

ApexSQL Refactor is a free query formatter for Visual Studio and SQL Server Management Studio. It has over 160 formatting options. We can create our own formatting profiles and preview them. It comes with four built-in profiles. Also, we can try it online.

UPDATE (Sept 2023): ApexSQL Refactor isn’t freely available online anymore.

This is how ApexSQL Refactor formatted our sample query in Visual Studio 2019.

Sample query formatted by ApexSQL Refactor inside Visual Studio

It isn’t perfect, either. But, it makes functions uppercase. Point for ApexSQL Refactor.

Also, it indents AND clauses in the WHERE too. I couldn’t find an option to change it. But, there is an option to indent ON in SELECT statements with JOIN. It affects ON for index creation too. We can live with that.

Voilà! Let’s save some time by formatting our SQL queries with either of these two free tools.

We can take it a step further and call a formatter inside a Git hook to automatically format SQL files. I did it with Poor Man’s T-SQL Formatter.

For more content, check my Visual Studio setup for C# and six tips to performance tune our SQL Server.

Happy SQL time!

Six SQL Server performance tuning tips from Pinal Dave

Recently, I’ve needed to optimize some SQL Server queries. I decided to look around for what to do to tune SQL Server and my SQL queries. This is what I found.

At the database level, turn on automatic update of statistics, increase the file size autogrowth and update the compatibility level. At the table level, delete unused indexes and create the missing ones, keeping around 5 indexes per table. And, at the query level, find and fix implicit conversions.

While looking up what I could do to tune my queries, I found Pinal Dave from SQLAuthority. Chances are you have already found one of his blog posts when searching for SQL Server tuning tips. He’s been blogging about the subject for years.

These are six tips from Pinal’s blog and online presentations that I’ve applied recently. Please, let’s test these changes in a development or staging environment before changing anything on our production servers.

1. Enable automatic update of statistics

Let’s turn on the automatic update of statistics. We should turn it off only if we’re updating a really large table during working hours.

This is how to enable automatic creation and update of statistics, [Source]

USE <YourDatabase>;
GO

-- Enable Auto Create of Statistics
ALTER DATABASE <YourDatabase>
SET AUTO_CREATE_STATISTICS ON WITH NO_WAIT;

-- Enable Auto Update of Statistics
ALTER DATABASE <YourDatabase>
SET AUTO_UPDATE_STATISTICS ON WITH NO_WAIT;
GO

-- Update Statistics for whole database
EXEC sp_updatestats
GO

2. Fix File Autogrowth

Let’s set the size and file growth of our database files based on our weekly file growth. Otherwise, let’s change the autogrowth to 200 or 250MB.

From SQL Server Management Studio, to change the file autogrowth:

  1. Let’s go to our database properties and then to Files,
  2. Click on the three dots in the Autogrowth column, and
  3. Change the file growth.
Files page from Database properties in SQL Server Management Studio
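
We can also change the file autogrowth with a query instead of clicking through SQL Server Management Studio. A minimal sketch, assuming our data file’s logical name is YourDataFile,

ALTER DATABASE <YourDatabase>
MODIFY FILE (NAME = N'YourDataFile', FILEGROWTH = 200MB);
GO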

3. Find and Fix Implicit conversions

Implicit conversions happen when SQL Server needs to convert values between two data types in a WHERE or a JOIN.

For example, if we compare an OrderNumber column of type VARCHAR(20) to an INT parameter, SQL Server warns about an implicit conversion.

DECLARE @OrderNumber INT = 123;

/* OrderNumber is a VARCHAR(20) column, but the parameter is an INT,
   so SQL Server converts the column value on every row */
SELECT *
FROM dbo.Orders
WHERE OrderNumber = @OrderNumber;
GO

To run this query, SQL Server has to go through all the rows in the dbo.Orders table to convert the OrderNumber from VARCHAR(20) to INT.

To decide which side of a comparison to convert, SQL Server follows a precedence order between data types. For example, SQL Server always converts a VARCHAR to INT and to NVARCHAR, since those two types have higher precedence.
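
The fix for our sample query is declaring the parameter with the same type as the column, so SQL Server doesn’t have to convert anything,

/* The parameter now matches the column type, no conversion needed */
DECLARE @OrderNumber VARCHAR(20) = '123';

SELECT *
FROM dbo.Orders
WHERE OrderNumber = @OrderNumber;
GO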

This script identifies queries with implicit conversions, [Source]

SELECT TOP(50) DB_NAME(t.[dbid]) AS [Database Name], 
t.text AS [Query Text],
qs.total_worker_time AS [Total Worker Time], 
qs.total_worker_time/qs.execution_count AS [Avg Worker Time], 
qs.max_worker_time AS [Max Worker Time], 
qs.total_elapsed_time/qs.execution_count AS [Avg Elapsed Time], 
qs.max_elapsed_time AS [Max Elapsed Time],
qs.total_logical_reads/qs.execution_count AS [Avg Logical Reads],
qs.max_logical_reads AS [Max Logical Reads], 
qs.execution_count AS [Execution Count], 
qs.creation_time AS [Creation Time],
qp.query_plan AS [Query Plan]
FROM sys.dm_exec_query_stats AS qs WITH (NOLOCK)
CROSS APPLY sys.dm_exec_sql_text(plan_handle) AS t 
CROSS APPLY sys.dm_exec_query_plan(plan_handle) AS qp 
WHERE CAST(query_plan AS NVARCHAR(MAX)) LIKE ('%CONVERT_IMPLICIT%')
 AND t.[dbid] = DB_ID()
ORDER BY qs.total_worker_time DESC OPTION (RECOMPILE);

Let’s be aware that not all implicit conversions are bad. But implicit conversions often lead to index scans instead of seeks. That’s why we should care about them.

4. Change compatibility level

After updating our SQL Server, let’s make sure to update the compatibility level of our database to the highest level supported by the current version of our SQL Server.

We can change our database compatibility level using SQL Server Management Studio or with a query. [Source]

ALTER DATABASE <YourDatabase>
SET COMPATIBILITY_LEVEL = { 150 | 140 | 130 | 120 | 110 | 100 | 90 }

5. Find and Create missing indexes

Let’s create our missing indexes. But, let’s not create them all. Let’s create the first 10 missing indexes in our database and stick to around 5 indexes per table.

We can use the next script to find the missing indexes in our database. [Source] Let’s check the indexes we already have and the estimated impact of the missing indexes. Let’s not blindly follow index recommendations.

SELECT TOP 25
dm_mid.database_id AS DatabaseID,
dm_migs.avg_user_impact*(dm_migs.user_seeks+dm_migs.user_scans) Avg_Estimated_Impact,
dm_migs.last_user_seek AS Last_User_Seek,
OBJECT_NAME(dm_mid.OBJECT_ID,dm_mid.database_id) AS [TableName],
'CREATE INDEX [IX_' + OBJECT_NAME(dm_mid.OBJECT_ID,dm_mid.database_id) + '_'
+ REPLACE(REPLACE(REPLACE(ISNULL(dm_mid.equality_columns,''),', ','_'),'[',''),']','') 
+ CASE
WHEN dm_mid.equality_columns IS NOT NULL
AND dm_mid.inequality_columns IS NOT NULL THEN '_'
ELSE ''
END
+ REPLACE(REPLACE(REPLACE(ISNULL(dm_mid.inequality_columns,''),', ','_'),'[',''),']','')
+ ']'
+ ' ON ' + dm_mid.statement
+ ' (' + ISNULL (dm_mid.equality_columns,'')
+ CASE WHEN dm_mid.equality_columns IS NOT NULL AND dm_mid.inequality_columns 
IS NOT NULL THEN ',' ELSE
'' END
+ ISNULL (dm_mid.inequality_columns, '')
+ ')'
+ ISNULL (' INCLUDE (' + dm_mid.included_columns + ')', '') AS Create_Statement
FROM sys.dm_db_missing_index_groups dm_mig
INNER JOIN sys.dm_db_missing_index_group_stats dm_migs
ON dm_migs.group_handle = dm_mig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details dm_mid
ON dm_mig.index_handle = dm_mid.index_handle
WHERE dm_mid.database_ID = DB_ID()
ORDER BY Avg_Estimated_Impact DESC
GO

6. Delete unused indexes

Indexes reduce performance all the time. They reduce the performance of inserts, updates, deletes, and even selects. Even if a query isn’t using an index, that index reduces the performance of the query.

Let’s delete most of our indexes. Let’s identify our “main” table and check if it has more than 5 indexes.
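
To spot tables with too many indexes, we can count indexes per user table. This is a quick query I sketched for that, not one of Pinal’s scripts,

SELECT OBJECT_NAME(i.object_id) AS TableName
     , COUNT(*) AS IndexCount
FROM sys.indexes i
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
  AND i.index_id > 0 /* Skip heaps */
GROUP BY i.object_id
HAVING COUNT(*) > 5
ORDER BY IndexCount DESC;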

Also, let’s keep in mind that if we rebuild an index for a table, SQL Server will remove all cached plans for that table.

Rebuilding indexes is the most expensive way of updating statistics.
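
If all we need is fresh statistics, we can update them directly instead of rebuilding the index. A minimal example, with a hypothetical dbo.Orders table,

UPDATE STATISTICS dbo.Orders WITH FULLSCAN;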

We can find our unused indexes with the next script. [Source] Let’s look for indexes with zero seeks or scans and lots of updates. They’re good candidates to drop.

SELECT TOP 25
o.name AS ObjectName
, i.name AS IndexName
, i.index_id AS IndexID
, dm_ius.user_seeks AS UserSeek
, dm_ius.user_scans AS UserScans
, dm_ius.user_lookups AS UserLookups
, dm_ius.user_updates AS UserUpdates
, p.TableRows
, 'DROP INDEX ' + QUOTENAME(i.name)
+ ' ON ' + QUOTENAME(s.name) + '.'
+ QUOTENAME(OBJECT_NAME(dm_ius.OBJECT_ID)) AS 'drop statement'
FROM sys.dm_db_index_usage_stats dm_ius
INNER JOIN sys.indexes i ON i.index_id = dm_ius.index_id 
AND dm_ius.OBJECT_ID = i.OBJECT_ID
INNER JOIN sys.objects o ON dm_ius.OBJECT_ID = o.OBJECT_ID
INNER JOIN sys.schemas s ON o.schema_id = s.schema_id
INNER JOIN (SELECT SUM(p.rows) TableRows, p.index_id, p.OBJECT_ID
FROM sys.partitions p GROUP BY p.index_id, p.OBJECT_ID) p
ON p.index_id = dm_ius.index_id AND dm_ius.OBJECT_ID = p.OBJECT_ID
WHERE OBJECTPROPERTY(dm_ius.OBJECT_ID,'IsUserTable') = 1
AND dm_ius.database_id = DB_ID()
AND i.type_desc = 'nonclustered'
AND i.is_primary_key = 0
AND i.is_unique_constraint = 0
ORDER BY (dm_ius.user_seeks + dm_ius.user_scans + dm_ius.user_lookups) ASC
GO

Voilà! These are six tips I learned from Pinal Dave to start tuning our SQL Server. Let’s pay attention to implicit conversions. We can get a surprise.

I gained a lot of improvement only by fixing implicit conversions. In a stored procedure, we compared an NVARCHAR parameter with a VARCHAR column. Yes, implicit conversions happen between VARCHAR and NVARCHAR.
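
A minimal example of that surprise, reusing the OrderNumber column from before, a VARCHAR(20),

/* NVARCHAR has higher precedence than VARCHAR, so SQL Server
   converts the OrderNumber column, not the parameter */
DECLARE @OrderNumber NVARCHAR(20) = N'123';

SELECT *
FROM dbo.Orders
WHERE OrderNumber = @OrderNumber;
GO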

After taking Brent Ozar’s Mastering courses, I learned to look at the overall SQL Server health, focusing on the top wait type and the most expensive queries. Also, I started to use some of the stored procedures from the First Responder Kit repository on GitHub instead.

For more SQL and performance tuning content, check don’t use functions around columns in your WHEREs, what implicit conversions are and why you should care and just listen to index recommendations.

Happy SQL time!

BugOfTheDay: The slow room search

Another day at work! This time, the room search was running slow. For one of the big hotels, searching all available rooms in a week took about 15 seconds. This is how I optimized the room search functionality.

The room search was a public page to book a hotel room without using any external booking system, like a custom Booking.com page. It used an ASP.NET Core API project to combine data from different internal microservices to display the calendar, room images, and prices of a hotel.

1. Room type details

At first glance, I found an N+1 query problem. This is a common anti-pattern: the code calls the database once per element of an input set to find more details about each item.

This N+1 problem was in the code to find the details of each room type. The code looked something like this:

public async Task<IEnumerable<RoomTypeViewModel>> GetRoomTypesAsync(int hotelId, IEnumerable<int> roomTypeIds)
{
    var tasks = roomTypeIds.Select(roomTypeId => GetRoomTypeAsync(hotelId, roomTypeId));
    //                      ^^^^^
    var results = await Task.WhenAll(tasks);
    return results;
}

public async Task<RoomTypeViewModel> GetRoomTypeAsync(int hotelId, int roomTypeId)
{
    var endpoint = BuildEndpoint(hotelId, roomTypeId);
    var roomClass = await _apiClient.GetJsonAsync<RoomTypeViewModel>(endpoint);
    //                               ^^^^^
    return roomClass;
}

These two methods made a request per room type found, instead of searching for more than one room type in a single call. I decided to create a new method that received a list of room types.

I cloned the existing method in the appropriate microservice and renamed it. The new method received an array of room types. Also, I removed all unneeded queries from the stored procedure it used. The stored procedure returned three result sets, but the room search only cared about one.

Before any change, it took ~4 seconds to find a single room type. But, with the new method, it took ~600ms to find more than one room type in a single request. The hotel facing the problem had about 30 different room types. Hurray!

After this first change, the room search made a single request to find all room types.

public async Task<IEnumerable<RoomTypeViewModel>> GetRoomTypesAsync(int hotelId, IEnumerable<int> roomTypeIds)
{
    var endpoint = BuildEndpoint(hotelId);
    var roomTypes = await _apiClient.PostJsonAsync<object, IEnumerable<RoomTypeViewModel>>(endpoint,  roomTypeIds);
    return roomTypes;
}

But, when calling the room search from Postman, the execution time didn’t seem to improve at all. These were some of the times for the room search for one week. What went wrong?

Room Type   Time in seconds
Before      16.95  18.39  17.13
After       19.48  17.61  18.65

Encuentro Guadalupe, El Porvenir, Mexico. Photo by Manuel Moreno on Unsplash

2. Premature optimization

To check what happened, I stepped back and went to the room search method again. The room search method looked for all available rooms, the best rates and any restrictions to book a room. This method was something like this:

[Authorize]
[HttpGet]
public async Task<IActionResult> RoomSearchAsync([FromQuery] RoomSearchRequest request)
{   
    var hotelTimeZone = await GetHotelTimeZoneAsync(/* some parameters */);

    var roomListTask = GetRoomListAsync(/* some parameters */);
    var ratesTask = GetRatesAsync(/* some parameters */);
    var rulesTask = GetRulesAsync(/* some parameters */);

    await Task.WhenAll(roomListTask, ratesTask, rulesTask);

    var roomList = await roomListTask;
    var rates = await ratesTask;
    var rules = await rulesTask;

    var roomSearchResults = await PrepareRoomSearchResultsAsync(rates, /* some parameters */);

    var result = new RoomSearchResponse
    {
        Rooms = roomSearchResults,
        Rules = rules
    };

    return Ok(result);
}

To find any bottlenecks, I wrapped some parts of the code with the Stopwatch class and logged their elapsed times.
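
The wrapping wasn’t fancy. It was something like this sketch, assuming an injected _logger,

using System.Diagnostics;

var stopwatch = Stopwatch.StartNew();
var rates = await GetRatesAsync(/* some parameters */);
stopwatch.Stop();
// Log the method name and its elapsed time to spot the slowest parts
_logger.LogInformation("GetRatesAsync: {ElapsedMilliseconds}ms", stopwatch.ElapsedMilliseconds);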

These are the log messages with the execution times before any change:

GetHotelTimeZoneAsync: 486ms
Task.WhenAll: 9641ms
    GetRatesAsync: 9588ms
    GetRoomListAsync: 7008ms
        FindAvailableRoomsAsync: 2792ms
        GetRoomTypesAsync:       4204ms
                                 ^^^^^
    GetRulesAsync: 3030ms
GetRoomTypeGaleriesAsync: 8228ms

But, the log after using the new method for room type details showed why it didn’t seem to improve. The log looked like this:

GetHotelTimeZoneAsync: 616ms
Task.WhenAll: 8726ms
    GetRatesAsync: 8667ms
    GetRoomListAsync: 4171ms
        FindAvailableRoomsAsync: 3602ms
        GetRoomTypesAsync:        559ms
                                  ^^^^^
    GetRulesAsync: 4223ms
GetRoomTypeGaleriesAsync: 11486ms

The GetRoomTypesAsync method ran concurrently with the GetRatesAsync method. And GetRatesAsync was the slowest of the three methods inside the Task.WhenAll. That’s why there was no noticeable improvement, even though the time of the room type call dropped from 4204ms to 559ms.

“Premature optimization is the root of all evil”

-Donald Knuth

I was looking in the wrong place! I rushed to optimize without measuring anything. Lesson learned! I needed to start working on either GetHotelTimeZoneAsync or GetRoomTypeGaleriesAsync.

This time, to gain a noticeable improvement, I moved to the GetRoomTypeGaleriesAsync method. Again, this method called another microservice. The code looked like this:

public async Task<IEnumerable<RoomGallery>> GetRoomGalleriesAsync(IEnumerable<int> roomTypeIds)
{
    var roomGalleries = await _roomRepository.GetRoomImagesAsync(roomTypeIds);
    if (!roomGalleries.Any())
    {
        return Enumerable.Empty<RoomGallery>();
    }

    var hotelId = roomGalleries.First().HotelId;
    var roomGalleriesTasks = roomGalleries
        .Select(rg => Task.Run(()
        // ^^^^^
            => MapToRoomGallery(rg.RoomTypeId, hotelId, roomGalleries)));

    return (await Task.WhenAll(roomGalleriesTasks)).AsEnumerable();
}

private RoomGallery MapToRoomGallery(int roomTypeId, int hotelId, IEnumerable<RoomImageInfo> roomGalleries)
{
    return new RoomGallery
    {
        RoomTypeId = roomTypeId,
        HotelId = hotelId,
        Images = roomGalleries.Where(p => p.RoomTypeId == roomTypeId)
        //                     ^^^^^
            .OrderBy(x => x.SortOrder)
            .Select(r => new Image
            {
                // Some mapping code here...
            })
    };
}

The MapToRoomGallery method was the problem. It filtered the whole collection of images, roomGalleries, for every element. Basically, it was a nested loop over the same collection, an O(nm) operation. Also, since all the code was synchronous, there was no need for Task.Run and Task.WhenAll.

To fix this problem, I grouped the images by room type first. And then, I passed a filtered collection to the mapping method, MapToRoomGallery.

public async Task<IEnumerable<RoomGallery>> GetRoomGalleriesAsync(IEnumerable<int> roomTypeIds)
{
    var images = await _roomRepository.GetRoomImagesAsync(roomTypeIds);
    if (!images.Any())
    {
        return Enumerable.Empty<RoomGallery>();
    }

    var hotelId = images.First().HotelId;
    var imagesByRoomTypeId = images.GroupBy(t => t.RoomTypeId, (key, result) => new { RoomTypeId = key, Images = result });
    //                              ^^^^^
    var roomGalleries = imagesByRoomTypeId.Select(rg =>
    {
        return MapToRoomGallery(rg.RoomTypeId, hotelId, rg.Images);
    });
    return roomGalleries;
}

private RoomGallery MapToRoomGallery(int roomTypeId, int hotelId, IEnumerable<RoomImageInfo> roomGalleries)
{
    return new RoomGallery
    {
        RoomTypeId = roomTypeId,
        HotelId = hotelId,
        Images = roomGalleries.OrderBy(x => x.SortOrder)
        //                     ^^^^^
            .Select(r => new Image
            {
                // Some mapping code here
            })
    };
}

After changing those three lines of code, the image gallery times went from ~4sec to ~900ms. And the initial room search improved by ~2-3sec. The hotel with the slowness issue had about 70 images. It was a step in the right direction.

These are the times of three requests to the initial room search using Postman:

Room Gallery   Time in seconds
Before         13.96  13.49  17.64
After          11.74  11.19  11.23

When checking the log, the room search showed a noticeable improvement in the GetRoomTypeGaleriesAsync method: from ~8-11sec to ~3-4sec. Only by changing three lines of code.

GetHotelTimeZoneAsync: 182ms
Task.WhenAll: 8349ms
    GetRatesAsync: 8342ms
    GetRoomListAsync: 2886ms
        FindAvailableRoomsAsync: 2618ms
        GetRoomTypesAsync:        263ms
    GetRulesAsync: 2376ms
GetRoomTypeGaleriesAsync: 3586ms
                          ^^^^^ It was ~11sec

3. Room rates

To make things faster, I needed to tackle the slowest of the methods inside the Task.WhenAll, the GetRatesAsync method. It looked like this:

protected async Task<IEnumerable<BestRateViewModel>> GetRatesAsync(int hotelId, /* some other parameters */)
{
    var bestRatesRequest = new BestRatesRequest
    {
        // Some mapping code here
    };
    var bestRates = await _rateService.GetBestRatesAsync(bestRatesRequest);
    if (!bestRates.Any())
    {
        return Enumerable.Empty<BestRateViewModel>();
    }

    await UpdateRateDetailsAsync(bestRates, rateIds);
    await UpdatePackagesAsync(bestRates, hotelId);

    return bestRates;
}

Also, I logged the execution times of three more methods inside GetRatesAsync. The log showed these entries:

GetHotelTimeZoneAsync: 530ms
Task.WhenAll: 12075ms
    GetRatesAsync: 12060ms
        GetBestRates: 5075ms
        UpdateRateDetailsAsync: 5741ms
                                ^^^^^
        UpdatePackagesAsync: 1222ms
    GetRoomListAsync: 3774ms
        FindAvailableRoomsAsync: 2555ms
        GetRoomTypesAsync:       1209ms
    GetRulesAsync: 2772ms
GetRoomTypeGaleriesAsync: 4851ms

Among the inner methods was the UpdateRateDetailsAsync method. This method called an endpoint in another microservice for rate details. The method was something like this:

[HttpPost]
public async Task<IEnumerable<RateDetailViewModel>> GetRateDetailsAsync([FromBody] int[] rateIds)
{    
    var results = new List<RateDetailViewModel>();
    foreach (var rateId in rateIds)
    // ^^^^^
    {
        if (rateId <= 0)
        {
            results.Add(new RateDetailViewModel
            {
                RateId = rateId,
                Error = $"Invalid Rate Id: {rateId}"
            });
            continue;
        }
        
        try
        {
            var result = await _rateService.GetRateDetailAsync(rateId);
            //                              ^^^^^
            results.Add(new RateDetailViewModel
            {
                RateId = rateId,
                Data = result
            });
        }
        catch (Exception e)
        {
            results.Add(new RateDetailViewModel
            {
                RateId = rateId,
                Error = e.Message
            });
        }
    }

    return results;
}

Again, the N+1 query anti-pattern. Arrgggg! The hotel with the slowness issue had ~10 rates, which meant 10 separate database calls. To fix this issue, I changed the code to validate all input ids and return early if there weren’t any valid ids left to query the database with. Then, I made a single request to the database. It either succeeded or failed for all the valid input ids.

[HttpPost]
public async Task<IEnumerable<RateDetailViewModel>> GetRateDetailsAsync([FromBody] int[] rateIds)
{
    var results = rateIds.Where(rateId => rateId <= 0)
                        .Select(rateId => new RateDetailViewModel
                        {
                            RateId = rateId,
                            Error = $"Invalid Rate Id: {rateId}"
                        });
    if (results.Count() == rateIds.Length)
    // ^^^^^
    {
        return results;
    }

    var validRateIds = rateIds.Where(rateId => rateId > 0);

    try
    {
        var details = await _rateService.GetRateDetailsAsync(validRateIds);
        //                               ^^^^^
        results = results.Concat(BuildRateDetailViewModels(validRateIds, details));
    }
    catch (Exception e)
    {
        var resultViewModels = validRateIds.Select(rateId => new RateDetailViewModel
        {
            RateId = rateId,
            Error = e.Message
        });
        results = results.Concat(resultViewModels);
    }

    return results;
}

After removing this N+1 problem, the execution time of getting details for ~10 rates went from ~1.6s to ~200-300ms. For three consecutive calls, these were the times of calling the modified rate method from Postman:

Room Rates   Time in milliseconds
Before       1402  1868  1201
After        198   272   208

Also, the room search improved by ~2-3sec for a 1-week range. The log showed these improvements too.

GetHotelTimeZoneAsync: 194ms
Task.WhenAll: 8349ms
    GetRatesAsync: 10838ms
        GetBestRates: 8072ms
        UpdateRateDetailsAsync: 2198ms
                                ^^^^^ It was ~5sec
        UpdatePackagesAsync: 557ms
    GetRoomListAsync: 3207ms
        FindAvailableRoomsAsync: 2982ms
        GetRoomTypesAsync:        218ms
    GetRulesAsync: 3084ms
GetRoomTypeGaleriesAsync: 6370ms

It was the last low-hanging fruit I addressed. After the above three changes, the initial room search went from ~15-18sec to ~10-14sec. It was ~5 seconds faster.

These were the times of three requests to the room search after all these changes:

All changes   Time in seconds
Before        16.95  18.39  17.13  15.36
After         11.36  10.46  14.62  10.38

Conclusion

Voilà! That’s how I optimized the room search functionality. Five seconds faster doesn’t seem like much. But that’s the difference between someone booking a room in a hotel and someone leaving the page to find another hotel.

From this task, I learned two things. First, don’t assume a bottleneck is here or there until you measure it. Second, avoid the N+1 query anti-pattern and nested loops over large collections.

I didn’t mess with any stored procedure or SQL query trying to optimize it. But I had some metrics in place and identified which stored procedures to tune.

To find the bottlenecks, I took the simplest route: wrapping methods with a Stopwatch. Next time, I will use an alternative like MiniProfiler.
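
With MiniProfiler, the same timing would look something like this sketch, assuming the MiniProfiler.AspNetCore.Mvc NuGet package is installed and wired up,

using StackExchange.Profiling;

// Each Step shows up as a timing entry in MiniProfiler's results
using (MiniProfiler.Current.Step("GetRatesAsync"))
{
    var rates = await GetRatesAsync(/* some parameters */);
}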

The next step I tried was to cache the hotel timezone and other details. Until a hotel changes its address, its timezone won’t change. You can take a look at my post on how to add a caching layer with Redis and ASP.NET Core for more details.

Happy coding!