What are implicit conversions and why you should care

SQL Server compares columns and parameters with the same data types. But, if the two data types are different, weird things happen. Let’s see what implicit conversions are and why we should care.

An implicit conversion happens when the data types of columns and parameters in comparisons are different. And SQL Server has to convert between them, following type precedence rules. Often, implicit conversions lead to unneeded index or table scans.

An implicit conversion that scans

Let’s see an implicit conversion! For this, let’s create a new table from the StackOverflow Users table. But, this time, let’s change the Location data type from NVARCHAR to VARCHAR.

USE StackOverflow2013;
GO

CREATE TABLE dbo.Users_Varchar (Id INT PRIMARY KEY CLUSTERED, Location VARCHAR(100));
INSERT INTO dbo.Users_Varchar (Id, Location)
  SELECT Id, Location
  FROM dbo.Users;
GO

CREATE INDEX Location ON dbo.Users_Varchar(Location);
GO

Let’s find all users from Colombia. To prove a point, let’s query the dbo.Users_Varchar table instead of the original dbo.Users table.

DECLARE @Location NVARCHAR(20) = N'Colombia';

SELECT Id, Location
FROM dbo.Users_Varchar
/* The column is VARCHAR, but the parameter NVARCHAR */
WHERE Location = @Location;
GO

Notice we have declared @Location as NVARCHAR. We have a type mismatch between the column and the variable.

Let’s see the execution plan.

Execution plan of finding all StackOverflow users from Colombia
StackOverflow users from Colombia

SQL Server had to scan the index on Location. But, why?

Warning sign on execution plan of finding all StackOverflow users from Colombia
Warning sign on SELECT operator

Let’s notice the warning sign on the execution plan. When we hover over it, it shows the cause: SQL Server had to convert the two types. Yes! SQL Server converted between VARCHAR and NVARCHAR.

SQL Server data type precedence

To determine what types to convert, SQL Server follows a data type precedence order. This is a short version:

  1. datetimeoffset
  2. datetime2
  3. datetime
  4. smalldatetime
  5. date
  6. time
  7. decimal
  8. bigint
  9. int
  10. timestamp
  11. uniqueidentifier
  12. nvarchar (including nvarchar(max))
  13. varchar (including varchar(max))

SQL Server converts “lower” types to “higher” types. We don’t need to memorize this order. Let’s just remember that VARCHAR sits at the bottom of this short list, so SQL Server always converts VARCHAR to the other type in the comparison.

For the complete list of type precedence between all data types, check Microsoft docs on SQL Server data type precedence.
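To make the rule concrete, here is a minimal Python sketch of the short list above. It is only a toy model, not how SQL Server implements precedence, and the converted_side function is a made-up name for illustration. Given the type of the column and the type of the parameter, it tells which side gets converted.

```python
# Short precedence list from the post: earlier entries have higher precedence.
PRECEDENCE = [
    "datetimeoffset", "datetime2", "datetime", "smalldatetime", "date",
    "time", "decimal", "bigint", "int", "timestamp", "uniqueidentifier",
    "nvarchar", "varchar",
]
RANK = {name: position for position, name in enumerate(PRECEDENCE)}


def converted_side(column_type: str, parameter_type: str) -> str:
    """Hypothetical helper: return 'column', 'parameter' or 'none'.

    The side with the lower-precedence type is converted to the other type.
    """
    column_rank = RANK[column_type.lower()]
    parameter_rank = RANK[parameter_type.lower()]
    if column_rank == parameter_rank:
        return "none"
    # A larger rank means a lower precedence, so that side gets converted.
    return "column" if column_rank > parameter_rank else "parameter"


print(converted_side("varchar", "nvarchar"))  # column
print(converted_side("nvarchar", "varchar"))  # parameter
```

When the answer is "column", SQL Server has to convert every value of the column before comparing it. That is the case to watch for.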

An implicit conversion that seeks

In our example, the VARCHAR type was on the column, on the left side of the comparison in the WHERE. It means SQL Server had to read the whole content of the index to convert and then compare. That’s more than 2 million rows. That’s why the index scan.

Let’s use the original dbo.Users table with Location as NVARCHAR and repeat the query. This time, switching the variable type to VARCHAR. What would be different?

DECLARE @Location VARCHAR(20) = 'Colombia';

SELECT Id, Location
/* We're filtering on the original Users table */
FROM dbo.Users
/* This time, the column is NVARCHAR, but the parameter VARCHAR */
WHERE Location = @Location;
GO

Now, the VARCHAR type is on the right of the comparison. It means SQL Server has to do one single conversion: the parameter.

Execution plan of finding all StackOverflow users from Colombia
Index Seek on Location index when finding all users from Colombia

This time we don’t have a yellow bang on our execution plan. And, we have an Index Seek. Not all implicit conversions are bad.

In stored procedures and queries, use input parameters with the same types as the columns on the tables.

To identify which queries on your SQL Server have implicit conversion issues, we can use the third query from these six performance tuning tips from Pinal Dave. But, after taking Brent Ozar’s Mastering courses, I learned to start working with the most expensive queries instead of jumping to queries with implicit conversion issues right away.

Voilà! Those are implicit conversions and why you should care. Let’s use input parameters with the right data types on our queries and stored procedures. Otherwise, we will pay the performance penalty of converting and comparing types. Implicit conversions are like functions around columns, implicitly added by SQL Server itself.

Let’s remember that not all implicit conversions are bad. When looking at execution plans, let’s check how many rows SQL Server reads to convert and compare things.

For more content on SQL Server, check how to compare datetimes without the time part, how to write case-sensitive searches and how to optimize queries with GROUP BY.

Happy coding!

Don't use functions around columns in your WHEREs: The worst T-SQL mistake

There’s one thing we could do to write faster queries in SQL Server: don’t use functions around columns in WHERE clauses. I learned it the hard way. Let me share this lesson with you.

Don’t use user-defined or built-in functions around columns in the WHERE clause of queries. It prevents SQL Server from estimating the right amount of rows out of the function. Write queries with operators and comparisons to make SQL Server better use the indexes it has.

With functions around columns in WHEREs

To prove this point, let’s query a local copy of the StackOverflow database. Yes, the StackOverflow we all know and use.

StackOverflow has a Users table that contains, well…, all registered users and their profiles. Among other things, every user has a display name, location, and reputation.

Let’s find the first 50 users by reputation in Colombia.

To make things faster, let’s create an index on the Location field. It’s an NVARCHAR(100) column.

CREATE INDEX Location ON dbo.Users(Location);

This is the query we often write,

DECLARE @Location NVARCHAR(20) = N'Colombia';

SELECT TOP 50 DisplayName, Location, CreationDate, Reputation
FROM dbo.Users
-- Often, we put LOWER on both sides of the comparison
WHERE LOWER(Location) = LOWER(@Location)
--    ^^^^^             ^^^^^
ORDER BY Reputation DESC;
GO

Did you notice the LOWER function on both sides of the equal sign?

We all have written queries like that one. I declare myself guilty too. Often, we use LOWER and UPPER or wrap the column in RTRIM and LTRIM.

But, let’s see what happened in the Execution Plan.

Execution plan of finding the first 50 StackOverflow users in Colombia
First 50 StackOverflow users in Colombia by reputation

Here, SQL Server chose to scan the index Location first. And let’s notice the width of the arrow coming out of the first operator. When we place the cursor on it, it shows “Number of Rows Read.”

Rows read by Index Scan when finding StackOverflow users
Rows read by Index Scan on Location index

In this copy of the StackOverflow database, there are 2,465,713 users, only 463 of them living in Colombia. SQL Server had to read the whole content of the index to execute our query. Arrrggg!

It means that to find all users in Colombia, SQL Server had to go through all users in the index. We could use that index in a better way.

Write queries with comparisons, functions, and operators around parameters. This way SQL Server could properly use indexes and have better estimates of the contents of tables. But, don’t write functions around columns in the WHERE clauses.
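To picture the difference, here is a small Python sketch that treats a sorted list as a toy version of the Location index. It is only an analogy under my own assumptions, not SQL Server internals, and the function names and sample locations are invented. With a function around the "column", we have to visit every entry. Without it, a binary search jumps straight to the matching range.

```python
from bisect import bisect_left, bisect_right

# A toy "index": locations kept in sorted order, like the index on Location.
location_index = sorted(["Bogota", "Colombia", "Colombia", "France", "Peru", "Spain"])


def scan_with_function(index, wanted):
    """Like LOWER(Location) = LOWER(@Location): every entry is read and transformed."""
    rows_read = 0
    matches = []
    for value in index:
        rows_read += 1
        if value.lower() == wanted.lower():
            matches.append(value)
    return matches, rows_read


def seek_without_function(index, wanted):
    """Like Location = @Location: binary search finds the matching range directly."""
    start = bisect_left(index, wanted)
    end = bisect_right(index, wanted)
    matches = index[start:end]
    # Only the matching entries are read (the few search steps are not counted).
    return matches, len(matches)


print(scan_with_function(location_index, "Colombia"))     # (['Colombia', 'Colombia'], 6)
print(seek_without_function(location_index, "Colombia"))  # (['Colombia', 'Colombia'], 2)
```

Both calls return the same rows. The only difference is how many entries we had to touch to find them. Unlike SQL Server with its default settings, this Python comparison is case sensitive, so the analogy only covers the access pattern.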

The same is true when joining tables. Let’s not put functions around the foreign keys in our JOINs either.

Rewrite your queries to avoid functions around columns in WHERE

Let’s go back and rewrite our query without any functions wrapping columns. This way,

DECLARE @Location NVARCHAR(20) = N'Colombia';

SELECT TOP 50 DisplayName, Location, CreationDate, Reputation
FROM dbo.Users
-- We remove LOWER on both sides
WHERE Location = @Location
ORDER BY Reputation DESC;
GO

And, let’s check the execution plan again.

First 50 StackOverflow users in Colombia
Again, first 50 StackOverflow users in Colombia by reputation

This time, SQL Server used an Index Seek. It means SQL Server didn’t have to read the whole content of the Location index to run our query. And the execution plan didn’t go parallel. We don’t have the black arrows in yellow circles in the operators.

Let’s notice that this time we have a thinner arrow on the first operator. Let’s see how many rows SQL Server read this time.

Rows read by Index Seek on Location index

After that change, SQL Server only read 463 records. That was way better than reading the whole index.

Voilà! If we want to write faster queries, let’s stop using functions around columns in our WHEREs. That screws up SQL Server’s estimates. For example, let’s not use LOWER or UPPER around our columns. By the way, SQL Server string searches are case insensitive by default, so we don’t need those functions at all.

By the way, this is the same mistake we make when comparing DateTime in our SQL queries.

To read more SQL Server content, check what implicit conversions are and why you should care, how to optimize queries with GROUP BY, and just listen to SQL Server index recommendations.

Happy SQL time!

Monday Links: Better programming, flags, and C#

This episode of Monday Links is a bit diverse. From getting better at programming to why it’s harder for juniors to get hired. And, a rant about the C# evolution.

I couldn’t avoid adding the last article to this Monday Links. I try to only share five articles. But, that’s a great story of establishing priorities and putting life, health, and work in a balance. Enjoy!

A Path to Better Programming by Robert “Uncle Bob” Martin and Allen Holub

This is a conversation between Uncle Bob and Allen Holub (I didn’t know about Allen before this conversation) about principles, isolation, mob programming, and leadership.

By the way, the military has been leading large companies for years. How do they do it? The answer mentioned in the video is to empower people to do things. But there’s a whole book on the subject.

“If it’s a good idea, the good idea tends to spread, unless there’s somebody at the management level working hard to make it not spread”

Allen Holub

Also, I didn’t know there was a “Clean Agile” book. I would like to see what Agile meant for the creators and how it differs from what we call “Agile” today. Watch full video

Why flags do not represent languages

These days I needed to translate my Unit Testing 101 workshop instructions to Spanish. And call it fate or not, I found this article about not using flags to represent languages and what to do instead. How often do you see the US flag on web pages to signal English? Read full article

President and Mrs. Coolidge attended Thanksgiving Day service
There were days without daily meetings. Photo by Library of Congress on Unsplash

Is C# Getting Too Complex?

I like the evolution of C# as a language. I wrote about some of the newest and coolest C# features. But, I don’t like some of the new features. Some features, quoting the article, “just add to the complexity of the language without bringing anything to the table besides saving a few lines of code or a couple of curly braces.” Some features make the language less consistent. I share the main point of this article. Read full article.

Read the source code of your dependencies

I first heard about the concept of reading code as a learning exercise from a friend. To learn how star developers code, read their code. This post takes this concept to the next level. Use only dependencies whose source code you’re able to read. Read full article.

Why nobody hires junior developers and what happens next

Another consequence of the pandemic. Nobody hires juniors. One thing I’ve noticed is companies don’t have clear career plans and salary policies for developers. Someone starts in a company and after a few years, what? Close tickets forever? Work extra hours to get noticed and promoted? That’s why juniors, as soon as they feel more confident, move to a better-paid place. Brain drain is expensive. Read full article

Fetch the bolt cutters

Maybe I don’t have serious eye strain or carpal tunnel syndrome (yet?). But, I can totally relate to this story. And, it’s something more frequent these days. One quote that made me say wow and give a standing ovation: “In a choice between a job and me, I’m always going to choose me”. Such an amazing way to finish that story. Read full article

Voilà! Another six reads! Do you also think C# is getting too complex? Have you seen a clear career path at companies where you’ve been?

Are you interested in unit testing? Grab a free copy of my ebook Unit Testing 101. Don’t miss the previous Monday Links on Workplaces, studying and communication.

Happy reading!

Best of 2021

In 2021, I switched from posting whenever I had an idea about anything to posting regularly every other week about some topics.

I wrote a whole series of posts about unit testing. From how to write your first unit test with MSTest to what stubs and mocks are. In fact, I wrote my first ebook Unit Testing 101 with some of the posts of the series. By the way, you can download it for free. No email asked.

These are the 5 posts I wrote in 2021 you read the most. In case you missed any of them, here they are:

  1. How to name your unit tests. 4 test naming conventions. This post is part of the Unit Testing 101 series. Four ideas to better name our unit tests. All of them, to write names easy to understand.
  2. What are fakes in unit testing: mocks vs stubs. Again, part of Unit Testing 101. This post shows what fakes, stubs, and mocks are. It might sound difficult, but it isn’t.
  3. TIL: How to convert 2-digit year to 4-digit year in C#. Last year, I worked with a Stripe-powered payment module in a reservation management system. And, I needed to parse card expiration dates from a 2-digit year to a 4-digit year. I almost just added 2000 to them. But I didn’t. This is what I found and did instead.
  4. Decorator pattern. A real example in C#. My favorite pattern. This post shows how to implement a retry mechanism on top of Stripe API client.
  5. A quick guide to LINQ with examples. I wrote this post to answer a friend. All you need to know to start working with LINQ in 15 minutes or less.

Voilà! These were your 5 favorite posts. Hope you enjoy them. Probably, you found digests of these posts on my dev.to account too.

Thanks for reading, and happy coding in 2022!

The Art of Readable Code: Takeaways

The Art of Readable Code is the perfect companion for Clean Code. It contains simple and practical tips to improve your code at the function level. It isn’t as dogmatic as Clean Code. Tips aren’t as strict as the ones from Clean Code. But, it still deserves to be read.

These are some of my notes on The Art of Readable Code.

1. Code should be easy to understand

Code should be written to minimize the time for someone else to understand it. Here, understanding means solving errors, spotting bugs, and making changes.

It’s good to write compact code. But, compact doesn’t always mean more readable.

assert((!(bucket = FindBucket(key))) || !bucket->IsOccupied());

// vs
bucket = FindBucket(key);
if (bucket != NULL) assert(!bucket->IsOccupied());

2. Surface-level improvements

Packing info into names

If something is critical to understand, put it in a name.

Choose specific words for your method names. For example, does def GetPage(url) get the page from the network, a cache or a database? Use FetchPage(url) or DownloadPage(url) instead.

Avoid empty names like retval, tmp, foo. Instead, use a variable name that describes the value. In the next code sample, use sum_squares instead of retval.

function norm(v) {
	var retval = 0;
	for (var i = 0; i < v.length; i++) {
		retval += v[i] * v[i]; // clearer: sum_squares += v[i] * v[i];
	}
	return Math.sqrt(retval); // the norm is the square root of the sum of squares
}

Variables i, j, k don’t always work for loop indices. Prefer more concrete names. Indices could be misused or interchanged.

Use concrete over abstract names. Instead of --run-locally, use --extra-logging or --use-local-database.

Attach extra information to your names. For example, encode units. Prefer delay_secs over delay, size_mb over size and degrees over angle. Also, encode extra information. Prefer plaintext_password over password.

Do not include needless words in your names. Instead of ConvertToString, use ToString.

'Your Name Here' Sign on the Side of a Building
What name would you put in that sign? Photo by Austin Kirk on Unsplash

Names that can’t be misconstrued

Make your names resistant to misinterpretation. Does a method on arrays named filter pick or get rid of elements? If it picks elements, use select(). And, if it gets rid of elements, use exclude().

# Does results have elements that satisfy the condition? Or elements that don't?
results = Database.all_objects.filter("year <= 2011")
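As a quick Python sketch of the alternative, these two hypothetical helpers, select and exclude, say in their names which elements survive. They aren’t from any library. They only illustrate the naming tip.

```python
def select(items, predicate):
    """Keep only the items that satisfy the predicate."""
    return [item for item in items if predicate(item)]


def exclude(items, predicate):
    """Drop the items that satisfy the predicate."""
    return [item for item in items if not predicate(item)]


years = [2009, 2010, 2011, 2012, 2013]
print(select(years, lambda year: year <= 2011))   # [2009, 2010, 2011]
print(exclude(years, lambda year: year <= 2011))  # [2012, 2013]
```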

Use Min and Max for inclusive limits. Put min or max in front of the thing being limited. For example,

if shopping_cart.num_items() > MAX_ITEMS_IN_CART:
	Error()

Use First and Last for inclusive ranges. What’s the result of print integer_range(start=2, stop=4)? Is it [2,3] or [2,3,4]? Prefer First and Last. For example, print integer_range(first=2, last=4) to mean [2,3,4].

Use Begin and End for inclusive/exclusive ranges. For example, to find all events on a date, prefer PrintEventsInRange("OCT 16 12:00am", "OCT 17 12:00am") instead of PrintEventsInRange("OCT 16 12:00am", "OCT 16 11:59:59.999pm").
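Here is a minimal Python sketch of both conventions. The events_in_range function and the sample events are made up for this example. The point is only that first/last include both ends, while begin/end leave the end out.

```python
from datetime import datetime


def integer_range(first: int, last: int) -> list:
    """Inclusive on both ends: `last` is part of the result."""
    return list(range(first, last + 1))


def events_in_range(events, begin: datetime, end: datetime) -> list:
    """Inclusive begin, exclusive end: an event exactly at `end` is left out."""
    return [name for name, moment in events if begin <= moment < end]


# Hypothetical sample data
events = [
    ("standup", datetime(2021, 10, 16, 9, 0)),
    ("deploy", datetime(2021, 10, 16, 23, 59, 59)),
    ("retro", datetime(2021, 10, 17, 0, 0)),
]

print(integer_range(first=2, last=4))  # [2, 3, 4]
print(events_in_range(events, begin=datetime(2021, 10, 16), end=datetime(2021, 10, 17)))
# ['standup', 'deploy']
```

With an exclusive end, we don’t need to guess how many nines to put after 11:59:59.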

For boolean variables, make clear what true or false means. Use is, has, can, should, need as prefixes. For example, prefer HasSpaceLeft() over SpaceLeft().

Avoid negated booleans. For example, instead of disable_ssl = false, use use_ssl = true.

Don’t create false expectations. With these two names GetSize() and ComputeSize(), we expect GetSize() to be a lightweight operation.

Aesthetics

Similar code should look similar. Pick a meaningful order and maintain it. If the code mentions A, B, and C, don’t say B, C, and A in other places.

details = request.POST.get('details')
location = request.POST.get('location')
phone = request.POST.get('phone')

if details: rec.details = details
if location: rec.location = location
if phone: rec.phone = phone

Organize declarations into blocks, like sentences. Break code into paragraphs.

Knowing what to comment

Don’t comment what can be derived from code. GOOD CODE > BAD CODE + COMMENTS.

Comment the why’s behind a decision. Anticipate likely questions and comment the big picture. Comment why you chose a particular value or what its valid range is. For example, MAX_THREADS = 8; // Up to 2*num of procs

Making comments precise and compact

Use comments to show examples of input and output values.

// Example: Strip("ab", "a") == "b"
String Strip(String str, String chars);

// CategoryType -> (score, weight)
typedef hash_map<int, pair<float, float>> ScoreMap;

Use named parameters or use comments to make the same effect. Prefer connect(timeout: 10, use_ssl: false) over connect(10, false). In languages without named parameters, use connect(/*timeout_ms=*/10, /*use_ssl=*/false).
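In Python, we can go one step further and force callers to name the arguments. This is a minimal sketch with a hypothetical connect function that only builds a string. The bare * in the signature makes both parameters keyword-only.

```python
def connect(*, timeout_secs: int, use_ssl: bool) -> str:
    """Hypothetical connect(): the bare * makes both arguments keyword-only."""
    scheme = "https" if use_ssl else "http"
    return f"{scheme} connection, timeout {timeout_secs}s"


print(connect(timeout_secs=10, use_ssl=False))  # http connection, timeout 10s

try:
    connect(10, False)  # positional call is rejected
except TypeError as error:
    print("TypeError:", error)
```

This way, nobody reading a call has to guess what 10 and False mean.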

3. Simplifying loops and logic

Making control flow easy to read

When writing conditionals, put the changing variable first, followed by an operator and then the more stable expression. Prefer if (length >= 10) over if (10 <= length).

When writing if-then-else, handle the positive case first, or the simplest case, or the most interesting one.

Use the ternary operator ?: with simple statements, not to squeeze logic into a single line. For example, time_str += (hour >= 12) ? "pm" : "am"

Avoid do-while loops. Do-while loops break the convention of keeping the condition first. Use while instead.

Breaking down giant expressions

Write explaining variables and summary variables. For example,

if line.split(':')[0] == 'root':
  # ...

# vs

username = line.split(':')[0]
if username == 'root':
	# ...

Use De Morgan’s Laws. For example,

if (!(file_exists && !is_protected))
	Error();

// vs

if (!file_exists || is_protected)
	Error();

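Eliminate intermediate results. Finish the task as soon as you can, instead of storing a value only to act on it later. For example, avoid writing functions like this

var remove_one = function(array, value_to_remove) {
  var index = null; // intermediate result
  for (var i = 0; i < array.length; i++) {
    if (array[i] === value_to_remove) { index = i; break; }
  }
  if (index !== null) {
    array.splice(index, 1);
  }
};

Bunch of pencils organized by color
Let's keep our code organized, shall we? Photo by Lucas George Wendt on Unsplash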
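As a sketch of the “eliminate intermediate results” tip, here is how a remove_one function could look in Python without the index variable. The parameter names are my own. We act as soon as we find the element and return right away.

```python
def remove_one(items: list, value_to_remove) -> None:
    """Remove the first occurrence of value_to_remove, in place."""
    for position, item in enumerate(items):
        if item == value_to_remove:
            del items[position]  # act as soon as we find it
            return  # stop iterating right after changing the list


numbers = [3, 1, 2, 1]
remove_one(numbers, 1)
print(numbers)  # [3, 2, 1]
```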

4. Reorganizing your code

Extracting unrelated problems

Separate the generic code from the specific code. Extract pure utility code into libraries or a set of functions.

Simplify existing interfaces. You should never have to settle for an interface that’s less than ideal.

To separate generic code from specific code, ask yourself:

  • What is the high-level goal of this code?
  • Is every line working to that goal? Or is it solving an unrelated problem?
  • If enough lines are working on an unrelated problem, extract that code into a separate function

For example, this code doesn’t separate generic from specific code

ajax.post({
	url: 'url',
	data: data,
	on_success: function(response) {
		// format_pretty
		var str = "{\n";
		for (var key in response)
			str += "-" + key + ":" + response[key];
		str += "}\n";
		alert(str);
	}
});

Now, format_pretty takes care of generic code.

ajax.post({
	url: 'url',
	data: data,
	on_success: function(response) {
		alert(format_pretty(response));
	}
});

One thing at a time

List all the things your code is doing. And try to separate every task into a separate function.

Defragment your code to do one type of thing at a time. For example, initialize to default values, then calculate some data and lastly update some other values.
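For example, this Python sketch splits a small report into three tasks: parse, calculate, and format. All the function names and the sample input are invented for illustration.

```python
def parse_scores(raw: str) -> list:
    """Task 1: turn '3, 5, 4' into [3, 5, 4]."""
    return [int(part) for part in raw.split(",")]


def average(scores: list) -> float:
    """Task 2: calculate."""
    return sum(scores) / len(scores)


def format_report(name: str, mean: float) -> str:
    """Task 3: present the result."""
    return f"{name}: {mean:.1f}"


def report(name: str, raw: str) -> str:
    """Each step does one type of thing, in order."""
    scores = parse_scores(raw)
    mean = average(scores)
    return format_report(name, mean)


print(report("Alice", "3, 5, 4"))  # Alice: 4.0
```

Each function is easy to read and test on its own, and report reads like the list of tasks we started with.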

Writing less code

The most readable code is no code at all.

Keep your codebase as small and lightweight as possible. Create as much utility code as possible to remove duplication. Remove unused code. Keep your project separated into isolated sub-projects.

Know the capabilities of your libraries. Every once in a while spend 15 minutes reading the names of all functions/modules/types in your standard libraries. Write as little code as possible.

For example, don’t be tempted to write your own de-duplicate code in Python.

def unique(elements):
	tmp = {}
	for e in elements:
		tmp[e] = None
	return list(tmp.keys())

unique([2, 1, 2])
# vs
unique_elements = list(set([2, 1, 2]))

5. Testing and readability

Tests should be easy to understand. When tests are hard to read, coders are afraid of changing the code and don’t add new tests.

  • Hide less important details from the reader, so they can focus only on the important ones.
  • Create the minimal test statement. Most tests can be reduced to: “given an input, expect this output.” Ideally, a unit test is just three lines long.
  • Create mini-languages.
  • Choose the simplest set of inputs that exercise your code. Prefer clean and simple test values. Embrace the principle of least astonishment.
  • Write smaller test cases, instead of a single perfect one-size-fits-all test case. Every test pushes your code in a different direction.
  • Test names should indicate a unit of work (or code under test), situation or bug being tested, and expected result.

Voilà! These are some of my notes. One tip I started to practice after reading this book was to “extract unrelated problems.”

The Art of Readable Code is a good starting point to introduce the concept of readability and clean code to your team. These tips and tricks are a good reference for code standards and reviews.

If you’re interested in other books, take a look at my takeaways from Clean Coder, Clean Code and Domain Modeling Made Functional.

Happy reading!