There’s one thing we could do to write faster queries in SQL Server: don’t use functions around columns in WHERE clauses. I learned it the hard way. Let me share this lesson with you.
Don’t use user-defined or built-in functions around columns in the WHERE clause of queries. It prevents SQL Server from estimating the right amount of rows out of the function. Write queries with operators and comparisons to make SQL Server better use the indexes it has.
With functions around columns in WHEREs
To prove this point, let’s query a local copy of the StackOverflow database. Yes, the StackOverflow we all know and use.
StackOverflow has a Users table that contains, well…, all registered users and their profiles. Among other things, every user has a display name, location, and reputation.
Let’s find the first 50 users by reputation in Colombia.
To make things faster, let’s create an index on the Location field. It’s an NVARCHAR(100) column.
CREATEINDEXLocationONdbo.Users(Location);
This is the query we often write,
DECLARE@LocationNVARCHAR(20)=N'Colombia';SELECTTOP50DisplayName,Location,CreationDate,ReputationFROMdbo.Users-- Often, we put LOWER on both sides of the comparisonWHERELOWER(Location)=LOWER(@Location)-- ^^^^^ ^^^^^ORDERBYReputationDESC;GO
Did you notice the LOWER function on both sides of the equal sign?
We all have written queries like that one. I declared myself guilty too. Often, we use LOWER and UPPER or wrap the column around RTRIM and LTRIM.
But, let’s see what happened in the Execution Plan.
First 50 StackOverflow users in Colombia by reputation
Here, SQL Server chose to scan the index Location first. And let’s notice the width of the arrow coming out of the first operator. When we place the cursor on it, it shows “Number of Rows Read.”
Rows read by Index Scan on Location index
In this copy of the StackOverflow database, there are 2,465,713 users, only 463 of them living in Colombia. SQL Server had to read the whole content of the index to execute our query. Arrrggg!
It means that to find all users in Colombia, SQL Server had to go through all users in the index. We could use that index in a better way.
Write queries with comparisons, functions, and operators around parameters. This way SQL Server could properly use indexes and have better estimates of the contents of tables. But, don’t write functions around columns in the WHERE clauses.
The same is true when joining tables. Let’s not put functions around the foreign keys in our JOINs either.
Rewrite your queries to avoid functions around columns in WHERE
Let’s go back and rewrite our query without any functions wrapping columns. This way,
DECLARE@LocationNVARCHAR(20)=N'Colombia';SELECTTOP50DisplayName,Location,CreationDate,ReputationFROMdbo.Users-- We remove LOWER on both sidesWHERELocation=@LocationORDERBYReputationDESC;GO
And, let’s check the execution plan again.
Again, first 50 StackOverflow users in Colombia by reputation
This time, SQL Server used an Index Seek. It means SQL Server didn’t have to read the whole content of the Location index to run our query. And the execution plan didn’t go parallel. We don’t have the black arrows in yellow circles in the operators.
Let’s notice that this time we have a thinner arrow on the first operator. Let’s see how many rows SQL Server read this time.
Rows read by Index Seek on Location index
After that change, SQL Server only read 463 records. That was way better than reading the whole index.
Voilà! If we want to write faster queries, let’s stop using functions around columns in our WHEREs. That screws SQL Server estimates. For example, let’s not use LOWER or UPPER around our columns. By the way, SQL Server string searches are case insensitive by default, we don’t need those functions at all.
This episode of Monday Link is a bit diverse. From getting better at programming to why it’s harder for juniors to get hired. And, a rant about the C# evolution.
I couldn’t avoid adding the last article to this Monday Links. I try to only share five articles. But, that’s a great story of establishing priorities and putting life, health, and work in a balance. Enjoy!
A Path to Better Programming by Robert “Uncle Bob” Martin and Allen Holub
This is a conversation between Uncle Bob and Allen Holub (I didn’t know about Allen before this conversation) about principles, isolation, mob programming, and leadership.
By the way, the military has been leading large companies for years. How do they do it? The answer mentioned in the video is to empower people to do things. But there’s a whole book on the subject.
“If it’s a good idea, the good idea tends to spread, unless there’s somebody at the management level working hard to make it not spread”
Allen Holub
Also, I didn’t know there was a “Clean Agile” book. I would like to see what Agile meant for the creators and how it differs from what we call “Agile” today. Watch full video
Why flags do not represent languages
These days I needed to translate my Unit Testing 101 workshop instructions to Spanish. And call it fate or not, I found this article about not using flags to represent languages and what to do instead. How often do you see on web pages the US flag to signal English? Read full article
I like the evolution of C# as a language. I wrote about some of the newest and coolest C# features. But, I don’t like some of the new features. Some features, quoting the article, “just add to the complexity of the language without bringing anything to the table besides saving a few lines of code or a couple of curly braces.” Some features make the language less consistent. I share the main point of this article. Read full article.
Read the source code of your dependencies
I first heard about the concept of reading code as a learning exercise from a friend. To learn how star developers code, read their code. This post takes this concept to the next level. Use only dependencies you’re able to read its source code. Read full article.
Why nobody hires junior developers and what happens next
Another consequence of the pandemic. Nobody hires juniors. One thing I’ve noticed is companies don’t have clear career plans and salary policies for developers. Someone starts in a company and after a few years, what? Close tickets forever? Work extra hours to get noticed and promoted? That’s why juniors as soon as they feel more confident, move to a better-paid place. Brain drain is expensive. Read full article
Fetch the bolt cutters
Maybe I don’t have serious eye strain or carpal tunnel syndrome (yet?). But, I can totally relate to this story. And, it’s something more frequent these days. One quote that made me wow and stand in ovation: “In a choice between a job and me, I’m always going to choose me”. Such an amazing way to finish that story. Read full article
Voilà! Another five six reads! Do you also think C# is getting too complex? Have you seen a clear career path at companies where you’ve been?
Want to receive curated links like these? Get 4 more delivered straight to your inbox every week. Don’t miss out on next week’s links—subscribe to my email list here.
In 2021, I switched from posting whenever I had an idea about anything to posting regularly every other week about some topics.
I wrote a whole series of posts about unit testing. From how to write your first unit test with MSTest to what stubs and mocks are. In fact, I wrote my first ebook Unit Testing 101 with some of the posts of the series. By the way, you can download it for free. No email asked.
These are the 5 posts I wrote in 2021 you read the most. In case you missed any of them, here they are:
TIL: How to convert 2-digit year to 4-digit year in C#. Last year, I worked with a Stripe-powered payment module in a reservation management system. And, I needed to parse card expiration dates from a 2-digit year to a 4-digit year. I almost just added 2000 to them. But I didn’t. This is what I found and did instead.
The Art of Readable Code is the perfect companion for the Clean Code. It contains simple and practical tips to improve your code at the function level. It isn’t as dogmatic as Clean Code. Tips aren’t as strict as the ones from Clean Code. But, it still deserves to be read.
These are some of my notes on The Art of Readable Code.
1. Code should be easy to understand
Code should be written to minimize the time for someone else to understand it. Here, understanding means solving errors, spotting bugs, and making changes.
It’s good to write compact code. But, compact doesn’t always mean more readable.
If something is critical to understand, put it in a name.
Choose specific words for your method names. For example, does def GetPage(url) get the page from the network, a cache or a database? Use FetchPage(url) or DownloadPage(url) instead.
Avoid empty names like retval, tmp, foo. Instead, use a variable name that describes the value. In the next code sample, use sum_squares instead of retval.
Variables i, j, k don’t always work for loop indices. Prefer more concrete names. Indices could be misused or interchanged.
Use concrete over abstract names. Instead of --run-locally, use --extra-logging or --use-local-database.
Attach extra information to your names. For example, encode units. Prefer delay_secs over delay, size_mb over size and degrees over angle. Also, encode extra information. Prefer plaintext_password over password.
Do not include needless words in your names. Instead of ConvertToString, use ToString.
Make your names resistant to misinterpretation. Does a method on arrays named filter pick or get rid of elements? If it picks elements, use select(). And, if it gets rid of elements, use exclude().
// Does results have elements that satisfy the condition? Or elements that don't?results=Database.all_objects.filter("year <= 2011")
Use Min and Max for inclusive limits. Put min or max in front of the thing being limited. For example,
Use First and Last for inclusive limits. What’s the result of print integer_range(start=2, stop=4)? Is it [2,3] or [2,3,4]? Prefer First and Last. For example, print integer_range(first=2, last=4) to mean [2,3,4].
Use Begin and End for inclusive/exclusive ranges. For example, to find all events on a date, prefer PrintEventsInRange("OCT 16 12:00am", "OCT 17 12:00am") instead of PrintEventsInRange("OCT 16 12:00am", "OCT 16 11:59.999am").
For booleans variables, make clear what true or false means. Use is, has, can, should, need as prefixes. For example, SpaceLeft() or HasSpaceLeft().
Avoid negated booleans. For example, instead of disable_ssl = false, use use_ssl = true.
Don’t create false expectation. With these two names GetSize() and ComputeSize(), we expect GetSize() to be a lightweight operation.
Aesthetics
Similar code should look similar. Pick a meaningful order and maintain it. If the code mentions A, B, and C, don’t say B, C, and A in other places.
Organize declarations into blocks, like sentences. Break code into paragraphs.
Knowing what to comment
Don’t comment what can be derived from code. GOOD CODE > BAD CODE + COMMENTS.
Comment the why’s behind a decision. Anticipate likely questions and comment the big picture. Comment why you choose a particular value or what it’s a valid range. For example, MAX_THREADS = 8; // Up to 2*num of procs
Making comments precise and compact
Use comments to show examples of input and output values.
Use named parameters or use comments to make the same effect. Prefer connect(timeout: 10, use_ssl: false) over connect(10, false). In languages without named parameters, use connect(/*timeout_ms=*/10, /*use_ssl=*/false).
3. Simplifying loops and logic
Making control flow easy to read
When doing conditionals, write the changing variable first, followed by an operator and by stable expression. Prefer if (length >= 10) over if (10 <= length).
When doing if-then-else, treat the positive case first or the simplest case or the most interesting case first.
Use the ternary operator ?: with simple statements, not to squeeze logic into a single line. For example, time_str += (hour > 12) ? "pm" : "am"
Avoid do-while loops. Do-while loops break the convention of keeping the condition first. Use while instead.
Breaking down giant expressions
Write explaining variables and summary variables. For example,
ifline.split(':')[0]=='root':# ...
# vs
username=line.split(':')[0]ifusername=='root':# ...
Use De Morgan’s Laws. For example,
if(!(file_exists&&!is_protected))Error()# vs
if(!file_exists||is_protected)Error()
Eliminate intermediate results. Do your task as quickly as possible. For example, avoid writing functions like this
List all the things your code is doing. And try to separate every task into a separate function.
Defragment your code to do one type of thing at a time. For example, initialize to default values, then calculate some data and lastly update some other values.
Writing less code
The most readable code is no code at all.
Keep your codebase as small and lightweight as possible. Create as much utility code to remove duplication. Remove unused code. Keep your project separated into isolated sub-projects
Know the capabilities of your libraries. Every once in a while spend 15 minutes reading the names of all functions/modules/types in your standard libraries. Write less code as possible.
For example, don’t be tempted to write your own de-duplicate code in Python.
defunique(elements):tmp={}foreinelements:tmp[e]=Nonereturntmp.keys()unique([2,1,2])# vs
unique=list(set([2,1,2]))
5. Testing and readability
Tests should be easy to understand. Coders are afraid of changing code, so coders don’t add new tests.
Hide less important details from the user, so he can focus only on the important ones.
Create the minimal test statement. Most tests can be reduced to: “given an input, expect this output.” Ideally a unit test is just 3-line long.
Create mini-languages.
Choose the simplest set of inputs that exercise your code. Prefer clean and simple test values. Embrace the Least astonishing principle.
Write smaller test cases, instead of a single perfect one-size-fits-all test case. Every test pushes your code in a different direction.
Test names should indicate a unit of work (or code under test), situation or bug being tested, and expected result.
Voilà! These are some of my notes. One tip I started to practice after reading this book was to “extract unrelated problems.”
The Art of Readable Code is a good starting point to introduce the concept of readability and clean code to your team. These tips and tricks are a good reference for code standards and reviews.
If you are curious about the functional world, but you found it too hard to start, this book is a good place to start. These are my notes and takeaways from “Domain Modeling Made Functional” by Scott Wlaschin.
“Domain modeling made functional” teaches to start coding only after understanding the domain. And to capture requirements, constraints, and business rules using types in the system.
This book covers from Domain-Driven Design (DDD) to type systems to refining an existing domain model. It’s a jump to the functional world by designing a system to price and ship orders in a supply store. No jargon or weird concepts needed.
All the code samples are in F#. Most of the time, they’re easy to understand. Since F# has a better type inference than C#, types aren’t explicit all the time while writing functions. Some code listings are hard to translate to C#.
To follow the code using C#, I had to rely upon libraries like OneOf and Optional to bring discriminated unions and option types, not built into the C# language yet.
1. DDD and Ubiquitous language
The goal of DDD is to share the business model between developers and domain experts. In DDD, the business domain drives the design, not the database schema. Don’t rush to think in terms of database tables.
Write code using the same vocabulary from the business domain. Domain experts or product owners don’t think in integers or strings. But, in business concepts like OrderId, ProductCode. When in doubt, ask your domain expert what an OrderBase, OrderHelper mean.
2. Types everywhere
A type is a name for all possible input and output values in a function. For example, Math functions from one set to another.
There are two ways to construct types: AND types and OR types. An AND type is a combination of other types. And an OR type is a choice between a known set of types. OR types are also called discriminated unions or choice types.
For example, a meal is a combination of an entrée, a main course, and a dessert. And, a dessert is either a cake or a flan. To represent a meal, we need an AND type. And for a dessert, an OR type.
In the Object-Oriented (OO) world, types are like classes without behavior or single-method interfaces. Types represent a set of functions too. AND types are regular classes with properties. While OR types are enums with, possibly, members of different types.
Express restrictions in your design and enforce constraints with new types.
For example, to represent unit quantities in orders, don’t use a plain integer. “Unit” is a concept in the business domain. It should be in a separate type, like UnitQuantity.
To restrict unit quantities between 1 and 1000, create a private constructor in the UnitQuantity type and only expose a factory method with the validation.
To enforce that an order should have at least one line item, create a NonEmptyList instead of a possibly empty List.
To work with exceptions in third-party code, wrap that code in a function that catches exceptions and returns a Result. Don’t catch all exceptions, only those relevant to the domain. Like, timeouts or failed logins.
Make illegal state unrepresentable
Convert a design with two choices to a design with one type per choice. Booleans are generally a bad design choice.
To represent validated emails, don’t create a CustomerEmail with an IsVerified flag. Instead, create separate types, VerifiedEmailAddress and UnverifiedEmailAddress.
With types, we don’t need unit tests for invalid situations, we have compile-time unit tests. Also, types better document the business domain.
4. Transformation-oriented programming
Write actions in your system as workflows or chains of transformations.
A workflow is a pipeline where the output type of every step is the input of the next one. A workflow receives commands and returns events as outputs. A command should have everything needed to start a workflow.
Put all the I/O at the edges. Call the database at the beginning or at the end of a workflow.
For example, these are the types of a PlaceOrderWorkflow and the types of its steps.
By using strings as parameters, we can pass any string to the receiving method. But, the only valid options are then ones inside the Constants class. Enums are better to make illegal states unrepresentable in this case.
Instead, write two separate classes: FailedPayment and SuccessfulPayment. Like this,
publicclassFailedPayment{publicboolHasError{get;}=true;publicboolErrorMessage{get;set;}// ...}publicclassSuccessfulPayment{// No HasError and ErrorMessage...// ...}
I had the chance to practice what I’ve learned with this book, even before finishing it.
Here are my requirements. After getting a reservation, hotels want to charge a deposit before the guests arrive. They want to charge some nights, a fixed amount, or a percentage, either after getting the reservation or before the arrival date.
This is what I came up with. Better than a single class full of string, bool, and int properties. That would have been my approach before reading this book.
Voilà! These are the lessons I learned from this book. It’s a good introduction to DDD and functional programming. If you’re not a functional programmer, you can still take advantage of the concepts of this book in your everyday programming. I did.
You can skip the last 2 or 3 chapters. If you’re new to F# and want to work with it, they contain good examples of serialization and databases. But, if you want to adopt these concepts to your OO world, you can skim these chapters for the main concepts.
“A class with a boolean to represent two states is generally a smell.”