Three Language Lessons I Learned on my First Visit to France

This post is about one of my hobbies: learning new languages. But not programming languages. Foreign languages.

I want to share three lessons I learned while traveling to France to practice my French-speaking skills. Each lesson is behind a funny story that happened on my first visit to France.

Three language lessons I learned in my First visit to France
Photo by Byeong woo Kang on Unsplash

These are the three lessons I learned:

  1. Don’t Fall into the temptation of speaking English. Keep speaking in your target language when locals talk to you in English. Don’t think they’re rude or you aren’t “worthy” of their language. They want to practice too. Ask politely to continue speaking in your target language. Or pretend you don’t speak English.
  2. Learn vocabulary to suit your needs. If you’re traveling for work, vacations, or cultural exchange, you will need vocabulary for totally different situations. I learned this lesson while waiting at the security at an airport. Can you imagine what happened?
  3. Mistakes are progress. Embrace it when locals correct your language skills. Don’t feel discouraged when locals correct your mistakes. Imagine you got a free language lesson. Probably you won’t make that mistake again.

To read the full story, go to “Fluent in 3 Months” at 3 Language Lessons I Learned on my First Visit to France where I originally published it.

Happy learning!

TIL: NULL isn't LIKE anything else in SQL Server

How does the LIKE operator handle NULL values of a column? Let’s see what SQL Server does when using LIKE with a nullable column.

When using the LIKE operator on a nullable column, SQL Server doesn’t include in the results rows with NULL values in that column. The same is true, when using NOT LIKE in a WHERE clause.

Let’s see an example. Let’s create a Client table with an ID, name and middleName. Only two of the four sample clients have a middlename.

CREATE TABLE #Clients
(
    ID INT,
    Name VARCHAR(20),
    MiddleName VARCHAR(20)
)
GO

INSERT INTO #Clients
VALUES
    (1, 'Alice', 'A'),
    (2, 'Bob',   NULL),
    (3, 'Charlie', 'C'),
    (4, 'Dwight',  NULL)
GO

Let’s find all users with middlename starting and not starting with ‘A’.

SELECT *
FROM #Clients
WHERE MiddleName LIKE 'A%'
GO

SELECT *
FROM #Clients
WHERE MiddleName NOT LIKE 'A%'
GO

Notice the results don’t include any rows with NULL middlenames.

Results of querying a nullable column with LIKE
Results of querying a nullable column with LIKE and NOT LIKE

Voilà! That’s how SQL Server handle NULL when using LIKE and NOT LIKE. Remember you don’t need to check for null values.

If you want to read more SQL Server content, check six performance tuning tips and the lessons learned while tuning a store procedure to search reservations.

Source: NULL is NOT LIKE and NOT NOT LIKE

BugOfTheDay: How I tuned a procedure to find reservations

This time, one of the searching features for reservations was timing out. The appropiate store procedure took ~5 minutes to finish. This is how I tuned it.

To tune a store procedure, start by looking for expensive operators in its Actual Execution plan. Reduce the number of joining tables and stay away from common bad practices like putting functions around columns in WHERE clauses.

After opening the actual exection plan with SentryOne Plan Explorer, the most-CPU expensive and slowest statement looked like this:

DELETE res
FROM #resTemp res
WHERE reservationID NOT IN (
        SELECT res1.reservationID
        FROM #resTemp res1
        JOIN dbo.reservations res
            ON res.reservationID = res1.reservationID
        JOIN dbo.accounts a
            ON (a.accountID = res.accountID
                  OR a.accountID = res.columnWithAccountID
                  OR a.accountID = res.columnWithAccountIDToo)
                AND a.clientID = @clientID
        WHERE ISNULL(a.accountNumber, '') + ISNULL(a.accountNumberAlpha, '') LIKE @accountNumber + '%'
        );

This query belonged to a store procedure to search reservations by a bunch of filters. Among its filters, a hotelier can find all reservations assigned to a client’s internal account number.

From the above query, the #resTemp table had reservations from previous queries in the same store procedure. The DELETE statement removes all reservations without the given account number.

Inside SQL Server Management Studio, the store procedure did about 193 millions of logical reads to the dbo.accounts table. That’s a lot!

For SQL Server, logical reads are the number of 8KB pages that SQL Server has to read to execute a query. Generally, the fewer logical reads, the faster a query runs.

1. Remove extra joins

The subquery in the DELETE joined the found reservations with the dbo.reservations table. And then, it joined the dbo.accounts table checking for any of the three columns with an accountID. Yes, a reservation could have an accountID in three columns in the same table. Don’t ask me why.

This subquery performed an Index Scan on the dbo.reservations table. It had a couple of millions of records. That’s the main table in any Reservation Management System.

To remove the extra join to the dbo.reservations table in the subquery, I added the three referenced columns (accountID, columnWithAccountID, columnWithAccountIDToo) inside the ON joining the dbo.accounts to the #resTemp temporary table. By the way, those aren’t the real names of those columns.

After this change, the store procedure took ~8 seconds. It read about 165,000 pages for the dbo.accounts table. Wow!

DELETE res
FROM #resTemp res
WHERE reservationID NOT IN (
        SELECT res1.reservationID
        FROM #resTemp res1
        /* We don't need the extra JOIN here */
        INNER JOIN dbo.accounts a
            ON (a.accountID = res1.accountID
                  OR a.accountID = res1.columnWithAccountID
                  OR a.accountID = res1.columnWithAccountIDToo)
                AND a.clientID = @clientID
        WHERE ISNULL(a.accountNumber, '') + ISNULL(a.accountNumberAlpha, '') LIKE @accountNumber + '%'
        );

2. Use NOT EXISTS

Then, instead of NOT IN, I used NOT EXISTS. This way, I could lead the subquery from the dbo.accounts table. Another JOIN gone!

After this change, the store procedure finished in about 5 seconds.

DELETE res
FROM #resTemp res
WHERE NOT EXISTS (
        SELECT 1/0
        /* Again, we got rid of another JOIN */
        FROM dbo.accounts a
        WHERE (a.accountID = res.accountID
                OR a.accountID = res.columnWithAccountID
                OR a.accountID = res.columnWithAccountIDToo)
            AND a.clientID = @clientID
            AND ISNULL(a.accountNumber, '') + ISNULL(a.accountNumberAlpha, '') LIKE @accountNumber + '%'
        );

Those ~4-5 seconds were good enough. But, there was still room for improvement.

If you're wondering about that weird SELECT 1/0, check my post on EXISTS SELECT in SQL Server

3. Don’t use functions in WHERE’s

The ISNULL() functions in the WHERE look weird. Using functions around columns in WHERE clauses is a common anti-pattern.

In this case, a computed column concatenating the two parts of account numbers would help. Yes, account numbers were stored splitted into two columns. Again, don’t ask me why.

ALTER TABLE dbo.accounts
    ADD AccountNumberComplete
    AS ISNULL(accountNumber, '') + ISNULL(accountNumberAlpha, '');

I didn’t use a persisted column. The dbo.accounts table was a huge table, creating a persisted columns would have required scanning the whole table. I only wanted SQL Server to have better statistics to run the DELETE statement.

To take things even further, an index leading on the ClientId followed by that computed column could make things even faster.

CREATE INDEX ClientID_AccountNumberComplete
    ON dbo.accounts(ClientID, AccountNumberComplete);

I didn’t need to include the accountId on the index definition since it was the primary key of the table.

Voilà! That’s how I tuned this query. The lesson to take home is to reduce the number of joining tables and stay away from functions in your WHERE’s. Often, a computed column can help SQL Server to run queries with functions in the WHERE clause. Even, without rewriting the query to use the new computed column.

For more content about SQL Server, check Six SQL Server tuning tips and Two free tools to format your SQL queries.

Happy coding!

TIL: EXISTS SELECT 1 vs EXISTS SELECT * in SQL Server

EXISTS is a logical operator that checks if a subquery returns any rows. EXISTS works only with SELECT statements inside the subquery. Let’s see if there are any differences between EXISTS with SELECT * and SELECT 1.

There is no difference between EXISTS with SELECT * and SELECT 1. SQL Server generates similar execution plans in both scenarios. EXISTS returns true if the subquery returns one or more records, even if it returns NULL or 1/0.

Let’s use a local copy of the StackOverflow database to find users from Antartica who have left any comments. Yes, the same StackOverflow we use everyday to copy and paste code.

Let’s check how the execution plans look like when using SELECT * and SELECT 1 in the subquery with the EXISTS operator.

1. EXISTS with “SELECT *”

This is the query to find all users from Antartica who have commented anything. This query uses EXISTS with SELECT *.

SELECT *
FROM dbo.Users u
WHERE u.Location = 'Antartica'
AND EXISTS(SELECT * FROM dbo.Comments c WHERE u.Id = c.UserId);
--         ^^^^^^^^

To make things faster, let’s add one index on Location and another one on UserId on the dbo.Users and dbo.Comments tables, respectively.

CREATE INDEX UserId ON dbo.Comments(UserId);
CREATE INDEX Location ON dbo.Users(Location);
GO

Let’s check the execution plan. Notice the “Left Semi Join” operator and the other operators.

Execution plan using EXISTS with 'SELECT *'
Execution plan using EXISTS with 'SELECT *'

2. EXISTS with “SELECT 1”

Now, let’s change the subquery inside the EXISTS to use SELECT 1.

SELECT *
FROM dbo.Users u
WHERE u.Location = 'Antartica'
AND EXISTS(SELECT 1 FROM dbo.Comments c WHERE u.Id = c.UserId)
--         ^^^^^^^^

Again, let’s see the execution plan.

Execution plan using EXISTS with 'SELECT 1'
Execution plan using EXISTS with 'SELECT 1'

Voilà! Notice, there is no difference between the two execution plans when using EXISTS with SELECT * and SELECT 1. We don’t need to write SELECT TOP 1 1 inside our EXISTS subqueries. We can even rewrite our queries to use SELECT NULL or SELECT 1/0 without any division-by-zero error.

If you want to read more SQL and SQL Server content, check how to write Dynamic SQL and three differences between TRUNCATE and DELETE.

Happy SQL time!

TIL: How to compare DateTime without the time part in SQL Server

If you use DATEDIFF() or CAST() to filter a table by a DATETIME column using only the date part, there’s a better way. Let’s find it out.

To compare dates without the time part, don’t use the DATEDIFF() or any other function on both sides of the comparison in a WHERE clause. Instead, put CAST() on the parameter and compare using >= and < operators.

Let’s use a local copy of the StackOverflow database to find all user profiles created on a particular date.

Inside StackOverflow database, there’s dbo.Users table with a CreationDate column. Let’s use that column to find all users who created their profiles today.

Before we get started, let’s create an index on CreationDate to make things faster.

CREATE INDEX CreationDate ON dbo.Users(CreationDate);

Probably, we would write a query like this one,

SELECT * FROM dbo.Users
WHERE DATEDIFF(DAY, CreationDate, GETDATE()) = 0;

But, SQL Server has to scan the entire index. Notice the “Index Scan” operator in the execution plan. SQL Server doesn’t have any statistics on CreationDate wrapped in a DATEDIFF() function.

Execution plan filtering a DateTime column with DATEDIFF
Execution plan filtering a DateTime column with DATEDIFF

An Index Scan by itself in an execution plan isn’t good or bad. It depends on the number of rows read.

In this case, SQL Server read all the records on the dbo.Users table. When we hover over the row next to the “Index Scan” operator, we notice the number of rows read. It scanned the whole index, more than 2 millions of records.

Execution plan showing the rows read
Execution plan showing the rows read

Let’s stop using DATEDIFF() or CAST() to filter a table on a DATETIME column.

To filter a table on a DATETIME column comparing only the date part, use CAST() only around the parameter, and >= and < with the desired date and the day after.

SELECT * FROM dbo.Users
 WHERE CreationDate >= CAST(GETDATE() AS DATE)
   AND CreationDate < DATEADD(day, 1, CAST(GETDATE() AS DATE));

Voilà! That’s how to compare dates on the WHERE clauses. Don’t use DATEDIFF() or CAST() on both sides of the comparison. In general, don’t put functions around columns in the WHERE clause.

For more content about SQL Server, check the difference between EXISTS SELECT 1 and EXISTS SELECT *, T-SQL doesn’t have constants and variables aren’t a good idea, and how LIKE handle NULL values.

Source: Optimized date compare in WHERE clause