How to automatically format SQL files with Git and Poor Man's T-SQL Formatter

I believe we shouldn’t discuss formatting and linting during code reviews. That should be automated. With that in mind, I recently learned how to automatically format SQL files with Git and Poor Man’s T-SQL Formatter for one of my client’s projects.

I’ve already shared two free tools to format SQL files. Poor Man’s T-SQL Formatter is one of them. It’s free and open source.

1. Format SQL files on commits

I wanted to format my SQL files as part of my development workflow. I thought about a pre-commit Git hook for that. I was already familiar with Git hooks since I use one to put task numbers from branch names into commit messages.
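
That branch-name hook can be sketched as a prepare-commit-msg script. This is not my actual hook; the branch naming convention (a task ID like ABC-123) and the script itself are assumptions for illustration:

```shell
#!/bin/bash
# Hypothetical prepare-commit-msg hook: copy a task number like ABC-123
# from the branch name (e.g. feature/ABC-123-add-login) into the commit
# message. Git passes the path to the message file as the first argument.
commit_msg_file="$1"
branch=$(git symbolic-ref --short HEAD 2>/dev/null)
task=$(echo "$branch" | grep -oE '[A-Z]+-[0-9]+' | head -n 1)

if [ -n "$task" ]; then
    # Prepend "ABC-123: " to the existing message
    printf '%s: %s' "$task" "$(cat "$commit_msg_file")" > "$commit_msg_file"
fi
```

Save something like that as .git/hooks/prepare-commit-msg and every commit message gets the task number for free.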

After searching online, I found a Bash script to list all created, modified, and renamed files before committing them. I used Phind, “the AI search engine for developers.” These are the queries I used:

  • “How to create a git commit hook that lists all files with .sql extension?” and as a follow-up,
  • “What are all possible options for the parameter --diff-filter on the git diff command?”
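
To see --diff-filter in action, here’s a quick experiment in a throwaway repository (the letters stand for Added, Copied, Modified, and Renamed; adding D would also include deletions):

```shell
# Create a scratch repo and stage a new file
git init -q demo-diff-filter
cd demo-diff-filter
echo "select 1;" > a.sql
git add a.sql

# List staged files that were added, copied, modified, or renamed
git diff --cached --name-only --diff-filter=ACMR   # prints: a.sql
```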

Also, I found out that Poor Man’s T-SQL Formatter is available as a Node.js command utility.

Using these two pieces, this is the pre-commit file I came up with,

#!/bin/bash

files=$(git diff --cached --name-only --diff-filter=ACMR)
[ -z "$files" ] && exit 0

# Note: this word-splits $files, so it assumes file names without spaces
for file in $files
do
    if [[ $file == *.sql ]]
    then
        echo "Formatting: $file"

        # 1. Prettify it
        sqlformat -f "$file" -g "$file" --breakJoinOnSections --no-trailingCommas --spaceAfterExpandedComma

        # 2. Add it back to the staging area
        git add "$file"
    fi
done

exit 0

I used three options: --breakJoinOnSections, --no-trailingCommas, and --spaceAfterExpandedComma, to put each ON clause and each comma at the start of a new line.

2. Test the pre-commit hook

To test this Git hook, I created an empty repository, saved the above Bash script into a pre-commit file (no extension) inside the .git/hooks folder, made it executable, and installed the poor-mans-t-sql-formatter-cli package, version 1.6.10.
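
In case you want to reproduce it, the setup boils down to a few commands. The sql-hook-demo name is made up, and the placeholder hook below stands in for the real script above:

```shell
# Create an empty repository to experiment in
git init -q sql-hook-demo
cd sql-hook-demo

# Save the pre-commit script from above here; this is just a placeholder
printf '#!/bin/bash\nexit 0\n' > .git/hooks/pre-commit

# Git silently skips hooks that aren't executable
chmod +x .git/hooks/pre-commit
test -x .git/hooks/pre-commit && echo "hook installed"   # prints: hook installed
```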

For the actual SQL file, I used a query that finds StackOverflow posts with many “thank you” answers,

select ParentId as [Post Link], count(id)
from posts
where posttypeid = 2 and len(body) <= 200
  and (body like '%hank%')
group by parentid
having count(id) > 1
order by count(id) desc;

This is where all the magic happened when committing the previous SQL file,

(Screenshot: committing a ThankYou.sql file and seeing the magic happen)

By the way, I use some Git aliases as part of my development workflow.

And this is the formatted SQL file,

SELECT ParentId AS [Post Link]
	, count(id)
FROM posts
WHERE posttypeid = 2
	AND len(body) <= 200
	AND (body LIKE '%hank%')
GROUP BY parentid
HAVING count(id) > 1
ORDER BY count(id) DESC;

Voilà! That’s how to format SQL files automatically with Git. The command-line version of Poor Man’s T-SQL Formatter is not that fast. But it’s still faster than copying a SQL file, firing up a browser with an online formatter, formatting it, and pasting it back.

Poor Man’s T-SQL Formatter might not be perfect, but with a simple change in our script, we can bring any other SQL formatter we can call from the command line.

After this trick, I don’t want to leave or read another “please format this file” comment during code reviews.

For more content, check my guide to Code Reviews, my Visual Studio setup for C#, and the lessons I’ve learned as a code reviewer.

Happy coding!

A business case against massive unrequested refactorings

Blindly following coding principles is a bad idea.

“Leave the basecamp cleaner,” “Make the change easy, then make the easy change”…

Often, we follow those two principles and start huge refactoring sessions with good intentions but without considering the potential consequences.

Let me share two stories of refactoring sessions that led to unintended consequences and the lesson behind them.

Changing Entities and Value Objects

At a past job, a team member decided to refactor the entire solution before working on his task.

He changed every Domain Entity, Value Object, and database table. In his experience, what he found wasn’t “scalable.”

The project was still in its early stages, and the rest of the team was waiting for his task.

One week later, we were still discussing names, folder structure, and whether that refactoring was needed in the first place.

We all were blocked waiting for him to finish the mess he had created.

Changing Class and Table Names

At another job, our team’s architect decided to work over the weekend.

The next Monday morning, we found out that almost all class and table names had been changed. The architect had decided to rename everything. He simply didn’t like the initial naming conventions. Arrrggg!

We found an email in our inboxes listing the things he had broken along the way.

We spent weeks migrating user data from the old database schema to the new one.

These are two examples of refactoring sessions that went sideways. Nobody asked those guys to change anything in the first place.

There wasn’t even a need or business case for those changes.

I have a term for these refactoring sessions: massive unrequested refactoring.

(Another massive but unfinished refactoring… Photo by Stefan Lehner on Unsplash)

The Need for Refactoring

I’m not saying we shouldn’t refactor our code.

I believe in the “leave the basecamp cleaner than the way you found it” mantra.

But, before embarking on a massive refactoring, let’s ask ourselves if it’s truly necessary and if the team can afford it, not only in terms of money but also time and dependencies.

Often, we get too focused on naming variables, functions, and classes to see the bigger picture and keep the overall project in perspective.

“Perfect is the enemy of good.”

And if there isn’t a viable alternative, let’s split that massive refactoring into separate, focused, and short Pull Requests that can be reviewed in a single review session without much back and forth.

The best refactorings are the small ones that slowly and incrementally improve the health of the overall project. One step at a time. Not the massive unrequested ones.

Voilà! That’s my take on massive unrequested refactorings. Have you ever done one too? What impact did it have? Did it turn out well? Remember, all code we write should move the project closer to its finish line. Often, massive unrequested refactorings don’t do that.

In my two stories, those refactoring sessions ended up blocking people and creating more work.

These refactorings remind me of the analogy that coding is like living in a house. A massive unrequested refactoring would be like a full home renovation while staying there!

Happy coding!

There's No Such Thing as Job Security: Three Lessons on Layoffs

I’ve been laid off more than once.

I know how it feels. I know that momentary feeling of relief followed by the uncertainty of a “What am I going to do now?”

If you haven’t been living under a rock, I bet you have heard the news about layoffs in the tech industry.

They’re so common these days that there’s even a page to report and track companies laying off their people: layoffs.fyi.

Some days ago, I got a message from a close friend who was laid off. These are the three lessons on layoffs I shared with her.

1. Job security is an illusion

I don’t know who makes us believe there’s such a thing as “job security.” That’s an illusion.

In my early days at college, I thought the safest route was being an employee. I was so wrong! It only took being laid off once to change my mind.

We could lose our jobs anytime for reasons we don’t and can’t control. A pandemic, a company going bankrupt, or a recession.

Based on layoffs.fyi, the layoff tracker I just mentioned,

  • 165,269 US tech employees lost their jobs in 2022,
  • 263,180 in 2023, and
  • 89,193 in 2024, as of May.

The real question is when it will happen, not if it will ever happen to us. We’re better off preparing for that.

2. Have an emergency fund

I can’t stress this enough. This is one of the things I wish I had learned earlier: have an emergency fund.

An emergency fund is enough savings to cover our essential expenses for some time. The longer, the better.

That’s the breathing room until you figure something out.

And it’s the difference between being picky about your next job and accepting anything to pay the bills.

3. Always be ready

Let’s always have our CVs updated. Stay in touch with our colleagues and ex-coworkers. Build our professional network.

Let’s always be ready for an interview. Have our data structures and “tell me about yourself” muscles in shape.

Interviewing is broken, I know! But let’s always be ready to leave.

Don’t wait for a layoff to establish an online presence and grow your network. By then, it will be too late.

Voilà! Those are my thoughts about layoffs. I learned that after losing a job, there’s always a positive change. That takes us out of our comfort zone. “Pastures are always greener on the other side,” I guess.

For more career lessons, read these five lessons I learned in my first five years as a software engineer and ten lessons learned after one year of remote work.

Happy coding!

Too many layers: My take on Queries and Layers

Recently, I reviewed a pull request in one of my client’s projects and shared some feedback about reading database entities and layering. I believe that project took layering to the extreme. These are my thoughts.

For read-only database-access queries, reduce the number of layers in an application to avoid excessive mapping between layers and unneeded artifacts.

Too many layers, I guess

The pull request I reviewed added a couple of API endpoints to power a report-like screen. These two endpoints only returned data given a combination of parameters. Think of showing all movies released in a date range with 4 or 5 stars. It wasn’t exactly that, but let’s use that example to prove a point.

That project had database entities, domain objects, results wrapping DTOs, and responses. To add a new read-only API endpoint, we would need a request object, query, query handler, and repository.

Inside the repository, we would need to map database entities to domain entities and value objects. Inside the query handler, we would need to return a result object containing a collection of DTOs. Another mapping. Inside the API endpoint, we would need to return a response object. Yet another mapping. I guess you see where I’m going.

This is the call chain of methods I found in that project:

(Sequence diagram to read a list of movies: three layers and even more mappings)

And these are all the files we would need to add a new API endpoint and its dependencies:

|-Api/
|---src/
|-----Movies/
|-------MovieQueryApi.cs
|-------GetMoviesQueryResponse.cs
|-Application/
|---src/
|-----Movies/
|-------GetMoviesQuery.cs
|-------GetMoviesQueryHandler.cs
|-------GetMoviesQueryResult.cs
|-------MovieDto.cs
|-Domain/
|---src/
|-----Movies/
|-------Movie.cs
|-------Director.cs
|-------Genre.cs
|-Infrastructure.Contracts/
|---src/
|-----Movies/
|-------IMovieRepository.cs
|-Infrastructure.SqlServer/
|---src/
|-----Movies/
|-------MovieRepository.cs

Technically, the objects inside the Domain layer were already there. By the way, we can create that folder structure with the dotnet CLI.

That’s layering to the extreme. All those artifacts and about three mapping methods between layers are waaay too much to only read unprocessed entities from a database. Arrrggg! Too much complexity. We’re only reading data, not loading domain objects to call methods on them.

I believe simple things should be simple to achieve.

Query Services: A simpler alternative

As an alternative to those artifacts and mappings, I like to follow the idea from the book Hands-on Domain-Driven Design with .NET Core.

For read-only queries, the HODDD book uses two models:

  1. Query Models for the request parameters, and
  2. Read Models for the request responses.

Then, it calls the underlying storage mechanism directly from the API layer. Well, that’s too much for my own taste. But I like the simplicity of the idea.

I prefer to use Query Services. They are query handlers that live in the Infrastructure or Persistence layer, call the underlying storage mechanism, and return a read model we pass directly to the API layer. This way, we only have two layers and no mappings between them. We declutter our project from those extra artifacts!

I mean something like this,

(Sequence diagram to read a list of movies: two layers and zero mappings)

And something like this,

|-Api/
|---src/
|-----Movies/
|-------MovieQueryApi.cs
|-Application/
|---src/
|-----Movies/
|-------GetMoviesQueryModel.cs
|-------MoviesReadModel.cs
|-Infrastructure.SqlServer/
|---src/
|-----Movies/
|-------GetMoviesQueryService.cs

We put the input and output models in the Application layer since we want the query service in the Infrastructure layer. The HODDD book, though, places the input and output models and the data-access code directly in the API layer. Way simpler in any case!
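
To make the idea concrete, here’s a rough C# sketch of that query service. The type names come from the folder structure above; the properties, the constructor, and the implementation details are assumptions for illustration, not the project’s actual code:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Application layer: the input and output models
public record GetMoviesQueryModel(DateTime ReleasedAfter, DateTime ReleasedBefore, int MinStars);
public record MoviesReadModel(int MovieId, string Title, int Stars);

// Infrastructure.SqlServer layer: the query service calls the storage
// mechanism and returns read models directly -- no domain entities,
// no repositories, no mappings in between
public class GetMoviesQueryService
{
    private readonly string _connectionString;

    public GetMoviesQueryService(string connectionString)
        => _connectionString = connectionString;

    public Task<IReadOnlyList<MoviesReadModel>> GetMoviesAsync(GetMoviesQueryModel query)
    {
        // Run the SQL query here (Dapper, EF Core, plain ADO.NET...)
        // and project rows straight into MoviesReadModel instances
        throw new NotImplementedException();
    }
}
```

The API endpoint receives a GetMoviesQueryModel, passes it to the query service, and returns the read models as-is.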

Voilà! That’s my take on read-only queries, layers, and Domain-Driven Design artifacts. I prefer to keep read-only database access simple and use query services to avoid queries, query handlers, repositories, and the mappings between them. What do you think? Do you also find all those layers and artifacts excessive?

If you want to read more content on Domain-Driven Design, check a case of primitive obsession and my takeaways from the book Domain Modeling Made Functional.

Happy coding!

This Is How I'd Start an Ultralearning Project To Become a Software Engineer

Some days ago, I got a message from someone starting his journey to become a Software Engineer. He found my post with the takeaways from the Ultralearning book and asked for feedback.

On the email, my reader explained that he wanted to become a professional Software Engineer with a one-year ultralearning project. Also, he wrote he had a list of resources compiled and already made some progress.

I want to document my reply to help others and preserve my keystrokes.

This is my long reply and how I would start an Ultralearning project:

1. Set milestones

Keep yourself focused and motivated with milestones.

For example, after 2 or 3 months of studying, make sure to complete an introductory CS course or have some features of a coding project ready.

Often we underestimate what we can do in a year or get easily distracted.

2. Choose Math subjects wisely

This might be controversial. But don’t get too focused on learning advanced Math.

Depending on the business domain you work on as a Software Engineer, you might not need a lot of Math, unless you’re working on Computer Graphics, Finance, or Simulations.

I’d stick to courses on Linear Algebra and Math for Computer Science.

3. Use roadmaps as inspiration

Find lists of subjects to learn from roadmaps.

If you search Google, DuckDuckGo, or GitHub for “programming roadmap <insert year here>,” you will find good resources. But you don’t need to learn all those subjects at once. Instead, understand how a particular subject or tool fits into the larger picture and when you need it.

Only zoom in when you need a particular tool.

4. Write an end-to-end coding project

Write a coding project that reads data from a webpage, calls a backend, persists data into a relational database, and displays it back.

You will learn a lot from this simple exercise:

  • HTML/CSS,
  • a UI library,
  • HTTP/REST,
  • a backend language,
  • SQL, and
  • a database engine.

Quite a lot!

You will be surprised by how many “real” applications boil down to reading and writing data from and to a database.

5. Be consistent

I know this is cliché at this point. But…

Set a regular study time and put it in a calendar. I find the green squares on my GitHub profile inspiring to keep myself in the loop.

6. Learn the tech and tools companies are hiring for

Probably, you will hear or read people arguing to “learn X instead of Y” or “X pays more than Y.”

Instead of looking for the best-paying languages, use a more tactical approach, find what companies around you (or on LinkedIn) are looking for, and learn those subjects.

And in any case, it seems there’s always a shortage of COBOL developers. I’ve read they’re well paid.

7. Keep a journal

Keep track of what you learn, the resources you use, and the subjects you find challenging.

You don’t need anything fancy. A simple .txt file works. Sorry, if you were expecting Notion. I’m a plain-text lover.

I found this advice about keeping a journal in the book “Never Stop Learning” by Bradley R. Staats.

Voilà! That’s how I would approach an ultralearning project to become a Software Engineer. My last piece of advice: you don’t need to learn everything at once. In the beginning, learn a handful of tools and learn them well. But don’t be afraid of learning something else. Later, you can start expanding your toolbox and finding what you like the most.

I wrote my own roadmap for intermediate C# developers. It points to C# resources, but its overall subject structure works for other languages too. This is not for absolute beginners.

I’ve challenged myself with mini ultralearning projects too. I chose to learn enough React and Go in 30 days.

Happy ultralearning!