Last month I followed the NDC Conference on YouTube. In this Monday Links episode, I share some of the talks I watched and liked. I don’t know why, but I ended up watching presentations about failures, aviation disasters, and software mistakes. Well, two of the five links aren’t about that. Enjoy!
Improve working across time zones
Prefer document-based over meeting-based communication. Only schedule meetings for discussions and have a clear agenda for everyone to review before the meeting. After the meeting, share the conclusions with people in different time zones who couldn’t join. Read full article
Mayday! Software lessons from aviation disasters
This is a talk from NDC. It shows two case studies from aviation disasters and how they relate to software engineering. In the first case study, after an incident, a security expert asked his team these questions to identify the cause:
How can I prove myself wrong?
What details might I be ignoring because it doesn’t fit my theory or solution?
What else could cause this issue or situation?
Experts traced the root of the incident to ten years before the crash: counterfeit parts. This makes us wonder about counterfeit code: code we copy from StackOverflow, blogs, and documentation. We’re responsible for every line of code we write, even for the ones we copy and paste.
The second case study teaches us some good lessons about communication.
Failure is Always an Option
From space accidents to the British Post Office to a Kenya money transfer company, this talk shows how new businesses and branches of science come out of failures and unanticipated uses of systems. The title is inspired by, and contradicts, one line from the Apollo 13 movie: “Failure is not an option.”
This talk claims that the single point of failure of modern cloud-based solutions is the credit card paying the cloud provider. LOL!
Hacking C#: Development for the Truly Lazy
This talk shows a bag of tricks to make code more readable. It shows how to use C# extension methods to remove duplication. Also, it presents the “Commandments of Extension Methods:”
No business logic
Keep them as small as possible
Keep them generic, so you can use them with any object
Keep them portable
Use them where there is boring and repetitive code
Make them useful
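To make these commandments a bit more concrete, here is a small sketch of my own, not taken from the talk. The names EnumerableExtensions and JoinWith are hypothetical. It is a tiny, generic helper with no business logic that replaces a repetitive string.Join() call.

```csharp
using System;
using System.Collections.Generic;

var genres = new List<string> { "Action", "Comedy", "Drama" };
var years = new[] { 1991, 1997, 1988 };

// Instead of repeating string.Join(separator, ...) everywhere
var joinedGenres = genres.JoinWith(", ");
var joinedYears = years.JoinWith(" | ");

Console.WriteLine(joinedGenres);
Console.WriteLine(joinedYears);
// Output:
// Action, Comedy, Drama
// 1991 | 1997 | 1988

// Hypothetical helper: small, generic, portable, and without business logic
public static class EnumerableExtensions
{
    public static string JoinWith<T>(this IEnumerable<T> source, string separator)
        => string.Join(separator, source);
}
```

Since it is generic, the same method works with strings, numbers, or any other type that has a meaningful ToString().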
Ah! I learned we can make indexers receive multiple indexes. Like something[1, 3, 5].
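Here is one way an indexer with multiple indexes could look. This is only a sketch: the Catalog class is hypothetical, and the talk may have shown a different shape (for example, an indexer with a fixed number of parameters). It relies on the params keyword, which C# allows on indexers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var catalog = new Catalog(
    "Titanic", "The Fifth Element", "Terminator 2",
    "Avatar", "Platoon", "My Neighbor Totoro");

// One indexer call with three indexes
var picked = catalog[1, 3, 5];
Console.WriteLine(string.Join(",", picked));
// Output:
// The Fifth Element,Avatar,My Neighbor Totoro

// Hypothetical class, only here to show the indexer
class Catalog
{
    private readonly string[] _names;

    public Catalog(params string[] names) => _names = names;

    // "params" lets the indexer receive any number of indexes
    public IReadOnlyList<string> this[params int[] indexes]
        => indexes.Select(index => _names[index]).ToArray();
}
```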
Programming’s Greatest Mistakes
I had a coworker who always said, “Nobody is going to die,” when somebody else was reluctant to change some code. After all, we weren’t working in a medical or aerospace domain. But often, mistakes cause businesses to lose money. I bet you have taken down servers because of an unoptimized SQL query. That happened to a friend of a friend of mine. Wink, wink!
The talk starts with one stupid mistake the speaker made in his early days: using a sarcastic name for one of his support tools. The support team ended up shipping it to their clients. Then, it covers Y2K, a missing using in mission-critical software, null, and other mistakes.
Voilà! Do you also follow the NDC Conference? What are your own programming’s greatest mistakes? Don’t be ashamed. All of us have one. Until next Monday Links!
So far we have covered some of the most common LINQ methods. This time let’s cover three LINQ methods that work like set operations: Intersect, Union, and Except.
Like the Aggregate method, we don’t use these methods every day, but they will come in handy from time to time.
1. Intersect
Intersect() finds the common elements between two collections.
Let’s find the movies we both have watched and rated in our catalogs.
    var mine = new List<Movie>
    {
        // We have not exactly a tie here...
        new Movie("Terminator 2", 1991, 4.7f),
        //        ^^^^^^^^^^^^^^
        new Movie("Titanic", 1998, 4.5f),
        new Movie("The Fifth Element", 1997, 4.6f),
        new Movie("My Neighbor Totoro", 1988, 5f)
        //        ^^^^^^^^^^^^^^^^^^^^
    };
    var yours = new List<Movie>
    {
        new Movie("My Neighbor Totoro", 1988, 5f),
        //        ^^^^^^^^^^^^^^^^^^^^
        new Movie("Pulp Fiction", 1994, 4.3f),
        new Movie("Forrest Gump", 1994, 4.3f),
        // We have not exactly a tie here...
        new Movie("Terminator 2", 1991, 5f)
        //        ^^^^^^^^^^^^^^
    };

    var weBothHaveSeen = mine.Intersect(yours);
    Console.WriteLine("We both have seen:");
    PrintMovies(weBothHaveSeen);
    // Output:
    // We both have seen:
    // My Neighbor Totoro

    static void PrintMovies(IEnumerable<Movie> movies)
    {
        Console.WriteLine(string.Join(",", movies.Select(movie => movie.Name)));
    }

    record Movie(string Name, int ReleaseYear, float Rating);
This time, we have two lists of movies, mine and yours, with the ones I’ve watched and the ones you have watched, respectively. Also, we both have watched “My Neighbor Totoro” and “Terminator 2.”
To find the movies we both have seen (the intersection between our two catalogs), we used Intersect().
But, our example only shows “My Neighbor Totoro.” What happened here?
If we pay close attention, we both have watched “Terminator 2,” but we gave it different ratings. Since Movie is a record (a C# 9.0 feature), it gets member-wise comparison: two records are equal only when all their properties are equal. Therefore, our two “Terminator 2” instances aren’t exactly the same, even though they have the same name. That’s why Intersect() doesn’t return it.
To find the common movies using only the movie name, we can:
pass a custom comparer to Intersect(),
override the default Equals() and GetHashCode() methods of the Movie record, or,
use the IntersectBy() method introduced in .NET 6.0.

    var weBothHaveSeen = mine.IntersectBy(
        yours.Select(yourMovie => yourMovie.Name),
        //    ^^^^^^
        // Your movie names
        (movie) => movie.Name);
        //               ^^^^
        // keySelector: Property to compare by

    Console.WriteLine("We both have seen:");
    PrintMovies(weBothHaveSeen);
    // Output:
    // We both have seen:
    // Terminator 2,My Neighbor Totoro
Unlike Intersect(), IntersectBy() expects a “keySelector,” a delegate that returns the property to use as the comparing key, and a second collection whose elements have the same type as that key. That’s why we passed your movie names instead of your movies.
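For comparison, the first option from the list above, a custom comparer, could look roughly like this. MovieNameComparer is a hypothetical name, and the sketch assumes that two movies with the same name are the same movie.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var mine = new List<Movie>
{
    new Movie("Terminator 2", 1991, 4.7f),
    new Movie("Titanic", 1998, 4.5f),
    new Movie("My Neighbor Totoro", 1988, 5f)
};
var yours = new List<Movie>
{
    new Movie("My Neighbor Totoro", 1988, 5f),
    new Movie("Pulp Fiction", 1994, 4.3f),
    new Movie("Terminator 2", 1991, 5f)
};

// Intersect() has an overload that receives an IEqualityComparer<T>
var weBothHaveSeen = mine.Intersect(yours, new MovieNameComparer());
var names = weBothHaveSeen.Select(movie => movie.Name).ToList();

Console.WriteLine(string.Join(",", names));
// Output:
// Terminator 2,My Neighbor Totoro

// Hypothetical comparer: two movies are equal if they share a name
class MovieNameComparer : IEqualityComparer<Movie>
{
    public bool Equals(Movie? x, Movie? y)
        => string.Equals(x?.Name, y?.Name, StringComparison.Ordinal);

    public int GetHashCode(Movie movie)
        => movie.Name.GetHashCode();
}

record Movie(string Name, int ReleaseYear, float Rating);
```

A comparer keeps the record’s default equality untouched, which helps when other parts of the code still need member-wise comparison.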
2. Union
Union() finds the elements from both collections without duplicates.
Let’s find all the movies we have in our catalogs.
    var mine = new List<Movie>
    {
        new Movie("Terminator 2", 1991, 5f),
        //        ^^^^^^^^^^^^^^
        new Movie("Titanic", 1998, 4.5f),
        new Movie("The Fifth Element", 1997, 4.6f),
        new Movie("My Neighbor Totoro", 1988, 5f)
        //        ^^^^^^^^^^^^^^^^^^^^
    };
    var yours = new List<Movie>
    {
        new Movie("My Neighbor Totoro", 1988, 5f),
        //        ^^^^^^^^^^^^^^^^^^^^
        new Movie("Pulp Fiction", 1994, 4.3f),
        new Movie("Forrest Gump", 1994, 4.3f),
        new Movie("Terminator 2", 1991, 5f)
        //        ^^^^^^^^^^^^^^
    };

    var allTheMoviesWeHaveSeen = mine.Union(yours);
    Console.WriteLine("All the movies we have seen:");
    PrintMovies(allTheMoviesWeHaveSeen);
    // Output:
    // All the movies we have seen:
    // Terminator 2,Titanic,The Fifth Element,My Neighbor Totoro,Pulp Fiction,Forrest Gump

    static void PrintMovies(IEnumerable<Movie> movies)
    {
        Console.WriteLine(string.Join(",", movies.Select(movie => movie.Name)));
    }

    record Movie(string Name, int ReleaseYear, float Rating);
This time we gave the same rating to our shared movies: “Terminator 2” and “My Neighbor Totoro.” And, Union() showed all the movies from both collections, showing duplicates only once.
Union() works the same way as the union operation in our Math classes.
LINQ has a similar method to “combine” two collections into a single one: Concat(). But, unlike Union(), Concat() returns all elements from both collections without removing the duplicated ones.
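A quick side-by-side may help to see the difference. This is a minimal sketch that uses plain movie names instead of our Movie record to keep it short.

```csharp
using System;
using System.Linq;

var mine = new[] { "Terminator 2", "Titanic", "My Neighbor Totoro" };
var yours = new[] { "My Neighbor Totoro", "Pulp Fiction", "Terminator 2" };

var union = mine.Union(yours).ToList();
var concat = mine.Concat(yours).ToList();

Console.WriteLine(string.Join(",", union));
Console.WriteLine(string.Join(",", concat));
// Output:
// Terminator 2,Titanic,My Neighbor Totoro,Pulp Fiction
// Terminator 2,Titanic,My Neighbor Totoro,My Neighbor Totoro,Pulp Fiction,Terminator 2
```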
.NET 6.0 also has a UnionBy() method to “union” two collections with a keySelector. And, unlike IntersectBy(), its second collection has the same element type as the first one, not the type of the key.
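As a sketch of UnionBy(), let’s go back to the case where we rated “Terminator 2” differently. It assumes .NET 6.0 or later. When two movies share the same key, UnionBy() keeps the first one it finds, in this case, mine.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var mine = new List<Movie>
{
    new Movie("Terminator 2", 1991, 4.7f),
    new Movie("Titanic", 1998, 4.5f)
};
var yours = new List<Movie>
{
    new Movie("My Neighbor Totoro", 1988, 5f),
    new Movie("Terminator 2", 1991, 5f)
};

// The second collection contains movies, not movie names
var allTheMovies = mine.UnionBy(yours, movie => movie.Name).ToList();

Console.WriteLine(string.Join(",", allTheMovies.Select(movie => movie.Name)));
// Output:
// Terminator 2,Titanic,My Neighbor Totoro

record Movie(string Name, int ReleaseYear, float Rating);
```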
3. Except
Except() finds the elements in one collection that are not present in another one.
This time, let’s find the movies only I have watched.
    var mine = new List<Movie>
    {
        new Movie("Terminator 2", 1991, 5f),
        new Movie("Titanic", 1998, 4.5f),
        //         ^^^^^^^
        new Movie("The Fifth Element", 1997, 4.6f),
        //         ^^^^^^^^^^^^^^^^^
        new Movie("My Neighbor Totoro", 1988, 5f)
    };
    var yours = new List<Movie>
    {
        new Movie("My Neighbor Totoro", 1988, 5f),
        new Movie("Pulp Fiction", 1994, 4.3f),
        new Movie("Forrest Gump", 1994, 4.3f),
        new Movie("Terminator 2", 1991, 5f)
    };

    var onlyIHaveSeen = mine.Except(yours);
    Console.WriteLine();
    Console.WriteLine("Only I have seen:");
    PrintMovies(onlyIHaveSeen);
    // Output:
    // Only I have seen:
    // Titanic,The Fifth Element

    static void PrintMovies(IEnumerable<Movie> movies)
    {
        Console.WriteLine(string.Join(",", movies.Select(movie => movie.Name)));
    }

    record Movie(string Name, int ReleaseYear, float Rating);
With Except(), we found the movies in mine that are not in yours.
When working with Except(), we should pay attention to the order of the collections because this method isn’t commutative. This means mine.Except(yours) is not the same as yours.Except(mine).
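To see the order at work, here is a minimal sketch with the same titles as plain strings.

```csharp
using System;
using System.Linq;

var mine = new[] { "Terminator 2", "Titanic", "The Fifth Element", "My Neighbor Totoro" };
var yours = new[] { "My Neighbor Totoro", "Pulp Fiction", "Forrest Gump", "Terminator 2" };

var onlyIHaveSeen = mine.Except(yours).ToList();
var onlyYouHaveSeen = yours.Except(mine).ToList();

Console.WriteLine(string.Join(",", onlyIHaveSeen));
Console.WriteLine(string.Join(",", onlyYouHaveSeen));
// Output:
// Titanic,The Fifth Element
// Pulp Fiction,Forrest Gump
```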
Likewise, we have ExceptBy(), which receives a keySelector and a second collection whose elements have the same type as the key.
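A sketch of ExceptBy() could look like this, assuming .NET 6.0 or later and, again, different ratings for “Terminator 2.” Like with IntersectBy(), we pass your movie names as the second collection.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var mine = new List<Movie>
{
    new Movie("Terminator 2", 1991, 4.7f),
    new Movie("Titanic", 1998, 4.5f),
    new Movie("The Fifth Element", 1997, 4.6f),
    new Movie("My Neighbor Totoro", 1988, 5f)
};
var yours = new List<Movie>
{
    new Movie("My Neighbor Totoro", 1988, 5f),
    new Movie("Pulp Fiction", 1994, 4.3f),
    new Movie("Forrest Gump", 1994, 4.3f),
    new Movie("Terminator 2", 1991, 5f)
};

// The second collection has keys (movie names), not movies
var yourMovieNames = yours.Select(movie => movie.Name);
var onlyIHaveSeen = mine.ExceptBy(yourMovieNames, movie => movie.Name).ToList();

Console.WriteLine(string.Join(",", onlyIHaveSeen.Select(movie => movie.Name)));
// Output:
// Titanic,The Fifth Element

record Movie(string Name, int ReleaseYear, float Rating);
```

With plain Except(), “Terminator 2” would have shown up in the result because of the different ratings.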
Voilà! These are the Intersect(), Union(), and Except() methods. They work like the Math set operations: intersection, union, and difference, respectively. Of the three, I’d say Except is the most common method.
Want to write more expressive code for collections? Join my course, Getting Started with LINQ on Udemy and learn everything you need to know to start working productively with LINQ—in less than 2 hours.
I bet you have used the SQL LIKE operator to find a keyword in a text field. For large amounts of text, that would be slow. Let’s learn how to implement a full-text search with Lucene and NCache.
What is Full-Text Search?
Full-text search is a technique to search not only for exact matches of a keyword in some text but also for patterns of text, synonyms, or close words in large amounts of text.
To support large amounts of text, searching is divided into two phases: indexing and searching. In the indexing phase, an analyzer processes text to create indexes based on the rules of a spoken language like English to remove stop words and record synonyms and inflections of words. Then, the searching phase only uses the indexes instead of the original text source.
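To picture these two phases, here is a toy inverted index in plain C#. It is only an illustration of the idea, not how Lucene works internally: the “analyzer” just lowercases, splits on spaces, and drops a tiny, made-up list of stop words.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var stopWords = new HashSet<string> { "the", "a", "of" };
var titles = new[] { "My Flying Wife", "Modern Love", "The Fifth Element" };

// Indexing phase: map every term to the ids of the titles containing it
var index = new Dictionary<string, SortedSet<int>>();
for (var id = 0; id < titles.Length; id++)
{
    foreach (var term in Analyze(titles[id]))
    {
        if (!index.TryGetValue(term, out var ids))
        {
            ids = new SortedSet<int>();
            index[term] = ids;
        }
        ids.Add(id);
    }
}

// Searching phase: we only look at the index, not at the original text
Console.WriteLine(string.Join(",", Search("LOVE")));
Console.WriteLine(Search("the").Any());
// Output:
// Modern Love
// False

// Toy analyzer: lowercase, split on spaces, and remove stop words
IEnumerable<string> Analyze(string text)
    => text.ToLowerInvariant()
           .Split(' ', StringSplitOptions.RemoveEmptyEntries)
           .Where(term => !stopWords.Contains(term));

IEnumerable<string> Search(string keyword)
    => index.TryGetValue(keyword.ToLowerInvariant(), out var ids)
        ? ids.Select(id => titles[id]).ToList()
        : new List<string>();
```

Searching for “the” finds nothing because the analyzer removed it while indexing. A real analyzer also deals with punctuation, inflections, and synonyms.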
Full-Text Search with Lucene and NCache
1. Why Lucene and NCache?
From its official page, “Apache Lucene.NET is a high performance search library for .NET.” It’s a C# port of Java-based Apache Lucene, an “extremely powerful” and fast search library optimized for full-text search.
NCache gives distributed capabilities to Lucene by implementing the Lucene API on top of its In-Memory Distributed cache. This way, NCache makes Lucene a linearly scalable full-text searching solution for .NET. For more features of Distributed Lucene, check the NCache Distributed Lucene page.
Lucene stores data in immutable “segments,” which consist of multiple files. We can store these segments in our local file system or in RAM. But, since we’re using Lucene with NCache, we’re storing these segments in NCache.
Before indexing and searching anything, first, we need to create a Distributed Lucene Cache. Let’s navigate to http://localhost:8251 to open NCache Web Manager and add a New Distributed Cache.
Let’s select “Distributed Lucene” in the Store Type and give it a name. Then, let’s add our own machine and a second node. For write operations, we need at least two nodes. We can stick to the defaults for the other options.
By default, in Windows machines, NCache stores Lucene indexes in C:\ProgramData\ncache\lucene-index.
After creating the Distributed Lucene cache, let’s populate our Lucene indexes with some movies from a Console app. Later, we will search them from another Console app.
First, let’s create a Console app to load some movies to the Lucene Cache. Also, let’s install the Lucene.Net.NCache NuGet package.
In the Program.cs file, we could load all movies we want to index from a database or another store. For example, let’s use a list of movies from IMDb. Something like this,
    using SearchMovies.Shared;
    using SearchMovies.Shared.Entities;
    using SearchMovies.Shared.Services;

    var searchService = new SearchService(Config.CacheName);
    searchService.LoadMovies(SomeMoviesFromImdb());

    Console.WriteLine("Press any key to continue...");
    Console.ReadKey();

    // This list of movies was taken from IMDb dump
    // See: https://www.imdb.com/interfaces/
    static IEnumerable<Movie> SomeMoviesFromImdb()
    {
        return new List<Movie>
        {
            new Movie("Caged Fury", 1983, 3.8f, 89, new Director("Maurizio Angeloni", 1959), new[] { Genre.Crime, Genre.Drama }),
            new Movie("Bad Posture", 2011, 6.5f, 93, new Director("Jack Smith", 1932), new[] { Genre.Drama, Genre.Romance }),
            new Movie("My Flying Wife", 1991, 5.5f, 91, new Director("Franz Bi", 1899), new[] { Genre.Action, Genre.Comedy, Genre.Fantasy }),
            new Movie("Modern Love", 1990, 5.2f, 105, new Director("Sophie Carlhian", 1962), new[] { Genre.Comedy }),
            new Movie("Sins", 2012, 2.3f, 84, new Director("Pierre Huyghe", 1962), new[] { Genre.Action, Genre.Thriller })
            // Some other movies here...
        };
    }
Notice we used a SearchService to handle the index creation in a method called LoadMovies(). Let’s take a look at it.
    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.Index;
    using Lucene.Net.Store;
    using Lucene.Net.Util;
    using SearchMovies.Shared.Entities;
    using SearchMovies.Shared.Extensions;

    namespace SearchMovies.Shared.Services;

    public class SearchService
    {
        private const string IndexName = "movies";
        private const LuceneVersion luceneVersion = LuceneVersion.LUCENE_48;

        private readonly string _cacheName;

        public SearchService(string cacheName)
        {
            _cacheName = cacheName;
        }

        public void LoadMovies(IEnumerable<Movie> movies)
        {
            using var indexDirectory = NCacheDirectory.Open(_cacheName, IndexName);
            // 1. Opening directory ^^^

            var standardAnalyzer = new StandardAnalyzer(luceneVersion);
            var indexConfig = new IndexWriterConfig(luceneVersion, standardAnalyzer)
            {
                OpenMode = OpenMode.CREATE
            };
            using var writer = new IndexWriter(indexDirectory, indexConfig);
            // 2. Creating a writer ^^^

            foreach (var movie in movies)
            {
                var doc = movie.MapToLuceneDocument();
                writer.AddDocument(doc);
                //     ^^^^^^^^^^^
                // 3. Adding a document
            }

            writer.Commit();
            //     ^^^^^^
            // 4. Writing documents
        }
    }
A bit of background first: Lucene uses documents as the unit of search and index. Documents can have many fields, and we don’t need a schema to store them.
We can search documents using any field. Lucene will only return those with that field and matching data. For more details on some Lucene internals, check the Lucene Quick Start guide.
Notice we started our LoadMovies by opening an NCache directory. We needed the same cache name we configured before and an index name. Then we created an IndexWriter with our directory and some configurations, like a Lucene version, an analyzer, and an open mode.
Then, we looped through our movies and created a Lucene document for each one using the MapToLuceneDocument() extension method.
To create Lucene documents, we used two fields of type TextField: movie name and director name. For each field, we need a name and a value to index. We will use the field names later to create a response object from search results.
There are two basic field types for Lucene documents: TextField and StringField. The first one has support for Full-Text search and the second one supports searching for exact matches.
Once we called the Commit() method, NCache stored our movies in a distributed index.
4. Full-Text Searching Movies
Now that we populated our index with some movies, to search them, let’s create another Console app to read a Lucene query.
Again, let’s use the same SearchService, this time with a SearchByNames() method passing a Lucene query.
    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.Index;
    using Lucene.Net.QueryParsers.Classic;
    using Lucene.Net.Search;
    using Lucene.Net.Store;
    using Lucene.Net.Util;
    using SearchMovies.Shared.Entities;
    using SearchMovies.Shared.Extensions;
    using SearchMovies.Shared.Responses;

    namespace SearchMovies.Shared.Services;

    public class SearchService
    {
        // Same SearchService as before...

        public IEnumerable<MovieResponse> SearchByNames(string searchQuery)
        {
            using var indexDirectory = NCacheDirectory.Open(_cacheName, IndexName);
            using var reader = DirectoryReader.Open(indexDirectory);
            //                 ^^^^^^^^^^^^^^^
            // 1. Creating a reader

            var searcher = new IndexSearcher(reader);
            var analyzer = new StandardAnalyzer(luceneVersion);
            var parser = new QueryParser(luceneVersion, "name", analyzer);
            var query = parser.Parse(searchQuery);
            //          ^^^^^^
            // 2. Parsing a Lucene query

            var documents = searcher.Search(query, 10);
            //              ^^^^^^^^
            // 3. Searching documents

            var result = new List<MovieResponse>();
            // We loop over the returned hits only. TotalHits counts every match
            // and could be larger than the 10 results we asked for
            foreach (var scoreDoc in documents.ScoreDocs)
            {
                var document = searcher.Doc(scoreDoc.Doc);
                result.Add(document.MapToMovieResponse());
                //     ^^^
                // 4. Populating a result object
            }

            return result;
        }
    }
This time, instead of creating an IndexWriter, we used a DirectoryReader and a query parser with the same Lucene version and analyzer. Then, we used the Search() method with the parsed query and a result count. The next step was to loop through the results and create a response object.
To create a response object from a Lucene document, we used the MapToMovieResponse() extension method.
This time, we used the Get() method with the same field names as before to retrieve fields from documents.
For example, let’s find all movies whose director’s name has a word starting with “ca”, with the query directorName:ca*.
[Figure: Movies with director name contains 'ca']
Of course, there are more keywords in the Lucene Query Syntax than the ones we used here.
Voilà! That’s how to use Distributed Lucene with NCache. If we already have an implementation with Lucene.NET, we would need few code changes to migrate it to Lucene with NCache. Also, notice that NCache doesn’t implement all Lucene methods.
To follow along with the code we wrote in this post, check my Ncache Demo repository over on GitHub.
In case this is your first time finding the Monday Links series: these are five links from past weeks that I found interesting (and worth sharing) while procrastinating, I mean, surfing the Web. This is not a link-building scheme; I only read and liked these articles.
The Secret Art of Storytelling in Programming by Yehonathan Sharvit
This presentation starts with the author sharing his struggle to read books as a kid and, later, to read code as a programmer and contracts as a consultant.
The main message from this presentation is how memory, attention, and structure spans relate to coding. The author presents three coding style principles that respect mind spans:
Use small functions
Make every line in a function have the same level of abstraction
Interviewing is broken. We all agree. But we don’t know how to fix it. Brain teasers, IQ tests, pair programming, algorithms? This post presents an alternative: inspect the candidate’s GitHub and public work, and ask them to review some piece of code, add unit tests, or do some refactors. That sounds like a better idea! Read full article
One of the things we learn while working for a company is office politics. Basically, how to say things and to step away from certain situations. I learned from a coworker to say “I don’t have enough information to answer that question” instead of a simple “I don’t know.” You will find more lines like that one in this post. Read full article
How to feel engaged at work: a software engineer’s guide
Let’s be honest. It’s rewarding when we see the impact of our work. But, often, all days look almost the same. Another JIRA ticket for a production issue. Another meeting that could have been an email. This article shows four ideas to spice things up. Read full article
The Toxic Grind
This article talks about success, hard work, and work-life balance. This is my favorite line: “We should glorify the journey of achieving something meaningful, not the dream of wealth and power. Glorify the skills you build along the way, not the shortcuts you take.” Read full article
Voilà! Another Monday Links. What do you do to feel engaged at your work? Have you been asked to solve LeetCode questions during interviews? Would you like to do something different in future interviews?
This is not one of the most used LINQ methods. We won’t use it every day. But, it’s handy for some scenarios. Let’s learn how to use the Aggregate method.
The Aggregate method applies a function to each element of a collection, carrying the result over to the next element. It “aggregates” the results of a function over a collection into a single value.
The Aggregate method takes two parameters: a seed and an aggregating function that takes the accumulated value and one element from the collection.
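As a first taste, here is a minimal sketch that adds up some made-up ratings. The seed is 0, and the aggregating function adds each rating to the sum so far. Of course, LINQ already has a Sum method for this.

```csharp
using System;
using System.Linq;

var ratings = new[] { 4, 5, 3 };

// seed: 0
// aggregating function: (accumulated value, current element) => new accumulated value
var total = ratings.Aggregate(0, (sumSoFar, rating) => sumSoFar + rating);

Console.WriteLine(total);
// Output:
// 12
```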
How does Aggregate work?
Let’s reinvent the wheel to understand Aggregate by finding the maximum rating in our movie catalog. Of course, LINQ has a Max method. And, .NET 6 introduced new LINQ methods, among those: MaxBy.
    var movies = new List<Movie>
    {
        new Movie("Titanic", 1998, 4.5f),
        new Movie("The Fifth Element", 1997, 4.6f),
        new Movie("Terminator 2", 1991, 4.7f),
        new Movie("Avatar", 2009, 5),
        new Movie("Platoon", 1986, 4),
        new Movie("My Neighbor Totoro", 1988, 5)
    };

    var maxRating = movies.Aggregate(0f, (maxSoFar, movie) => MaxBetween(maxSoFar, movie.Rating));
    //                     ^^^^^^^^^
    Console.WriteLine($"Maximum rating on our catalog: {maxRating}");
    // Output:
    // Comparing 0 and 4.5
    // Comparing 4.5 and 4.6
    // Comparing 4.6 and 4.7
    // Comparing 4.7 and 5
    // Comparing 5 and 4
    // Comparing 5 and 5
    // Maximum rating on our catalog: 5

    Console.ReadKey();

    float MaxBetween(float maxSoFar, float rating)
    {
        Console.WriteLine($"Comparing {maxSoFar} and {rating}");
        return rating > maxSoFar ? rating : maxSoFar;
    }

    record Movie(string Name, int ReleaseYear, float Rating);
Notice we used Aggregate() with two parameters: 0f as the seed and the delegate (maxSoFar, movie) => MaxBetween(maxSoFar, movie.Rating) as the aggregating function. maxSoFar is the accumulated value from previous iterations, and movie is the current movie while Aggregate iterates over our list. The MaxBetween() method returns the maximum between two numbers.
Notice the order of the debugging messages we printed every time we compare two ratings in the MaxBetween() method.
On the first iteration, the Aggregate() method executes the MaxBetween() aggregating function using the seed (0f) and the first element (“Titanic” with 4.5) as parameters.
[Figure: Aggregate first iteration]
Next, it calls MaxBetween() with the previous result (4.5) as the maxSoFar and the next element of the collection (“The Fifth Element” with 4.6f).
[Figure: Aggregate second iteration]
In the last iteration, Aggregate() calls MaxBetween() with the maxSoFar from all previous iterations and the last element (“My Neighbor Totoro” with 5). And it returns the last value of maxSoFar as a result.
[Figure: Aggregate last iteration]
In our example, we used Aggregate() with a seed. But Aggregate() also has an overload without one. In that case, it uses the first element of the collection as the seed. Also, Aggregate() has another overload with a third parameter to transform the result before returning it.
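Here is a short sketch of these two overloads with made-up data. One thing to keep in mind: without a seed, Aggregate() throws an InvalidOperationException if the collection is empty.

```csharp
using System;
using System.Linq;

var names = new[] { "Titanic", "Avatar", "Platoon" };

// Without a seed: "Titanic" is the first accumulated value
var joined = names.Aggregate((soFar, name) => soFar + "," + name);
Console.WriteLine(joined);
// Output:
// Titanic,Avatar,Platoon

var ratings = new[] { 4, 5, 3 };

// With a seed and a third parameter to transform the final result
var summary = ratings.Aggregate(
    0,
    (sumSoFar, rating) => sumSoFar + rating,
    sum => $"Total: {sum}");
Console.WriteLine(summary);
// Output:
// Total: 12
```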
Voilà! That’s how the Aggregate method works. Remember, it returns an aggregated value from a collection instead of another collection. This is one of those methods we don’t use often. I’ve used it only a couple of times. One of them was in my parsing library, Parsinator, to apply a list of modification functions on the same input object.
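To give an idea of that last use case, here is a minimal sketch. It is not the actual Parsinator code: the list of functions and the input are made up. The input is the seed, and every function receives the result of the previous one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical modification functions, applied in order
var modifications = new List<Func<string, string>>
{
    text => text.Trim(),
    text => text.ToUpperInvariant(),
    text => text.Replace(" ", "-")
};

var result = modifications.Aggregate(
    "  monday links ",
    (current, modify) => modify(current));

Console.WriteLine(result);
// Output:
// MONDAY-LINKS
```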