Thursday, November 10, 2016
I want to take this thought a step further, and as implied by the post title, do a group by.
To start, here is an orderby on n % 2 that gives us a list of the even numbers followed by the odd numbers:
- int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
- var orderedNumbers = from n in numbers
- orderby n % 2 == 0 descending
- select n;
- foreach (var g in orderedNumbers)
- {
- Console.Write("{0},", g);
- }
This is all pretty straightforward: we order by whether each number modded by 2 is 0, and we get the numbers 4,8,6,2,0,5,1,3,9,7.
But what if I want to simply have two lists, one with evens and one with odds? That’s where group by comes in.
- int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
- var numberGroups = from n in numbers
- group n by n % 2 into g
- select new { Remainder = g.Key, Numbers = g };
- foreach (var g in numberGroups)
- {
- if (g.Remainder.Equals(0))
- Console.WriteLine("Even Numbers:");
- else
- Console.WriteLine("Odd Numbers:");
- foreach (var n in g.Numbers)
- {
- Console.WriteLine(n);
- }
- }
with the output:
- Odd Numbers:
- 5
- 1
- 3
- 9
- 7
- Even Numbers:
- 4
- 8
- 6
- 2
- 0
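What’s happening here is that LINQ groups the numbers by key, and the select wraps each group in an anonymous type. The result acts like a dictionary of lists (actually a System.Linq.Enumerable.WhereSelectEnumerableIterator<TSource, TResult> whose source elements are IGrouping<int, int> objects).
It is important to note that the key everything is grouped on is the expression after the “by”.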
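For what it’s worth, the same grouping can be sketched in method syntax. This is only a minimal sketch: the GroupingSketch class and the GroupByRemainder helper are names made up for illustration, and the ToDictionary call is there just to make the "keyed on the value after the by" idea explicit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupingSketch
{
    // Hypothetical helper: groups numbers by their remainder when divided by 2.
    // The lambda passed to GroupBy plays the role of the expression after "by".
    public static Dictionary<int, List<int>> GroupByRemainder(int[] numbers)
    {
        return numbers
            .GroupBy(n => n % 2)
            .ToDictionary(g => g.Key, g => g.ToList());
    }

    public static void Main()
    {
        int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
        var byRemainder = GroupByRemainder(numbers);

        Console.WriteLine(string.Join(",", byRemainder[0])); // 4,8,6,2,0
        Console.WriteLine(string.Join(",", byRemainder[1])); // 5,1,3,9,7
    }
}
```

Each IGrouping carries its Key plus the elements that produced that key, in the order they appeared in the source.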
Taking this one simple step forward, let’s group a bunch of words by their first letter. The following doesn’t work quite right:
- string[] words = { "blueberry", "Chimpanzee", "abacus", "Banana", "apple", "cheese" };
- var wordGroups = from w in words
- group w by w[0] into g
- select new { FirstLetter = g.Key.ToString().ToLower(), Words = g };
- foreach (var g in wordGroups)
- {
- Console.WriteLine("Words that start with the letter '{0}':", g.FirstLetter);
- foreach (var w in g.Words)
- {
- Console.WriteLine(w);
- }
- }
giving us the output:
- Words that start with the letter 'b':
- blueberry
- Words that start with the letter 'c':
- Chimpanzee
- Words that start with the letter 'a':
- abacus
- apple
- Words that start with the letter 'b':
- Banana
- Words that start with the letter 'c':
- cheese
That’s because there is a bit of a red herring here: the ToLower() in the select only lowercases the label that gets printed, not the key. Remember that the expression after the “by” is what is used to group. In our case w[0] for Chimpanzee is “C”, not “c”. If we change it to:
- string[] words = { "blueberry", "Chimpanzee", "abacus", "Banana", "apple", "cheese" };
- var wordGroups = from w in words
- group w by w[0].ToString().ToLower() into g
- select new { FirstLetter = g.Key.ToString().ToLower(), Words = g };
- foreach (var g in wordGroups)
- {
- Console.WriteLine("Words that start with the letter '{0}':", g.FirstLetter);
- foreach (var w in g.Words)
- {
- Console.WriteLine(w);
- }
- }
then we get the results we expect with:
- Words that start with the letter 'b':
- blueberry
- Banana
- Words that start with the letter 'c':
- Chimpanzee
- cheese
- Words that start with the letter 'a':
- abacus
- apple
Taking this even one step further we can throw an orderby above the group and order things alphabetically:
- var wordGroups = from w in words
- orderby w[0].ToString().ToLower()
- group w by w[0].ToString().ToLower() into g
- select new { FirstLetter = g.Key.ToString().ToLower(), Words = g };
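Another way to get the groups out alphabetically is to order the groups themselves by their key after grouping. Here is a rough sketch in method syntax, assuming every word is non-empty; the OrderedGroupSketch class and GroupByFirstLetter helper are hypothetical names, and char.ToLowerInvariant is swapped in for ToString().ToLower().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderedGroupSketch
{
    // Hypothetical helper: groups words by lowercased first letter,
    // then sorts the groups by that key. Assumes no empty strings.
    public static List<IGrouping<char, string>> GroupByFirstLetter(string[] words)
    {
        return words
            .GroupBy(w => char.ToLowerInvariant(w[0]))
            .OrderBy(g => g.Key)
            .ToList();
    }

    public static void Main()
    {
        string[] words = { "blueberry", "Chimpanzee", "abacus", "Banana", "apple", "cheese" };

        foreach (var g in GroupByFirstLetter(words))
        {
            Console.WriteLine("{0}: {1}", g.Key, string.Join(", ", g));
        }
        // Prints:
        // a: abacus, apple
        // b: blueberry, Banana
        // c: Chimpanzee, cheese
    }
}
```

Sorting the groups only has to compare one key per group, and the words inside each group keep their original order.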
So now let’s make this a bit over-the-top complex. Given the classes:
- public class Customer
- {
- public string CompanyName { get; set; }
- public List<Order> Orders { get; set; }
- }
- public class Order
- {
- public DateTime OrderDate { get; set; }
- public int Total { get; set; }
- }
let’s group a customer list by customer, then by year, then by month:
- List<Customer> customers = GetCustomerList();
- var customerOrderGroups = from c in customers
- select
- new {c.CompanyName,
- YearGroups = from o in c.Orders
- group o by o.OrderDate.Year into yg
- select
- new {Year = yg.Key,
- MonthGroups = from o in yg
- group o by o.OrderDate.Month into mg
- select new { Month = mg.Key, Orders = mg }
- }
- };
Whew! That took a lot to copy and paste from MSDN’s sample library!
As mentioned previously, the important part here is that the key for each group is the expression after the “by”. This just creates a bunch of dictionaries nested inside each other, each keyed on the value after its “by”.
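To make the nesting a bit more concrete, here is a small sketch that builds real nested dictionaries (year, then month) for a single customer. Everything in it is an assumption for illustration: the "Acme" customer and its orders are made-up sample data, the NestedGroupSketch class and TotalsByYearAndMonth helper are hypothetical names, and summing the totals is just one thing you might do with the innermost groups.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Order
{
    public DateTime OrderDate { get; set; }
    public int Total { get; set; }
}

public class Customer
{
    public string CompanyName { get; set; }
    public List<Order> Orders { get; set; }
}

public static class NestedGroupSketch
{
    // Made-up sample data, for illustration only.
    public static Customer SampleCustomer()
    {
        return new Customer
        {
            CompanyName = "Acme",
            Orders = new List<Order>
            {
                new Order { OrderDate = new DateTime(2015, 3, 1), Total = 10 },
                new Order { OrderDate = new DateTime(2015, 3, 15), Total = 20 },
                new Order { OrderDate = new DateTime(2016, 7, 4), Total = 30 }
            }
        };
    }

    // Outer dictionary is keyed on year, inner dictionary on month.
    // Each inner value is the sum of the order totals for that month.
    public static Dictionary<int, Dictionary<int, int>> TotalsByYearAndMonth(Customer c)
    {
        return c.Orders
            .GroupBy(o => o.OrderDate.Year)
            .ToDictionary(
                yg => yg.Key,
                yg => yg.GroupBy(o => o.OrderDate.Month)
                        .ToDictionary(mg => mg.Key, mg => mg.Sum(o => o.Total)));
    }

    public static void Main()
    {
        var totals = TotalsByYearAndMonth(SampleCustomer());
        Console.WriteLine(totals[2015][3]); // 30
        Console.WriteLine(totals[2016][7]); // 30
    }
}
```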
The GroupBy method that is a part of LINQ can also take an IEqualityComparer<T>. Given the comparer:
- public class AnagramEqualityComparer : IEqualityComparer<string>
- {
- public bool Equals(string x, string y)
- {
- return getCanonicalString(x) == getCanonicalString(y);
- }
- public int GetHashCode(string obj)
- {
- return getCanonicalString(obj).GetHashCode();
- }
- private string getCanonicalString(string word)
- {
- char[] wordChars = word.ToCharArray();
- Array.Sort(wordChars);
- return new string(wordChars);
- }
- }
we can find all the matching anagrams. This is possible because the IEqualityComparer compares words based on a sorted array of characters. If you take “meat” and “team” they both become “aemt” when sorted by their characters.
- string[] anagrams = { "from", "salt", "earn", "last", "near", "form" };
- var orderGroups = anagrams.GroupBy(
- w => w.Trim(),
- a => a.ToUpper(),
- new AnagramEqualityComparer()
- );
- foreach (var group in orderGroups)
- {
- Console.WriteLine("For the word \"{0}\" we found matches to:", group.Key);
- foreach (var word in group)
- {
- Console.WriteLine(word);
- }
- }
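If a custom comparer feels like overkill, roughly the same grouping can be had by using the sorted characters directly as the key. This is only a sketch of that alternative, not a replacement for the comparer approach: the AnagramKeySketch class and its Canonical and GroupAnagrams helpers are names made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnagramKeySketch
{
    // Hypothetical helper: sorts a word's characters so that
    // anagrams end up with the same string.
    public static string Canonical(string word)
    {
        char[] chars = word.ToCharArray();
        Array.Sort(chars);
        return new string(chars);
    }

    // Groups words on their canonical form, so no IEqualityComparer is needed.
    public static Dictionary<string, List<string>> GroupAnagrams(string[] words)
    {
        return words
            .GroupBy(w => Canonical(w))
            .ToDictionary(g => g.Key, g => g.ToList());
    }

    public static void Main()
    {
        string[] anagrams = { "from", "salt", "earn", "last", "near", "form" };
        var groups = GroupAnagrams(anagrams);

        Console.WriteLine(string.Join(",", groups["fmor"])); // from,form
        Console.WriteLine(string.Join(",", groups["alst"])); // salt,last
        Console.WriteLine(string.Join(",", groups["aenr"])); // earn,near
    }
}
```

The trade-off is that the group key becomes the sorted string rather than the first word seen, which is what the comparer version keeps as its Key.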