# ZNProjects

Software is beautiful again!

# AOC 2016 - Day 5

Problem statement on Advent of Code website.

Code for this post can be found on GitHub.

# Background

• On Day 1, we located Easter Bunny headquarters.
• On Day 2, we broke the code to use the bathroom in Easter Bunny HQ.
• On Day 3, we helped the EBs' design department by removing invalid triangles from the list.
• On Day 4, we figured out which rooms in the information kiosk are valid.

On Day 5, we will figure out how to get past security doors by calculating the code.

# Problem - Part 1

For Part 1, we are given two pieces of information.

1. The door ID
2. An increasing integer index, starting at `0`.

For each MD5 hash that starts with `00000` (five `0`'s), the 6th digit indicates the next character of the password. If we find eight hashes that start with five `0`'s, that provides the password.

# Solution - Part 1

My strategy to solve this problem was two-fold.

1. Write an MD5 algorithm in F#.
1. I outlined my solution for a functional MD5 algorithm in F# in a previous blog post.
2. Write a function to collect the required hashes and construct the final answer.

The vast majority of the functionality for this solution is in the MD5 algorithm, which I won't describe here again.

## Collection function

Once I had an MD5 algorithm, I used the `Seq` module and its library functions to collect the required hashes and calculate the door password.

The strategy for the function is simple.

1. Start by producing indices starting at `0`.
2. Map all the strings to hashes.
3. Remove all hashes that don't match the criteria (i.e. hashes that don't start with `00000`).
4. Once we have eight hashes, stop producing indices and get the final answer.
``````let day5part1input = "cxdnnyjw"

let day5part1criteria = "00000"

let day5part1 input crit (md5sum:string -> string) =
Seq.initInfinite (fun i -> input + string i)
|> Seq.map md5sum
|> Seq.filter (fun m -> m.StartsWith crit)
|> Seq.take 8
|> Seq.map (Seq.item 5 >> string)
|> Seq.reduce ( + )``````

When I ran the function, I got `F77A0E6E`, which is the correct answer.

# Problem - Part 2

The problem statement for Part 2 still uses the MD5 function, but adds a twist to the password computation. We still have the same information from the problem.

1. The door ID
2. An increasing integer index, starting at `0`.

We are still only interested in hashes that start with five `0`'s, i.e. `00000`. The sixth character of the hash now indicates the position in the solution and the seventh character is what goes into that position.

If we get multiple hashes that meet the criteria and indicate the same position, all but the first match are ignored. Hashes that meet the criteria but indicate an invalid position, i.e. anything other than `0`-`7`, are also ignored.

# Solution - Part 2

To solve Part 2, I wanted to use a similar approach to Part 1. However, I needed to way to determine when the algorithm had collected enough values to form a complete answer.

My overall strategy was as follows.

1. Start by producing indices starting at `0`.
2. Map all the strings to hashes.
3. Remove all hashes that don't match the criteria.
4. Once I have a character for each position of the answer, stop producing indices and get the final answer.

## Storing the pieces of the answer

I chose to use a `Dictionary<int, string>` collection to hold the values that would form the answer. The keys represent the position in the answer and the values are the single-character strings that, when concatenated, would form the answer.

First, I created a value that would hold the relevant pieces.

``let answer = Dictionary<int, string> 8``

Second, I created an `add` function that would only add values to the `Dictionary` if a value for the key was not already present.

``````let add (d:Dictionary<_,_>) kv =
if not <| d.ContainsKey(fst kv) then
else ()``````

Third, I created a `notComplete` function to check whether I had collected enough values for an answer.

``let notComplete (d:Dictionary<_,_>) = d.Count < 8``

## Collecting the pieces of the answer

Once I had decided on a storage strategy, I wrote a function to add values to the `Dictionary` until I had enough to get an answer.

The overall strategy for this part of the logic depends on the `Seq.takeWhile` function. This function will terminate the infinite sequence generator once the `Dictionary` is "full".

``````Seq.initInfinite (fun i -> input + string i)
|> Seq.map md5sum
|> Seq.filter
(fun m ->
(m.StartsWith crit)
&& (m.[5] = '0' || m.[5] = '1' || m.[5] = '2' || m.[5] = '3'
|| m.[5] = '4' || m.[5] = '5' || m.[5] = '6' || m.[5] = '7'))
|> Seq.map (fun m -> m.[5] |> string |> System.Int32.Parse, m.[6] |> string)
|> Seq.takeWhile (fun _ -> notComplete answer)

As the last step, I pulled the values from the `Dictionary` and concatenated the strings to get the final answer.

I chose to explicitly pull the values based on key lookups because, while there are library functions to get a collection of the values, I did not see any guarantees about the order that the items would be in.

``````[
for i in 0 .. 7 do
]
|> List.reduce ( + )``````

When I ran this function, I got back `999828EC` which is, in fact, the correct answer.

# Some side notes

## Mutable vs. Immutable data structures

As you may have noticed, I am using a mutable data structure to store the needed values for the Part 2 solution. At the time, I struggled to find a "pure" functional way of solving the problem but couldn't come up with something off hand (and, to tell you the truth, I didn't try too hard since I had just finished the MD5 algorithm and wanted to be done with this problem).

However, the Part 2 algorithm can easily be rewritten using a tail-recursive function and immutable data structures. The tail-recursive part is important because, while I don't know how many hashes the algorithm has to go through, I do know that even the OOTB MD5 algorithm takes over 2 minutes to calculate the answer (and my functional MD5 algorithm takes over 7 minutes). From that, I assume that the stack would get very deep if the function was not tail-recursive.

## `Seq` vs. `ParallelSeq`

In order to speed up the calculation, I attempted to use the `ParallelSeq` library to take advantage of my CPU's multiple cores. Using the `ParallelSeq` library did speed up my calculations considerably.

However, one thing I learned is that the `ParallelSeq` library does not take care of putting the calculated values back in their original order. Instead, it is the client's responsibility to track the order of values (if that is important - and in this case it was).

As an example, when I used the ParallelSeq library, the system produced `C9E29889` as the answer for Part 2. However, the correct answer for Part 2 is `999828EC`, which are the same characters but in the correct order.

In the end, I removed `ParallelSeq` from this solution but it is a library that I intend to investigate in the future.

# Lessons learned

Here are the lessons I learned from this exercise.

1. Take the time to implement a solution the way you want, so that you do not later have regrets about it. In this instance, I regret choosing a mutable data structure and imperative algorithm to solve Part 2 and it's something I hope to rectify in the near future.
2. Just as when I implemented the MD5 algorithm in F#, it is critical to be extra vigilant when reading code. Much more so than any other technique, reading and understanding code is the fastest way to find and fix most bugs.

See you next time!

# PS: Solution - Part 2 (Pure functional implementation)

After I wrote this blog post, but before posting it, I decided to rewrite the Part 2 solution using immutable data structures.

Rewriting the function was quite simple, and relied on the following strategy.

1. Define a completion check to terminate the recursive function.
2. Define a tail-recursive function to calculate the 8 characters of the answer.
3. Combine the 8 characters to form the final answer.

Here is the code for the pure functional implementation of Part 2, broken into 3 chunks.

``````let day5part2alt input crit (md5sum : string -> string) =
let complete l = l |> List.filter ((=) "") |> List.isEmpty

let rec collector i acc =
if complete acc then acc
else
let m = md5sum (input + string i)
if (m.StartsWith crit) then
match m.[5] with
| '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' ->
List.mapi (fun idx elt ->
if elt = "" && idx = (m.[5] |> string |> System.Int32.Parse) then
string m.[6]
else elt) acc
|> collector (i + 1)
| _ -> collector (i + 1) acc
else collector (i + 1) acc

[ ""; ""; ""; ""; ""; ""; ""; "" ]
|> collector 0
|> List.reduce (+)``````

Running this function provides the same answer as above, i.e. `999828EC`.

## Performance comparison with `day5part2`

I compared `day5part2` and `day5part2alt` to see whether my new implementation matches up to the previous one.

On average, I received the following results.

`day5part2` vs. `day5part2alt`, raw numbers.
Function Real CPU Gen0 Gen1 Gen2

`day5part2`

174.758

174.689

21623

18

2

`day5part2alt`

166.528

166.437

20590

17

1

And here are the same results, converted to percentages.

`day5part2` vs. `day5part2alt`, percentages.
Function Real CPU Gen0 Gen1 Gen2

`day5part2`

105%

105%

105%

106%

200%

`day5part2alt`

100%

100%

100%

100%

100%

Ignoring the Gen2 result, the mutable version is actually 5% worse across the board, in both processing time and increased garbage collection. It pays to be immutable.

See you next time!