Trigger CI/CD Rebuild Without Trivial Changes

Recently, someone on my team asked how they could trigger a rebuild of a branch on our continuous delivery/integration agent.

The initial suggestion was to introduce a trivial change - such as adding or removing whitespace in the README. This is a viable option, and one I would have also suggested; at least, until I saw a method that does not require any changes.

I could not remember the specifics of this method though. To make matters worse, I was having trouble locating the information again. This post is to prevent this situation from happening again.

How

Simply put, you tell git to allow an empty commit.

git commit --allow-empty -m "Trigger Rebuild"

What other useful git nuggets exist that people may not be aware of?

References

I’ve used git for years, but I just realized I can make empty commits to trigger CI pipelines:

git commit --allow-empty -m "Updated upstream code"

- Major Hayden


Usually recording a commit that has the exact same tree as its sole parent commit is a mistake, and the command prevents you from making such a commit. This option bypasses the safety, and is primarily for use by foreign SCM interface scripts.

- git-scm

2019 Year in Review

It is time to look over 2019 to see if or how I have grown, what or where I put my focus, and determine whether or not I need to re-align myself for the coming year.

Blog

I migrated from BitBucket Pages to GitLab Pages for various reasons. There were a few hiccups along the way but overall it seems to have been a pretty smooth transition.

Part of my goal after the migration was to consistently produce content. Perhaps I was too ambitious with my goals/expectations. I was able to produce consistently throughout October but fell off in November due to numerous issues and frustrations with my Raspberry Pi-Hole project.

Over this next year, I would like to keep the goal of producing content regularly and to personalize and/or create a blog theme. At this time, I want to focus the content more on how-tos and notes, and improve my communication.

Books

One of my goals for 2019 was to read 12 books. I thought a book a month was reasonable and achievable. And for the first quarter of the year, it seemed like it was going to be. I was able to make my way through the following:

I am not sure what happened after that to cause me to get off track. I started reading Getting Things Done to re-emphasize some of the techniques its author provides. The irony is not lost on me that this was the book to break my trend.

Although important, I am not sure I want to have a reading goal for 2020 - except for finishing Getting Things Done.

Pluralsight Courses

PluralSight may be part of the reason I was unable to complete my reading goal for 2019. Throughout 2019, I completed the following PluralSight Courses and started several others:

I planned on creating posts reviewing each one containing my notes to keep them in a central location. Unfortunately, I was not happy with the format/quality of the few courses that I did create reviews for. While I still want to create those entries, I need to figure out how to communicate the review effectively.

I also took the C# Skill IQ Evaluation and scored in the 95th percentile with a 248. I plan on improving this score in 2020, but also want to take courses to build other skills.

Programming Languages

I have worked with C# for a decade. While I love the language, I have started to feel as though the problem space has become stagnant and repetitive. It seems as though I am not the only one feeling this way either.

This is not to say that I think .NET is dying/dead. I still very much enjoy it and have much left to learn about it - particularly with the quality of life changes .NET Core is providing. I simply want to expand my thinking and skill set and learning a new programming language may be better suited to that.

Specifically, I am considering Go and Functional Programming. Begrudgingly Pragmatically, I am considering JavaScript/TypeScript and React/React Native.

Projects

I have no shortage of projects. My problem is in finishing them and/or making them public.

One of those projects this year was my Raspberry Pi-Hole, but I have not circled back to it yet. Once that project is completed, I plan on making a Raspberry Pi development server. The idea is for it to contain a dockerized Jenkins, Redmine, SonarQube, and/or other software used in my development lifecycle. It would be portable enough to bring with me on the go, or it could be configured with a VPN and accessed remotely.

In December, I started JHache.Enums to experiment with BenchmarkDotNet and write high-performance C# code.

Software Setup

On a day-to-day basis, I use:

  • Visual Studio
    • CodeMaid
    • Editor Guidelines
    • File Icons
    • File Nesting
    • Power Commands
    • Productivity Power Tools
    • Roslynator
    • Shrink Empty Lines
    • SonarLint
    • StyleCop
    • VSColorOutput
  • Visual Studio Code
  • SourceTree
  • LINQPad

I started trying to learn Rider.

With the announcement that Google Chrome would be making it more difficult for ad-blockers, I looked at alternatives. I tried Brave and Vivaldi, but due to several issues, I am switching back to Firefox.

I looked at Fork and GitKraken as SourceTree replacements. GitKraken is my favorite; if I only had a single account, it would probably be my daily driver. Overall, Fork looks like a good replacement for SourceTree, but I have not spent enough time with it. I can say its merge tool is one of the best.

For productivity, I settled on TickTick for task management and Dynalist as my work journal.

I gave up on trying to get HyperJS to work the way I wanted. Initially, I tried Windows Terminal but ended up returning to Cmder.

I am starting to use Docker for Windows but still need more exposure to using it.

Tech Setup

I upgraded my wife’s computer so she could do her design schoolwork. Given the programs she needs to run and that she does not game too much, I opted for an AMD Ryzen 3700X and an NVIDIA 2070 Super. She also got a Secret Lab Omega.

My computer (~10 years old) and desk are due for an upgrade this year. Additionally, we are looking to soundproof my office to get a streaming setup started.

Work

I accepted a new opportunity working on a WPF Prism application (as evidenced by blog posts and Pluralsight history).

Lacking

Going over this year I realize three areas are lacking: family, relaxation, and exercise. These are areas I will need to make time to focus on in 2020.

AdGuard Home Initial Setup

Introduction

I have been struggling to get ArchARM set up on my Raspberry Pi. I believe I have identified the root cause of my current issue; however, I could be wrong or still have more issues to troubleshoot and resolve. Without the Raspberry Pi, I am unable to configure my router to use Pi-Hole or AdGuard Home as a network-wide ad blocker.

Fortunately, AdGuard provides a few public DNS servers that I can use to see whether it blocks ads until I can get my Raspberry Pi set up. Unfortunately, there will not be any back-end dashboard to administer block lists or see blocked traffic. Since ads are so prevalent these days, it will be easy enough to tell whether they are being blocked.

Baseline

To gauge the effectiveness, a few sites have to be checked before the settings are applied.

eBay

Forbes

CNN

Ads-Blocker

Setup

The next step is to configure the router’s DNS to use AdGuard Home’s DNS servers. AdGuard Home’s Knowledge Base provides the steps to accomplish this:

  1. Open the preferences for your router. Usually, you can access it from your browser via a URL (like http://192.168.0.1/ or http://192.168.1.1/). You may be asked to enter the password. If you don’t remember it, you can often reset the password by pressing a button on the router itself. Some routers have a specific application, which should already be installed on your computer in that case.

  2. Find the DNS settings. Look for the 'DNS' letters next to a field which allows two or three sets of numbers, each broken into four groups of one to three numbers.

  3. Enter our DNS server addresses there: 176.103.130.130 and 176.103.130.131

I have a LinkSys WRT32 router and found this under Advanced Settings > Internet Connection Settings.

All I had to do was change this to Custom and provide the DNS server addresses.

Verification

All that is left to see is how it performs using the same sites as before - using a hard refresh to prevent cached ads from being served.

eBay Verification

Forbes Verification

CNN Verification

Ads-Blocker Verification

Conclusion

It seems as though some ads are being blocked, but I am surprised at how many are still being served. To me, this looks like the ad-blocking lists need to be updated, but I cannot say for certain without installing it myself.

I still will be loading Pi-Hole or AdGuard Home on my Raspberry Pi as this keeps my data in-house. As much as I love what I have seen of AdGuard Home’s admin dashboard, this experience is not reassuring me of its effectiveness. In the end, the more effective product will be the one I use.

GitLab Pages Theme Submodules

Introduction

At the end of the GitLab Pages Setup post, I described an issue I encountered with GitLab CI generating the static site. The issue is caused by Hexo themes being linked as Git submodules - which means the default theme configuration is used. At the time, I only had a single idea - but was not sure whether or not it would work. Since then I have had a few more ideas about possible solutions and this post will describe them.

Overwrite Default Theme Configuration

The most straightforward approach was to add the modified configuration file to the repository. Using the GitLab build configuration, the configuration file could be copied to the themes folder overwriting the default theme configuration.

The only benefit of this approach is its simplicity. Adding the file is trivial and the modified GitLab build configuration would look something like this:

image: node:10.15.3

variables:
  GIT_SUBMODULE_STRATEGY: recursive

cache:
  paths:
    - node_modules/

before_script:
  - npm install hexo-cli -g
  - test -e package.json && npm install
  - mv themes/_config.yml themes/icarus/_config.yml
  - hexo generate

pages:
  script:
    - hexo generate
  artifacts:
    paths:
      - public
  only:
    - master

In case it isn’t clear because it’s only a line, the change is:

- mv themes/_config.yml themes/icarus/_config.yml

This approach has a few problems though. The first inconvenience is that if the theme allows configuring branding images or avatars that use a path within the theme's structure, those files must also be copied or moved into the theme after the submodule has been initialized. This could probably be overcome by mirroring the theme's folder structure and modifying the command to move a whole folder into the theme during the build process. Changing themes also requires updating the theme configuration file, updating the GitLab build configuration file, and adding the theme as a submodule reference. Most troubling, however, is that if the submodule is updated, there is no way to detect conflicts or updates that may break the theme except at runtime. Overall, this does not seem like a good approach anymore.

Include Theme Contents

The second option could be to include the theme contents directly instead of referencing them as a submodule. Doing so would eliminate the first two issues described with the previous method above. However, it still suffers from the final issue, although in a slightly different form: updates are now a completely manual process, which will likely entail overwriting the files - potentially leaving orphaned files behind. Additionally, it adds bloat to the Hexo content repository that probably is not necessary. Overall, this solution is better than the previous one but still less than ideal.

Fork and Update

The final idea I had is to fork the theme being used. Updates can be made to this new repository that is specific to the Hexo instance. Updates can be applied by merging the original master repository into this forked copy. If the update changes something unexpected, a conflict will occur that the user will have to resolve to finish the merge. The Hexo content repository would then have the Git submodule reference the forked copy. Another great part about this is that the submodule can be edited directly and changes committed for it from the parent module.

Summary

I think the final solution eliminates most of the major risks associated with the other options and is what I will be using. I can even make my forked repository private and have the GitLab runner still able to access it thanks to GitLab’s CI build permissions. The only differences are that my submodule name will need to be the project name and the URL will need to be changed from an absolute URL to a relative URL:

[submodule "hexo-theme-icarus"]
    path = themes/icarus
    url = ../hexo-theme-icarus.git

The other solutions should work fine but they seemed wrong for one reason or another. Choose whichever option has risks that you can live with.

GitLab Pages Setup

Introduction

Depending on the referrer (Twitter), it may be easy to find out that I used to publish articles on BitBucket pages. I initially chose BitBucket pages because they were the only repository provider that supported free Private Repositories. However, I am looking at alternatives because BitBucket does not support domain forwarding (anymore) for BitBucket pages.

This is a problem because if BitBucket were to go out of business (which doesn’t seem likely but hypothetically) then all my links die with them. While I appreciate the ‘free’ server, I am also not keen on my URL structure being a sub-domain of theirs - it just seems unprofessional of me. This is why I never really considered options like WordPress.com or BlogSpot. There is nothing wrong with these companies or the services they provide. On the contrary, it is great that they provide these services because it opens up options for people to choose from.

My biggest issue though is that if I were to switch service providers (which I am doing), I would have to either edit all previously published links to use the new URL or abandon them. Neither option is good. This is what prompted me to look into service providers that support domain forwarding so that the platform I publish my articles to is irrelevant.

The two big contenders I know about are GitHub Pages and GitLab Pages.

Why GitLab?

I ended up settling on GitLab because I am more familiar with their organization setup and I wanted to get some exposure to their CI/CD pipeline. GitHub may be a viable option for CI/CD with the release of GitHub Actions but as of this time, it is still in beta. GitHub’s major appeal for me was how simple it seems to set up a custom domain and their documentation is superb. I had a few issues with GitLab, some of which seemed to stem from NameCheap, my domain registrar.

GitLab does have static site generator templates that can be forked as a starting point.

While you can create a project from scratch, let’s keep it simple and fork one of your favorite example projects to get a quick start. GitLab Pages works with any static site generator.

This is also visible when creating a new project:

My pain may have been less had I used one of these.

Setup

The first step is to create a new project on GitLab. This project should follow the naming convention of organization/user.gitlab.io. As embarrassing as it is, I have to admit that this was my first mistake. I am not sure why, since BitBucket follows a similar naming convention, but for whatever reason, I named the project jhache.me.

In case others make this same mistake, it is not the end of the world. Go into General Settings and rename the project.

This does not change the URL however, so it may be a good idea to update this before progressing as well. In the General Settings area expand the Advanced section and look for the Change Path section taking note of the warnings:

This should allow the project to be referenced directly with the GitLab URL (such as www.gitlab.com/jhache/jhache.gitlab.io). I am not sure if this is necessary though since I was troubleshooting 404 errors that may have been caused by the next step not being run.

With the repository created, the local Hexo folder should be committed and pushed. This differs from my previous workflow on BitBucket where I had a repository containing the Hexo folder and another repository containing only the site content that I would hexo deploy to. Since I want to leverage the GitLab CI/CD this is a necessary change.

CI/CD

GitLab CI/CD uses a YAML configuration file called .gitlab-ci.yml, much like Appveyor or TravisCI. The file tells the CI/CD pipeline how to build the repository. This file can be copied from the Hexo Pages Template:

image: node:10.15.3

cache:
  paths:
    - node_modules/

before_script:
  - npm install hexo-cli -g
  - test -e package.json && npm install
  - hexo generate

pages:
  script:
    - hexo generate
  artifacts:
    paths:
      - public
  only:
    - master

Note: If this is a migration (as in my case) from another service provider, be sure to commit the contents of your local repository (if you want to save the history) before doing this or you will have conflicting heads that you will need to resolve somehow.

Since Hexo themes are added as Git submodules (if done according to the Hexo documentation), using one required a change to the configuration file to checkout the submodules. The final configuration file up to this point looks like this:

image: node:10.15.3

variables:
  GIT_SUBMODULE_STRATEGY: recursive

cache:
  paths:
    - node_modules/

before_script:
  - npm install hexo-cli -g
  - test -e package.json && npm install
  - hexo generate

pages:
  script:
    - hexo generate
  artifacts:
    paths:
      - public
  only:
    - master

Committing this file should trigger a GitLab build. Once complete, the GitLab page should be able to be accessed from the name template described above (jhache.gitlab.io). If not, troubleshoot this issue before continuing.

Custom Domains

As explained above, GitLab supports routing (multiple) custom domains to Pages. This is configured in the Pages Settings of the GitLab repository.

Add a New Domain:

Once done, a screen will appear with details that must be taken to a Domain Registrar to configure:

This must also be done for www domains since www is technically a subdomain.

I use NameCheap as my Domain Registrar so the remaining steps will use their Advanced DNS page for the domain. Other Domain Registrars may have a similar setup but I cannot guarantee this. GitLab does provide documentation from some Domain Registrars but NameCheap was not one of them. Fortunately, I was able to find one article that helped me configure my DNS Host Records:

The records I added were:

  • The A record redirects jhache.me to GitLab Pages.
  • The first CNAME record aliases jhache.gitlab.io as jhache.me.
  • The second CNAME record aliases jhache.gitlab.io as www.jhache.me.
  • The first TXT record is the verification code provided by the Pages Domain Details for jhache.me.
  • The second TXT record is the verification code provided by the Pages Domain Details for www.jhache.me.

Since the Pages Domain Details contain the raw TXT record that should be used, only a substring of that value goes into the NameCheap user interface. The TXT records should be left in place so that certbot can verify ownership of the domain when certificates are regenerated - and in case the domain name is transferred.

Once your domain has been verified, leave the verification record in place: your domain will be periodically reverified, and may be disabled if the record is removed.

Depending on the TTL settings for the host record, this could take some time to propagate. You can use the dig command or Online dig to check the DNS records associated with a domain.

Back in the Pages Domain Details refresh the verification until it is green. Once verified, wait for a certificate to be generated - the Pages domain settings will look like this when it has been acquired:

Once acquired, the Force HTTPS setting can be enabled. With that, requests to the configured domain should be redirected to GitLab Pages over HTTPS.

Outstanding Issues

Because GitLab CI/CD checks out themes as part of the build process, my theme configuration settings have been lost. I have an idea to resolve this by modifying the .gitlab-ci.yml to copy the desired _config.yml from a directory in the main repository, but I have not verified if this will work.

Prism Module InitializationMode Comparison

Introduction

As part of my self-improvement challenge, I have been watching the Introduction to Prism course from Pluralsight. I chose this course so I am better equipped for my team’s Prism application project at work where I was recently tasked to improve the startup performance.

At this time, the project contains sixty-nine IModule implementation types; however, that number continues to grow. Not all of these modules are loaded at once, and some of them may not be used or loaded at all. Some are conditionally loaded at runtime when certain criteria are met.

While watching the Initializing Modules video, I found myself wondering if anything would change if I switched these conditionally loaded modules' InitializationMode from the default WhenAvailable to OnDemand. My reasoning is that, in the video, Brian Lagunas explains that WhenAvailable initializes modules as soon as possible, while OnDemand initializes them only when the application needs them. Brian recommends using OnDemand if the module is not required to run, is not always used, and/or is rarely used.
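
To make the difference concrete, here is a minimal sketch of what the two modes look like when populating a module catalog. The module names (CoreModule, ReportingModule) are placeholders I made up, not the project's actual modules, and they stand in for real IModule implementations:

using Prism.Modularity;

public static class ModuleCatalogSketch
{
    // Hypothetical placeholder types; the real project has sixty-nine IModule implementations.
    public class CoreModule { }
    public class ReportingModule { }

    public static void Configure(ModuleCatalog catalog)
    {
        // Initialized as soon as the module is available (the default behavior).
        catalog.AddModule(typeof(CoreModule), InitializationMode.WhenAvailable);

        // Registered now, but only initialized when the application asks for it.
        catalog.AddModule(typeof(ReportingModule), InitializationMode.OnDemand);
    }

    // When the conditional feature is actually needed, an injected IModuleManager loads it:
    //     moduleManager.LoadModule(nameof(ReportingModule));
}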

I have a few concerns:

  1. Impacting features because the module is not loaded beforehand or the Module initialization is not done manually.
  2. No performance impact because this project handles module initialization itself to parallelize it instead of letting Prism manage it.

In the end, only benchmarking each option will provide information to make a decision. To do this I used JetBrains dotTrace, focusing on the timings for App.OnStartup, Bootstrapper.Run, Bootstrapper.ConfigureModuleCatalog, and Bootstrapper.InitializeModules. Since we try to load modules in parallel, I ended up adding the timing for this as well - otherwise, the timing may have appeared off.

Baseline - InitializationMode.WhenAvailable

The first step was to gather baseline metrics.

| Timing (ms) | Profile #1 | Profile #2 | Profile #3 | Profile #4 | Profile #5 | Min | Average | Median | Max | STD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| App.OnStartup | 5845 | 4687 | 4220 | 4545 | 4973 | 4220 | 4854 | 4687 | 5845 | 551.6462635 |
| Bootstrapper.Run | 5954 | 3986 | 2598 | 3293 | 2779 | 2598 | 3722 | 3293 | 5954 | 1215.581013 |
| Bootstrapper.ConfigureModuleCatalog | 1148 | 767 | 363 | 511 | 1.5 | 1.5 | 558.1 | 511 | 1148 | 385.1511911 |
| Bootstrapper.InitializeModules | 184 | 109 | 117 | 85 | 71 | 71 | 113.2 | 109 | 184 | 39.0404918 |
| Asynchronous Module Initialization | 1821 | 2233 | 2311 | 2571 | 2564 | 1821 | 2300 | 2311 | 2571 | 274.6590614 |

Not terrible, but not ideal. The application splash screen is displayed for about 4.5 seconds on average on a developer machine with only a few conditional modules enabled.

InitializationMode.OnDemand

With the baseline determined, a comparison can be made when switching the modules to be loaded OnDemand.

| Timing (ms) | Profile #1 | Profile #2 | Profile #3 | Profile #4 | Profile #5 | Min | Average | Median | Max | STD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| App.OnStartup | 5419 | 3969 | 4391 | 5919 | 5490 | 3969 | 5037.6 | 5419 | 5919 | 733.0750575 |
| Bootstrapper.Run | 2770 | 2197 | 2017 | 2086 | 2238 | 2017 | 2261.6 | 2197 | 2770 | 266.0320281 |
| Bootstrapper.ConfigureModuleCatalog | 408 | 374 | 340 | 352 | 388 | 340 | 372.4 | 374 | 408 | 24.40983408 |
| Bootstrapper.InitializeModules | 143 | 67 | 69 | 69 | 66 | 66 | 82.8 | 69 | 143 | 30.1224169 |
| Asynchronous Module Initialization | 1926 | 1639 | 1699 | 1603 | 1632 | 1603 | 1699.8 | 1639 | 1926 | 117.3292802 |

All the Bootstrapper methods seemed to have improved, but overall the App.OnStartup took approximately the same amount of time.

Summary

There was an impact, but not in the overall startup time - which I find a little peculiar. It seems as though the overhead may have been shifted elsewhere in the startup process.

This may mean a hybrid approach to Bootstrapper.InitializeModules has merit, although not as much as I had hoped. Another option may be to change Bootstrapper.ConfigureModuleCatalog to conditionally decide whether to add modules instead of applying a 'safe' default. Or perhaps I am diagnosing the wrong problem and should look at other options - such as switching Dependency Injection frameworks.

In any case, I am going to discuss this as an option with my team - and see if additional testing can be done with more conditional modules enabled.

Increasing Productivity by Beating Procrastination Review

Today, I decided I was going to challenge myself to write a blog post and/or watch a Pluralsight module/course every day. I have been feeling stagnant lately and want to get back into improving myself. I cannot think of a better way to start that adventure than by writing a blog post about a Pluralsight course on productivity and overcoming procrastination.

Stephen Haunts put together the Increasing Productivity by Beating Procrastination course that was released on December 11, 2018.

One of the most significant threats to our productivity at work is procrastination and the difficulty in getting focused. This course will teach you how to understand procrastination and offer practical tips for beating the habit and getting focused.

The course has four main modules:

  1. What Is Procrastination?
  2. Understanding Procrastination
  3. Overcoming Procrastination
  4. Developing an Ability to Focus

What Is Procrastination?

In this module, Steven defines procrastination as:

The habit of putting off or delaying, especially something requiring immediate attention.

From my personal experience, this seems like an accurate definition.

He continues by outlining why he thinks we procrastinate:

  1. Fear of Failure
  2. Procrastinators are Perfectionists
  3. Low Energy Levels
  4. Lack of Focus

Fear of Failure and Perfectionism seem like the same thing. However, this is probably the biggest reason why I procrastinate. These are also the two that make the least sense. "Failure" is the best way to learn - it is how all children learn. Somewhere while growing up, this learning paradigm shifts into avoidance of failure.

“I have not failed. I’ve just found 10,000 ways that won’t work”

Low Energy Levels seem like they could have a contributing impact on productivity, particularly toward the beginning of the week. The phrase “a case of the Mondays” supports this:

symptoms of a useless or horrible Monday morning after returning from the weekend, used in the movie Office Space

Lack of Focus seems a little too open-ended considering the Attention Deficit Disorder society we live in. Everything is competing for focus at the same time and we only have a limited supply of it and willpower. Perhaps this is what the author means.

Understanding Procrastination

With procrastination defined, the author continues by identifying where procrastination occurs to help train your awareness of it. With this, there are some things we individually need to accept to decrease the chances of procrastination occurring:

  1. Accepting we are not perfect
  2. Understanding failure is not fatal
  3. Aim to do your best and be happy about the output
  4. Try to develop a healthier lifestyle to get more energy
  5. Go to bed earlier
  6. Reduce screen time before bedtime

The nature of the first few seems very Zen/Stoic.

Zen:

An approach to an activity, skill, or subject that emphasizes simplicity and intuition rather than conventional thinking or fixation on goals.

Stoic:

of or pertaining to the school of philosophy founded by Zeno, who taught that people should be free from passion, unmoved by joy or grief, and submit without complaint to unavoidable necessity.

The remaining ones seem related to each other but do play an important part in our lifestyles and self-improvement.

The overall message seems to be “do the best possible, reflect, and improve”.

Overcoming Procrastination

Stephen provides the following options to get rid of obstacles that lead to procrastination:

  1. Avoid the distraction (move away from the distraction)
  2. Blocking the distraction (prevent the distraction from occurring)
  3. Satisfy the need (hunger)
  4. Confront the distraction (environmental noise)
  5. Just start the task

Out of all of these, Just Start the Task has been the biggest boon to my productivity. I find that within five minutes of starting a task, I have overcome my procrastination.

It is for this reason, I appreciate techniques/frameworks/guidelines such as the Pomodoro Technique, Kanban, and Getting Things Done. These tools are what I use as the foundation for the habits that the author encourages building. For Stephen, creating a habit should have the following guidelines:

  • A productive mindset
  • Set goals (measurable and prioritized)
  • Identify tasks that can be turned into habits
  • Put a place and time for the habit (define your habits environment)
  • Remind yourself of the goal

Summary

Overall, the course did not provide me with any new insights or tools to help me overcome procrastination. But realistically, should there be? David Allen, the creator of Getting Things Done, admits in his book that he would not be teaching anything new but would be providing a framework that utilizes all we already know.

If nothing else, the course was a different perspective and reassurance that others suffer from procrastination. My biggest takeaway from the course will be “If you fail, forgive yourself, make adjustments, and try again.” Must be my perfectionism wanting to get it right the first time or not at all.

Project Euler #0001: Multiples of 3 and 5

Problem Description

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6, and 9. The sum of these multiples is 23.

Find the sum of all the multiples of 3 or 5 below 1000.

Simple Solution

The simplest solution is to iterate over all numbers up to the limit. If any of these numbers is a multiple of one of the factors then it is included in the summation.

public static ulong SumFactorMultiplesBelowLimit(int limit, params int[] factors)
{
    if (factors == null
        || !factors.Any())
    {
        throw new System.ArgumentException("Invalid factors.", nameof(factors));
    }

    ulong sum = 0;

    for (int i = 0; i < limit; i++)
    {
        foreach (int factor in factors)
        {
            if (i % factor == 0)
            {
                sum += (ulong)i;
                break;
            }
        }
    }

    return sum;
}

Timing the operation yields:

Minimum Elapsed: 00:00:00.0000114 Average Elapsed: 00:00:00.0000545 Maximum Elapsed: 00:00:00.0004331

Pretty quick, but that is most likely because the problem space is small. If the limit is increased, or if more factors are introduced, the number of operations performed increases. In big-O notation, this approach is $$ O(n * m) $$, where $$ n $$ is the limit and $$ m $$ is the number of factors.
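
For reference, a minimal usage sketch; the containing class name (ProjectEuler0001) is an assumption since the post does not show it:

using System;

public static class Program
{
    public static void Main()
    {
        // Multiples of 3 or 5 below 1000, per the problem description.
        ulong answer = ProjectEuler0001.SumFactorMultiplesBelowLimit(1000, 3, 5);

        Console.WriteLine(answer); // 233168
    }
}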

Asynchronous Simple Solution

Note, this particular approach is not recommended for the problem as it is laid out in the description but is included to compare results and as a thought experiment. This solution could be viable if the number of factors used increased and there was a mechanism to reduce the amount of duplicated iterative work performed.

Another possible way to structure a solution is to give each factor its own task that calculates that factor's multiples up to the limit. Once every task has completed, the resulting multiples are merged into a single set, which is then summed:

public static async Task<int> SimpleSumFactorMultiplesBelowLimitAsync(int limit, params int[] factors)
{
    if (factors == null
        || !factors.Any())
    {
        throw new System.ArgumentException("Invalid factors.", nameof(factors));
    }

    IList<Task<ICollection<int>>> taskCollection =
        new List<Task<ICollection<int>>>();

    foreach (int factor in factors)
    {
        taskCollection.Add(GetFactorMultiplesBelowLimitAsync(limit, factor));
    }

    await Task.WhenAll(taskCollection);

    ICollection<int> factorMultiples =
        new HashSet<int>(await taskCollection.First());

    for (int i = 1; i < taskCollection.Count; i++)
    {
        ICollection<int> factorMultiplesResults = await taskCollection[i];
        foreach (int factorMultiple in factorMultiplesResults)
        {
            factorMultiples.Add(factorMultiple);
        }
    }

    return factorMultiples.Sum();
}

The iterative work for this solution was extracted to a helper method to parallelize it:

public static async Task<ICollection<int>> GetFactorMultiplesBelowLimitAsync(int limit, int factor)
{
    ICollection<int> factorMultiples = new HashSet<int>();

    for (int i = 0; i < limit; i++)
    {
        if (i % factor == 0)
        {
            factorMultiples.Add(i);
        }
    }

    return factorMultiples;
}

This is not the prettiest solution by any stretch and yields slightly worse results than the Simple Solution:

Minimum Elapsed: 00:00:00.0000317 Average Elapsed: 00:00:00.0002025 Maximum Elapsed: 00:00:00.0017206

The results are not surprising because of how the work is performed. There are two or more loops iterating over all of the numbers - duplicating the number of operations and comparisons that must be done.

One possible improvement for this solution could be to have a single shared collection of possible numbers that would be updated to reduce the number of iterations that are performed by each thread instead of joining the results after they have all been completed. This could also introduce a race condition so a thread-safe data structure is recommended if this approach is taken.
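
A rough sketch of that first improvement, assuming a thread-safe set shared across the tasks (here a ConcurrentDictionary used as a set); the method name and the step-by-factor loop are my own, not the post's code:

using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class SharedCollectionSketch
{
    public static async Task<int> SumFactorMultiplesBelowLimitSharedAsync(int limit, params int[] factors)
    {
        // Thread-safe set shared by every task; the byte value is ignored.
        ConcurrentDictionary<int, byte> multiples = new ConcurrentDictionary<int, byte>();

        Task[] tasks = factors
            .Select(factor => Task.Run(() =>
            {
                // Stepping by the factor also avoids the modulo check entirely.
                for (int i = factor; i < limit; i += factor)
                {
                    multiples.TryAdd(i, 0);
                }
            }))
            .ToArray();

        await Task.WhenAll(tasks);

        return multiples.Keys.Sum();
    }
}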

Another possible improvement would be to start with the largest factor from the available factors so that the initial set of numbers is the smallest starting point that it could be to iterate over.

As stated before, this is not a recommended solution when the number of factors is small, since the iterative work is duplicated. The only redeeming quality of this approach is that the work is done on multiple threads, so if two threads are available at the same time the elapsed time may match the Synchronous Simple Solution. This can be seen in the Minimum Elapsed Time measurement being comparable to the Average Elapsed Time of the previous results.

Simple LINQ Solution

The Simple Solution can be converted into a more fluent LINQ syntax - at the cost of some performance:

public static ulong SimpleLinqSumFactorMultiplesBelowLimit(int limit, params int[] factors)
{
    if (factors == null
        || !factors.Any())
    {
        throw new System.ArgumentException("Invalid factors.", nameof(factors));
    }

    return (ulong)Enumerable.Range(1, limit - 1)
        .Where(number => factors.Any(factor => number % factor == 0))
        .Sum();
}

The type cast is necessary to convert the Sum() operation to the return type. This could cause a little performance degradation but was not significant in the measurements.

This solution is a little easier to read and possibly to understand since it reads like a sentence.

The results of this solution are:

Minimum Elapsed: 00:00:00.0000574 Average Elapsed: 00:00:00.0001152 Maximum Elapsed: 00:00:00.0006285

As expected, this is slower than the simple solution but still fast. The tradeoff could be worth it for the improved readability.

Surprisingly, this solution is slower than the asynchronous solution in some scenarios - seen by comparing the Minimum Elapsed Time of the two results.

Algorithmic Solution

The simple solution satisfies the criteria to generate an answer, but the performance can be improved by looking for an algorithmic solution instead of a brute-force solution. Conveniently, the problem is asking for a solution to a Finite Arithmetic Progression, specifically an Arithmetic Series which has an algorithmic solution.

… a sequence of numbers such that the difference between the consecutive terms is constant. … The sum of the members of a finite arithmetic progression is called an arithmetic series. … [The] sum can be found quickly by taking the number n of terms being added, multiplying by the sum of the first and last number in the progression, and dividing by 2:

The equation for solving this kind of problem is:

$$ \frac{n(a_1 + a_n)}{2} $$

In this equation:

  • $$ n $$ is the number of terms being added.
  • $$ a_1 $$ is the initial term.
  • $$ a_n $$ is the last number in the progression.

Using the value 3 from the problem description’s example in this manner yields the following progression:

$$ 3 + 6 + 9 $$

The $$ n $$ in the algorithm can be solved by taking the limit and dividing it by the starting term in the progression and discarding any remainder.

When substituting values into the equation it becomes the following:

$$ \frac{3(3 + 9)}{2} = \frac{3(12)}{2} = \frac{36}{2} = 18 $$

With this algorithm, the sum for each specified factor can be calculated. Keep in mind that the multiples shared by the factors must then be subtracted. In the problem description this would be the multiples of $$ 5 * 3 = 15 $$, provided the limit is larger than that product (15 in this case).
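
As a worked check against the actual problem input (factors 3 and 5, limit 1000), applying the series formula to each factor and subtracting the shared multiples of 15:

$$ \frac{333(3 + 999)}{2} + \frac{199(5 + 995)}{2} - \frac{66(15 + 990)}{2} = 166833 + 99500 - 33165 = 233168 $$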

Synchronous Algorithmic Solution

A synchronous solution for this problem with only two factors could look something like this:

public static int AlgorithmicSumFactorMultiplesBelowLimit(int limit, params int[] factors)
{
    if (factors == null
        || !factors.Any())
    {
        throw new System.ArgumentException("Invalid factors.", nameof(factors));
    }

    int sum = 0;
    ICollection<int> factorLookup = new HashSet<int>(factors);
    ICollection<int> multiples = new HashSet<int>();

    for (int i = 0; i < factors.Length; i++)
    {
        int factor = factors[i];
        sum += AlgorithmicSumFactorBelowLimit(limit, factor);

        for (int j = i + 1; j < factors.Length; j++)
        {
            int multiple = factor * factors[j];

            if (!factorLookup.Contains(multiple)
                && limit > multiple
                && !multiples.Contains(multiple))
            {
                multiples.Add(multiple);
                sum -= AlgorithmicSumFactorBelowLimit(limit, multiple);
            }
        }
    }

    return sum;
}

The AlgorithmicSumFactorBelowLimit method looks like this:

private static int AlgorithmicSumFactorBelowLimit(int limit, int factor)
{
    int n = (limit - 1) / factor;
    int a1 = factor;
    int an = n * a1;

    return n * (a1 + an) / 2;
}

One is subtracted from the limit when calculating $$ n $$ so that a factor that evenly divides the limit does not produce an off-by-one error - the problem asks for multiples below the limit, not up to and including it. For example, with a limit of 10 and a factor of 5, $$ n = (10 - 1) / 5 = 1 $$, so only 5 is counted and 10 itself is excluded.

The performance of this solution is:

Minimum Elapsed: 00:00:00.0000007 Average Elapsed: 00:00:00.0000011 Maximum Elapsed: 00:00:00.0000045

Initially, I had the algorithmic method asynchronous to share code but wanted to ensure there was not any skewing of results that may have occurred from .GetAwaiter().GetResult(). Spoiler alert, the results were approximately the same in both - meaning there probably would not have been any perceptible difference in the results.

Asynchronous Algorithmic Solution

public static async Task<int> AlgorithmicSumFactorMultiplesBelowLimitAsync(int limit, params int[] factors)
{
    if (factors == null
        || !factors.Any())
    {
        throw new System.ArgumentException("Invalid factors.", nameof(factors));
    }

    int sum = 0;
    ICollection<int> factorLookup = new HashSet<int>(factors);
    ICollection<int> multiples = new HashSet<int>();

    for (int i = 0; i < factors.Length; i++)
    {
        int factor = factors[i];
        sum += await AlgorithmicSumFactorBelowLimitAsync(limit, factor);

        for (int j = i + 1; j < factors.Length; j++)
        {
            int multiple = factor * factors[j];

            if (!factorLookup.Contains(multiple)
                && limit > multiple
                && !multiples.Contains(multiple))
            {
                multiples.Add(multiple);
                sum -= await AlgorithmicSumFactorBelowLimitAsync(limit, multiple);
            }
        }
    }

    return sum;
}

And updating the algorithm to be asynchronous as well:

private static async Task<int> AlgorithmicSumFactorBelowLimitAsync(int limit, int factor)
{
    int n = (limit - 1) / factor;
    int a1 = factor;
    int an = n * a1;

    return n * (a1 + an) / 2;
}

Measuring the performance of this solution generated:

Minimum Elapsed: 00:00:00.0000006 Average Elapsed: 00:00:00.0000010 Maximum Elapsed: 00:00:00.0000022

In this case, the asynchronous solution impacts the performance results positively because each thread can contribute to solving the problem without duplicating any of the work.

Because each thread can do work in isolation, this solution will scale well even as the number of factors increases - as long as there are threads available to do the processing work.

DryIoc - Dependency Injection via Reflection

Introduction

My work inherited an ASP.NET WebApi Project from a contracted company. One of the first things added to the project was a Dependency Injection framework. DryIoc was selected for its speed in resolving dependencies.

In this post, I show why and how reflection was utilized to improve the Dependency Injection Registration code.

Architecture

First, let me provide a high-level overview of the architecture (using sample class names). The project is layered in the following way:

Controllers

Controllers are the entry-point for the code (just like all WebApi applications) and are structured something like this:

[RoutePrefix("posts")]
public class BlogPostController : ProjectControllerBase
{
    private readonly IBlogPostManager blogPostManager;

    public BlogPostController(ILogger logger, IBlogPostManager blogPostManager)
        : base(logger)
    {
        this.blogPostManager = blogPostManager;
    }

    [HttpGet]
    [Route("{blogPostId}")]
    [ResponseType(typeof(BlogPost))]
    public async Task<IHttpActionResult> GetBlogPost(int blogPostId)
    {
        // Logging logic to see what was provided to the Controller method.

        if (blogPostId <= default(int))
        {
            return this.BadRequest("Invalid Blog Post Id.");
        }

        BlogPost blogPost = await this.blogPostManager.GetBlogPost(blogPostId);

        if (blogPost == null)
        {
            return this.NotFound();
        }

        return this.Ok(blogPost);
    }

    // ... Additional Service Methods
}

Controllers are responsible for ensuring sane input values are provided before passing the parameters through to the Manager layer. In doing so, the business logic is pushed downwards allowing for more re-use.

The ProjectControllerBase provides instrumentation logging and acts as a catch-all for any errors that may occur during execution:

public abstract class ProjectControllerBase : ApiController
{
    public ProjectControllerBase(ILogger logger)
    {
        this.Logger = logger;
    }

    protected ILogger Logger { get; }

    public override async Task<HttpResponseMessage> ExecuteAsync(
        HttpControllerContext controllerContext,
        CancellationToken cancellationToken)
    {
        HttpRequestMessage requestMessage = controllerContext.Request;

        // Logging request details

        Stopwatch stopwatch = new Stopwatch();
        stopwatch.Start();

        try
        {
            return await base.ExecuteAsync(controllerContext, cancellationToken);
        }
        catch (ProjectException projectException)
        {
            // Contracting developers decided to throw exceptions for control-flow - this handles this case.

            return requestMessage.CreateErrorResponse(HttpStatusCode.InternalServerError, projectException.Message);
        }
        catch (Exception exception)
        {
            // Log the exception.

            return requestMessage.CreateErrorResponse(HttpStatusCode.InternalServerError, exception.Message);
        }
        finally
        {
            stopwatch.Stop();

            // Log the time.
        }
    }
}

My goal is to refactor this at some point to remove the ILogger dependency.

Managers

Managers perform more refined validation and contain the business logic for the application. This allows for the business logic to be referenced by a variety of front-end applications (website, API, desktop application) easily.

public class BlogPostManager : IBlogPostManager
{
    public readonly IBlogPostRepository blogPostRepository;

    public BlogPostManager(IBlogPostRepository blogPostRepository)
    {
        this.blogPostRepository = blogPostRepository;
    }

    public async Task<BlogPost> GetBlogPost(int blogPostId)
    {
        // Logging logic to see what was provided to the Manager method.

        // Any additional validation logic can be performed here - such as ensuring the blog post exists.

        BlogPostEntity blogPostEntity = await this.blogPostRepository.GetBlogPost(blogPostId);

        // Any additional validation logic can be performed here - such as ensuring the blog post is not in Draft status.

        if (blogPostEntity == null)
        {
            // Throw an exception (NullReference), or return null.
            return null;
        }

        return new BlogPost()
        {
            // Mapping from BlogPostEntity to BlogPost model.
        };
    }

    // ... Additional functionality
}

Each Manager is coded to an interface.

public interface IBlogPostManager
{
    Task<BlogPost> GetBlogPost(int blogPostId);

    // ... Additional functionality
}

By doing this, the Liskov Substitution Principle can be applied, allowing for flexible and isolated unit tests.
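
As an illustration of those isolated tests, here is a minimal sketch of how the manager could be exercised against a mocked repository. The use of xUnit and Moq is an assumption on my part; the post does not say which test or mocking frameworks the project uses:

using System.Threading.Tasks;
using Moq;
using Xunit;

public class BlogPostManagerTests
{
    [Fact]
    public async Task GetBlogPost_ReturnsNull_WhenEntityDoesNotExist()
    {
        // The repository is replaced by a mock, so no database is needed for the test.
        Mock<IBlogPostRepository> blogPostRepository = new Mock<IBlogPostRepository>();
        blogPostRepository
            .Setup(repository => repository.GetBlogPost(It.IsAny<int>()))
            .ReturnsAsync((BlogPostEntity)null);

        BlogPostManager blogPostManager = new BlogPostManager(blogPostRepository.Object);

        BlogPost blogPost = await blogPostManager.GetBlogPost(1);

        Assert.Null(blogPost);
    }
}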

Repositories

Repositories act as the data access component for the project.

Initially, Entity Framework was used exclusively for data access. However, for read operations, Entity Framework is being phased out in favor of Dapper due to performance issues.

Entity Framework automatically uses an ORDER BY clause to ensure results are grouped. In some cases, this caused queries to time out. Often, this is a sign that the data model needs to be improved and/or that the SQL queries were too large (joining too many tables).

Additionally, our Database Administrators wanted read operations to use WITH (NOLOCK).

To the best of our knowledge, a QueryInterceptor would need to be used. This seemed to be counter-intuitive and our aggressive timeline would not allow for any time to tweak and experiment with the Entity Framework code.

For insert operations, Entity Framework is preferred.

public class BlogPostRepository : IBlogPostRepository
{
    private readonly BlogEntities blogEntities;
    private readonly string databaseConnectionString;

    public BlogPostRepository(BlogEntities blogEntities)
    {
        this.blogEntities = blogEntities;
        this.databaseConnectionString = blogEntities.Database.ConnectionString;
    }

    public async Task<BlogPostEntity> GetBlogPost(int blogPostId)
    {
        // Logging logic to see what was provided to the Repository method.

        DynamicParameters sqlParameters = new DynamicParameters();
        sqlParameters.Add(nameof(blogPostId), blogPostId);

        StringBuilder sqlBuilder = new StringBuilder()
            .AppendFormat(
                @"SELECT
                    * -- Wildcard would not be used in actual code.
                FROM blog_posts WITH (NOLOCK)
                WHERE
                    blog_posts.blog_post_id = @{0}", nameof(blogPostId));

        using (SqlConnection sqlConnection = new SqlConnection(this.databaseConnectionString))
        {
            await sqlConnection.OpenAsync();

            // Logging logic to time the query.
            BlogPostEntity blogPostEntity =
                await sqlConnection.QueryFirstOrDefaultAsync<BlogPostEntity>(
                    sqlBuilder.ToString(),
                    sqlParameters);

            return blogPostEntity;
        }
    }
}

Each Repository is coded to an interface.

public interface IBlogPostRepository
{
    Task<BlogPostEntity> GetBlogPost(int blogPostId);

    // ... Additional functionality
}

By doing this, the Liskov Substitution Principle can be applied, allowing for flexible and isolated unit tests.

DryIoc

DryIoc is fast, small, full-featured IoC Container for .NET

Registration

The Dependency Injection framework is registered during application start-up with OWIN:

public class StartUp
{
    public void Configuration(IAppBuilder appBuilder)
    {
        HttpConfiguration httpConfiguration = GlobalConfiguration.Configuration;

        // ... Additional Set Up Configuration

        DependencyInjectionConfiguration.Register(httpConfiguration);

        // ... Additional Set Up Configuration

        httpConfiguration.EnsureInitialized();

        // ... Additional Start Up Configuration
    }
}

The DependencyInjectionConfiguration class registers the container for the application to resolve dependencies using the following code:

public static class DependencyInjectionConfiguration
{
    public static void Register(HttpConfiguration httpConfiguration)
    {
        IContainer container = new Container().WithWebApi(httpConfiguration);

        // ... Additional Registrations

        DependencyInjectionConfiguration.RegisterEntityFrameworkContexts(container);

        // ... Additional Registrations

        DependencyInjectionConfiguration.RegisterManagers(container);
        DependencyInjectionConfiguration.RegisterRepositories(container);

        container.VerifyResolutions();
    }

    private static T CreateDbContext<T>()
        where T : DbContext, new()
    {
        T context = new T();

        context.Configuration.LazyLoadingEnabled = false;

        // ... Set Up Database Logging: context.Database.Log = a => <logging mechanism>;

        return context;
    }

    private static void RegisterEntityFrameworkContexts(IContainer container)
    {
        container.Register<BlogEntities>(Reuse.InWebRequest, Made.Of(() => CreateDbContext<BlogEntities>()));
    }

    private static void RegisterManagers(IContainer container)
    {
        // ... Additional Managers

        container.Register<IBlogPostManager, BlogPostManager>(Reuse.InWebRequest);

        // ... Additional Managers
    }

    private static void RegisterRepositories(IContainer container)
    {
        // ... Additional Repositories

        container.Register<IBlogPostRepository, BlogPostRepository>(Reuse.InWebRequest);

        // ... Additional Repositories
    }
}

Problems with this would occasionally arise when a developer introduced new Manager or Repository classes but did not remember to register instances of those classes with the Dependency Injection container. When this occurred, the compilation and deployment would succeed; but the following runtime error would be thrown when the required dependencies could not be resolved:

An error occurred when trying to create a controller of type ‘BlogPostController’. Make sure that the controller has a parameterless public constructor.

The generated error message does not help identify the underlying issue.

To prevent this from occurring, all Manager and Repository classes would need to automatically register themselves during start-up.

Reflection

To automatically register classes, reflection can be utilized to iterate over the assembly types and register all Manager and Repository implementations. Initially, this was done by loading the assembly containing the types directly from the disk:

public static class DependencyInjectionConfiguration
{
    public static void Register(HttpConfiguration httpConfiguration)
    {
        IContainer container = new Container().WithWebApi(httpConfiguration);

        // ... Additional Registrations

        DependencyInjectionConfiguration.RegisterEntityFrameworkContexts(container);

        // ... Additional Registrations

        DependencyInjectionConfiguration.RegisterDependencies(container);

        // ... Additional Registrations

        container.VerifyResolutions();
    }

    // ... Additional Functions

    private static bool IsInterfaceOrAbstractClass(Type exportedType)
    {
        return exportedType.IsInterface
            || exportedType.IsAbstract;
    }

    private static bool IsNotManager(Type exportedType)
    {
        return !exportedType.Name.EndsWith("Manager", StringComparison.InvariantCultureIgnoreCase);
    }

    private static bool IsNotRepository(Type exportedType)
    {
        return !exportedType.Name.EndsWith("Repository", StringComparison.InvariantCultureIgnoreCase);
    }

    // ... Additional Functions

    private static void RegisterDependencies(IContainer container)
    {
        string assemblyPath = HttpContext.Current.Server.MapPath("~/bin/Dependencies.dll");
        Assembly dependencyAssembly = Assembly.LoadFrom(assemblyPath);

        foreach (Type exportedType in dependencyAssembly.GetExportedTypes())
        {
            // Skip registering items that are an interface or abstract class since it is
            // not known if there is an implementation defined in this assembly.
            if (DependencyInjectionConfiguration.IsInterfaceOrAbstractClass(exportedType))
            {
                continue;
            }

            // Skip registering items that are not a Manager, or Repository.
            if (DependencyInjectionConfiguration.IsNotManager(exportedType)
                && DependencyInjectionConfiguration.IsNotRepository(exportedType))
            {
                continue;
            }

            string interfaceName = $"I{exportedType.Name}";
            Type[] interfaceTypes = exportedType.GetInterfaces();

            Type serviceType =
                interfaceTypes.FirstOrDefault(
                    interfaceType =>
                        interfaceType.Name.Equals(interfaceName, StringComparison.InvariantCultureIgnoreCase))
                ?? exportedType;

            container.Register(
                serviceType,
                exportedType,
                Reuse.InWebRequest,
                ifAlreadyRegistered: IfAlreadyRegistered.Keep);
        }
    }

    // ... Additional Functions
}

While this works, it felt wrong to load the assembly from disk using a hard-coded path, especially when the assembly will be loaded by the framework automatically. To account for this, the code was modified in the following manner:

public static class DependencyInjectionConfiguration
{
    public static void Register(HttpConfiguration httpConfiguration)
    {
        IContainer container = new Container().WithWebApi(httpConfiguration);

        // ... Additional Registrations

        DependencyInjectionConfiguration.RegisterEntityFrameworkContexts(container);

        // ... Additional Registrations

        DependencyInjectionConfiguration.RegisterDependencies(container);

        // ... Additional Registrations

        container.VerifyResolutions();
    }

    // ... Additional Functions

    private static bool IsInterfaceOrAbstractClass(Type exportedType)
    {
        return exportedType.IsInterface
            || exportedType.IsAbstract;
    }

    private static bool IsNotManager(Type exportedType)
    {
        return !exportedType.Name.EndsWith("Manager", StringComparison.InvariantCultureIgnoreCase);
    }

    private static bool IsNotRepository(Type exportedType)
    {
        return !exportedType.Name.EndsWith("Repository", StringComparison.InvariantCultureIgnoreCase);
    }

    // ... Additional Functions

    private static void RegisterDependencies(IContainer container)
    {
        AssemblyName dependencyAssemblyName = Assembly.GetExecutingAssembly()
            .GetReferencedAssemblies()
            .FirstOrDefault(referencedAssembly => referencedAssembly.Name.Equals("Dependencies"));
        Assembly dependencyAssembly = Assembly.Load(dependencyAssemblyName);

        foreach (Type exportedType in dependencyAssembly.GetExportedTypes())
        {
            // Skip registering items that are an interface or abstract class since it is
            // not known if there is an implementation defined in this assembly.
            if (DependencyInjectionConfiguration.IsInterfaceOrAbstractClass(exportedType))
            {
                continue;
            }

            // Skip registering items that are not a Manager, or Repository.
            if (DependencyInjectionConfiguration.IsNotManager(exportedType)
                && DependencyInjectionConfiguration.IsNotRepository(exportedType))
            {
                continue;
            }

            string interfaceName = $"I{exportedType.Name}";
            Type[] interfaceTypes = exportedType.GetInterfaces();

            Type serviceType =
                interfaceTypes.FirstOrDefault(
                    interfaceType =>
                        interfaceType.Name.Equals(interfaceName, StringComparison.InvariantCultureIgnoreCase))
                ?? exportedType;

            container.Register(
                serviceType,
                exportedType,
                Reuse.InWebRequest,
                ifAlreadyRegistered: IfAlreadyRegistered.Keep);
        }
    }

    // ... Additional Functions
}

Unfortunately, there are no timing metrics available for measuring if there are any performance improvements for either implementation. With that said, the second implementation seems faster. This may be because the assembly is already loaded due to other registrations that occur before the reflection registration code is executed. For this reason, results may vary from project to project.

Overall, the solution works well and has limited the runtime error to appearing only when a new Entity Framework context is added to the project.

Terraria Corrupt Save Recovery

Introduction

One of my brothers gifted Terraria to me on Steam. Needless to say, instead of doing the things that I ought to be doing; I have been instead playing it far more frequently. At least until my world save became corrupted.

While I was playing, Terraria suddenly crashed. I figure it was in the middle of a save/backup because when I tried to load the game back up Terraria informed me that my save was corrupted (I am not sure what caused it either). I shrugged it off thinking “No big deal, I will just use the backup file.” As it turned out, the backup was not as recent as I would have liked.

I had just gotten through a particularly nasty section of sand and had no desire to repeat it. Logically, the next thing to try was opening the world save in TEdit to see if anything was salvageable. To my surprise, TEdit informed me that it could try to recover the file. Cool! After it loaded the map, nothing seemed out of place, so I saved it and went on with my business of exploring.

It was not until I had an inventory full of items that I realized what TEdit had not been able to recover - chest data. All the chests in my player home were now empty. Curious, I checked other chests using TEdit. All of the chests I checked were empty too! I would later find out that about 100 of the chests were now empty.

That would not do; that was about a third of the chests in my world. Half the fun of exploring and finding a chest is the goodies you get inside it. The only option I could think of at the time was to create a new world, maybe even using the same world seed to get the same map. However, I was not sure (and still am not sure) whether chest contents are randomly generated, which could mean that even using the same world seed would not produce the same items. To top it off, starting over would have been more work than simply going back through the section of sand I was trying to avoid. So the only real option was to see if I could repair the world save to get the chest items back.

Fortunately, I had been backing my save files up to Google Drive. Without these backups, there would have been no way to restore the chest data. That does not mean it was a breeze though; there were a few hiccups along the way, most of them involving the sheer amount of data included in the world save. In the end, I was able to restore my game save to a state that I consider close enough to where it was. I am probably still missing a few items, but nothing I have noticed, so I do not feel like I lost anything.

Researching the Terraria Source Code

The first step of the process required parsing the world save data. I had to find the chest data in the good backup save to get the list of items I was missing. Luckily, the Terraria client is fairly easy to decompile (NOTE: the decompiled source code will not run without modifications). There could be legal implications to decompiling the Terraria client - I do not know; I did not read the End User License Agreement. Instead, I used a repository of dotPeek output that someone else had already posted. Using the decompiled source code, I could replicate how the game client reads the world save data and compare the chests from two of my world saves.

The code I was looking for is located in Terraria/IO/WorldFile.cs and begins in loadWorld. loadWorld does some date checking for special events, checks if the file exists, and then reads a value that determines how to parse the rest of the data. Depending on this value, the code is directed to either WorldFile.LoadWorld_Version1 or WorldFile.LoadWorld_Version2. Since I knew my world file was fairly recent, I immediately continued to WorldFile.LoadWorld_Version2.

WorldFile.LoadWorld_Version2

public static int LoadWorld_Version2(BinaryReader reader)
{
reader.BaseStream.Position = 0L;
bool[] importance;
int[] positions;
if (!WorldFile
.LoadFileFormatHeader(reader, out importance, out positions)
|| reader.BaseStream.Position != (long) positions[0])
return 5;
WorldFile.LoadHeader(reader);
if (reader.BaseStream.Position != (long) positions[1])
return 5;
WorldFile.LoadWorldTiles(reader, importance);
if (reader.BaseStream.Position != (long) positions[2])
return 5;
WorldFile.LoadChests(reader);
if (reader.BaseStream.Position != (long) positions[3])
return 5;
WorldFile.LoadSigns(reader);
if (reader.BaseStream.Position != (long) positions[4])
return 5;
WorldFile.LoadNPCs(reader);
if (reader.BaseStream.Position != (long) positions[5])
return 5;
if (WorldFile.versionNumber >= 116)
{
if (WorldFile.versionNumber < 122)
{
WorldFile.LoadDummies(reader);
if (reader.BaseStream.Position != (long) positions[6])
return 5;
}
else
{
WorldFile.LoadTileEntities(reader);
if (reader.BaseStream.Position != (long) positions[6])
return 5;
}
}
if (WorldFile.versionNumber >= 170)
{
WorldFile.LoadWeightedPressurePlates(reader);
if (reader.BaseStream.Position != (long) positions[7])
return 5;
}
if (WorldFile.versionNumber >= 189)
{
WorldFile.LoadTownManager(reader);
if (reader.BaseStream.Position != (long) positions[8])
return 5;
}
return WorldFile.LoadFooter(reader);
}

The WorldFile.LoadWorld_Version2 function provided me with a layout of the different sections in a world save. The data appeared to be broken out into the following sections, which are read sequentially from the world save:

  • File Format Header
  • Header
  • World Tiles
  • Chests
  • Signs
  • NPCs
  • Tile Entities
  • Weighted Pressure Plates
  • Town Manager
  • Footer

Wow! That is more data than I thought there would be. It appears as though after every section a check is done to verify that the current position in the data file matches a position that is read from the WorldFile.LoadFileFormatHeader. I should be able to use the position at the corresponding index to jump directly to the Chest data section.
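
In code, that jump is a single seek. A minimal sketch reusing the decompiled names; positions comes from LoadFileFormatHeader, and positions[2] is the offset checked right before LoadChests above:

// Jump straight to the beginning of the chest section.
reader.BaseStream.Position = (long) positions[2];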

WorldFile.LoadFileFormatHeader

private static bool LoadFileFormatHeader(
BinaryReader reader,
out bool[] importance,
out int[] positions)
{
importance = (bool[]) null;
positions = (int[]) null;
if ((WorldFile.versionNumber = reader.ReadInt32()) >= 135)
{
try
{
Main.WorldFileMetadata =
FileMetadata.Read(reader, FileType.World);
}
catch (FileFormatException ex)
{
Console.WriteLine(
Language.GetTextValue("Error.UnableToLoadWorld"));
Console.WriteLine((object) ex);
return false;
}
}
else
Main.WorldFileMetadata =
FileMetadata.FromCurrentSettings(FileType.World);
short num1 = reader.ReadInt16();
positions = new int[(int) num1];
for (int index = 0; index < (int) num1; ++index)
positions[index] = reader.ReadInt32();
short num2 = reader.ReadInt16();
importance = new bool[(int) num2];
byte num3 = 0;
byte num4 = 128;
for (int index = 0; index < (int) num2; ++index)
{
if ((int) num4 == 128)
{
num3 = reader.ReadByte();
num4 = (byte) 1;
}
else
num4 <<= 1;
if (((int) num3 & (int) num4) == (int) num4)
importance[index] = true;
}
return true;
}

WorldFile.LoadFileFormatHeader is responsible for reading the WorldFileMetadata, the section positions, and an array of flags known as ‘importance’. Each section position is stored as a single integer offset. I am familiar with this kind of data layout because network packet headers (the IPv6 header, for instance) do something similar:

IPv6 Header

Using the section list from the WorldFile.LoadWorld_Version2 and the position data from the WorldFile.LoadFileFormatHeader, I could read the Chest data immediately by jumping to that position in the save file. The next question was to determine how the Chest data was stored.

WorldFile.LoadChests

private static void LoadChests(BinaryReader reader)
{
int num1 = (int) reader.ReadInt16();
int num2 = (int) reader.ReadInt16();
int num3;
int num4;
if (num2 < 40)
{
num3 = num2;
num4 = 0;
}
else
{
num3 = 40;
num4 = num2 - 40;
}
int index1;
for (index1 = 0; index1 < num1; ++index1)
{
Chest chest = new Chest(false);
chest.x = reader.ReadInt32();
chest.y = reader.ReadInt32();
chest.name = reader.ReadString();
for (int index2 = 0; index2 < num3; ++index2)
{
short num5 = reader.ReadInt16();
Item obj = new Item();
if ((int) num5 > 0)
{
obj.netDefaults(reader.ReadInt32());
obj.stack = (int) num5;
obj.Prefix((int) reader.ReadByte());
}
else if ((int) num5 < 0)
{
obj.netDefaults(reader.ReadInt32());
obj.Prefix((int) reader.ReadByte());
obj.stack = 1;
}
chest.item[index2] = obj;
}
for (int index2 = 0; index2 < num4; ++index2)
{
if ((int) reader.ReadInt16() > 0)
{
reader.ReadInt32();
int num5 = (int) reader.ReadByte();
}
}
Main.chest[index1] = chest;
}
List<Point16> point16List = new List<Point16>();
for (int index2 = 0; index2 < index1; ++index2)
{
if (Main.chest[index2] != null)
{
Point16 point16 =
new Point16(
Main.chest[index2].x, Main.chest[index2].y);
if (point16List.Contains(point16))
Main.chest[index2] = (Chest) null;
else
point16List.Add(point16);
}
}
for (; index1 < 1000; ++index1)
Main.chest[index1] = (Chest) null;
if (WorldFile.versionNumber >= 115)
return;
WorldFile.FixDresserChests();
}

Not quite as straightforward to decipher, but still doable:

  • num1 represents the number of chests stored in the world.
  • num2 represents the maximum number of item slots per chest recorded in the file.
  • num3 represents the number of item slots to read for each chest (capped at 40).
  • num4 represents the overflow slots beyond 40 that are read and discarded.

Each chest then has the following properties:

  • x represents the x-coordinate of the chest in the world.
  • y represents the y-coordinate of the chest in the world.
  • name represents the name of the chest in the world.

I found it odd that the Chest data did not have an Id property that could be used to identify specific chests. Though the x and y properties are probably sufficient in most cases, it just meant that I would have to be more careful about identifying chests.

A chest can hold anywhere from 0 to 40 items, and each stored item has the following properties:

  • id represents the item id.
  • stack represents the quantity of that item in the single item slot.
  • prefix represents a prefix value that affects the stats on the item.

The item layout deviates a little from the pattern used elsewhere. Instead of an Int16 holding the number of items in a chest, every slot is read: the first Int16 of a slot is its stack size, and a zero means the slot is empty and stores nothing else, which the reading code must account for. This is necessary because items can sit in any of the 40 slots. Storing the data this way keeps the world save small: an empty slot costs only 2 bytes (the zero stack size), while a filled slot costs 7 bytes (an Int16 stack size, an Int32 id, and a prefix byte).

After reviewing WorldFile.LoadChests, I had enough information to parse the chest data.

Applying the Research

In an effort NOT to confuse myself, I tried to break the sections out a little further into methods. The code was written using LINQPad to rapidly develop the prototype for reading the data. Main is the entry point for the ‘script’. The final code can be found in Appendix A.

void Main()
{
    World GoodWorld = GetWorld(@"201708310826.wld");
}

World GetWorld(string worldPath)
{
    using (BinaryReader worldReader =
        new BinaryReader(File.OpenRead(worldPath)))
    {
        return GetWorld(worldReader);
    }
}

GetWorld takes a string file path where the world file is located. This would allow me to add a single line to get the data from BadWorld once I was satisfied that the current implementation worked as intended. Once the file was opened for reading, the data needed to be read:

World GetWorld(BinaryReader worldReader)
{
    World world = new World();

    GetWorldMetaData(worldReader, world);
    GetWorldSections(worldReader, world);

    worldReader.BaseStream.Position = world.SectionSize[2];

    GetChestData(worldReader, world);

    return world;
}

The file MetaData is read to move the BinaryReader to the correct position. This could be done with arithmetic: 2x Int32 (8 bytes) plus 2x Int64 (16 bytes) yields 24 bytes to skip. However, reading the MetaData rather than jumping to an arbitrary offset seemed more readable, especially since there is an unknown value and the structure of this part of the file could change between versions. An exception to this ‘rule’ is made once the section data has been read, since at that point it is more obvious why and where the jump is occurring:

void GetWorldMetaData(BinaryReader worldReader, World world)
{
    world.Version = worldReader.ReadInt32();
    world.TypeCheck = worldReader.ReadInt64();
    world.Revision = worldReader.ReadInt32();
    // Unknown
    world.UnknownMetaData = worldReader.ReadInt64();
}
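
For reference, the arithmetic alternative dismissed above would have been a single assignment:

// Skip Version (Int32), TypeCheck (Int64), Revision (Int32), and the
// unknown Int64: 4 + 8 + 4 + 8 = 24 bytes.
worldReader.BaseStream.Position = 24;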

Once the MetaData has been read, the Sections can be read:

void GetWorldSections(BinaryReader worldReader, World world)
{
    world.SectionCount = worldReader.ReadInt16();

    for (int section = 0; section < world.SectionCount; section++)
    {
        world.SectionSize[section] = worldReader.ReadInt32();
    }
}

After jumping to the corresponding location in the BinaryReader, the Chest data is read:

void GetChestData(BinaryReader worldReader, World world)
{
    world.TotalChests = worldReader.ReadInt16();
    world.MaxItems = worldReader.ReadInt16();

    int itemsPerChest = world.MaxItems;
    int overflowItems = 0;

    if (world.MaxItems > 40)
    {
        itemsPerChest = 40;
        overflowItems = world.MaxItems - 40;
    }

    for (int i = 0; i < world.TotalChests; i++)
    {
        world.ChestCollection[i] =
            GetChestData(worldReader, itemsPerChest, overflowItems);
    }
}

Chest GetChestData(
    BinaryReader worldReader,
    int itemsPerChest,
    int overflowItems)
{
    Chest chest = new Chest(itemsPerChest)
    {
        X = worldReader.ReadInt32(),
        Y = worldReader.ReadInt32(),
        Name = worldReader.ReadString(),
    };

    for (int i = 0; i < itemsPerChest; i++)
    {
        chest.ItemCollection[i] = GetItemData(worldReader);
    }

    for (int i = 0; i < overflowItems; i++)
    {
        GetItemData(worldReader);
    }

    return chest;
}

Each chest can contain anywhere from zero items up to the maximum number of item slots. For each chest, the item slots are parsed in a separate method:

Item GetItemData(BinaryReader worldReader)
{
    short stackSize = worldReader.ReadInt16();

    if (stackSize <= 0)
    {
        return null;
    }

    return new Item()
    {
        StackSize = stackSize,
        Id = worldReader.ReadInt32(),
        Prefix = worldReader.ReadByte(),
    };
}

This yields the chest data for a world. Add the corresponding lines to parse the BadWorld and verify that there are discrepancies between the two:

void Main()
{
    World GoodWorld = GetWorld(@"201708310826.wld");
    Console.WriteLine(
        "{0} Empty Chests in Good World out of {1}",
        GoodWorld.ChestCollection
            .Where(chest =>
                chest.ItemCollection
                    .All(item =>
                        item == null))
            .Count(),
        GoodWorld.ChestCollection.Length);

    World BadWorld = GetWorld(@"201709010835.wld");
    Console.WriteLine(
        "{0} Empty Chests in Bad World out of {1}",
        BadWorld.ChestCollection
            .Where(chest =>
                chest.ItemCollection
                    .All(item =>
                        item == null))
            .Count(),
        BadWorld.ChestCollection.Length);
}

Which yields the following output:

13 Empty Chests in Good World out of 302
145 Empty Chests in Bad World out of 302

That seems to match what I was seeing, although I was a little surprised there were empty chests in the good world. It turned out that those came from the Hell layer, plus one in my player home that I was not utilizing.

The next step is to compare the chests in the two worlds to see if they can be matched. If they can, a repair is possible.

Comparing Chest Data

Unfortunately, there is no way to avoid iterating two lists in order to compare them. However, the size of the lists can be reduced by only taking the empty chests from the bad world, and reduced even further by excluding chests contained within player housing. I recommend excluding those because they have been touched by the player(s), and I was more comfortable restoring them by hand with TEdit. It probably would not have made any difference though:

void Main()
{
    World GoodWorld = GetWorld(@"201708310826.wld");
    Console.WriteLine(
        "{0} Empty Chests in Good World out of {1}",
        GoodWorld.ChestCollection
            .Where(chest =>
                chest.ItemCollection
                    .All(item =>
                        item == null))
            .Count(),
        GoodWorld.ChestCollection.Length);

    World BadWorld = GetWorld(@"201709010835.wld");
    Console.WriteLine(
        "{0} Empty Chests in Bad World out of {1}",
        BadWorld.ChestCollection
            .Where(chest =>
                chest.ItemCollection
                    .All(item =>
                        item == null))
            .Count(),
        BadWorld.ChestCollection.Length);

    // Merge the empty chests from the bad world (excluding the player housing
    // area) using the matching chests from the good world as the source.
    MergeEmptyChests(
        BadWorld.ChestCollection
            .Where(chest =>
                !(chest.X > 3100
                    && chest.X < 3125
                    && chest.Y > 335
                    && chest.Y < 340)
                && chest.ItemCollection.All(
                    item => item == null))
            .ToArray(),
        GoodWorld.ChestCollection);

    Console.WriteLine();
    Console.WriteLine("After Merge:");
    Console.WriteLine(
        "{0} Empty Chests in Bad World out of {1}",
        BadWorld.ChestCollection
            .Where(chest =>
                chest.ItemCollection
                    .All(item =>
                        item == null))
            .Count(),
        BadWorld.ChestCollection.Length);
}

Then each Chest from the bad world is identified and the items it should contain are added from the matching Chest in the good world.

void MergeEmptyChests(
    Chest[] destinationChests,
    Chest[] sourceChests)
{
    foreach (Chest destinationChest in destinationChests)
    {
        foreach (Chest sourceChest in sourceChests)
        {
            // Only copy items between chests at the exact same location.
            if (destinationChest.X != sourceChest.X
                || destinationChest.Y != sourceChest.Y)
            {
                continue;
            }

            int sourceChestItemLength =
                sourceChest.ItemCollection.Length;

            for (int i = 0; i < sourceChestItemLength; i++)
            {
                destinationChest.ItemCollection[i] =
                    sourceChest.ItemCollection[i];
            }
        }
    }
}

The only way to identify chests in the world is by their location. The Name could possibly be used, except a named chest likely means the player placed it intentionally. As previously stated, I wanted to handle any chests I had placed myself manually, since I needed to compare them with the chests I had put items into after noticing the save file corruption.
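
As an aside, the nested loops could be avoided by indexing the good world's chests by location first. A minimal sketch, not part of the original tool; emptyBadChests is a stand-in for the filtered list built in Main:

// Index the source chests by coordinates so each lookup is O(1).
Dictionary<(int X, int Y), Chest> sourceByLocation =
    GoodWorld.ChestCollection.ToDictionary(chest => (chest.X, chest.Y));

foreach (Chest destinationChest in emptyBadChests)
{
    if (sourceByLocation.TryGetValue(
        (destinationChest.X, destinationChest.Y),
        out Chest sourceChest))
    {
        Array.Copy(
            sourceChest.ItemCollection,
            destinationChest.ItemCollection,
            sourceChest.ItemCollection.Length);
    }
}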

After running the application I get the following output:

13 Empty Chests in Good World out of 302
145 Empty Chests in Bad World out of 302

After Merge:
13 Empty Chests in Bad World out of 302

Perfect: I was able to restore all of the wild chests back to their generated state. The last step in the process was saving the changes.

Saving Chest Data

Everything up to this point had been relatively easy, so it was only a matter of time before I came across an issue. Unfortunately, the issue did not appear until I got to the final step of saving the merged Chest data.

The Issue

Remember how Chest data is read? Or more specifically, how the item data is read?

Chest GetChestData(
    BinaryReader worldReader,
    int itemsPerChest,
    int overflowItems)
{
    Chest chest = new Chest(itemsPerChest)
    {
        X = worldReader.ReadInt32(),
        Y = worldReader.ReadInt32(),
        Name = worldReader.ReadString(),
    };

    for (int i = 0; i < itemsPerChest; i++)
    {
        chest.ItemCollection[i] = GetItemData(worldReader);
    }

    for (int i = 0; i < overflowItems; i++)
    {
        GetItemData(worldReader);
    }

    return chest;
}

Item GetItemData(BinaryReader worldReader)
{
    short stackSize = worldReader.ReadInt16();

    if (stackSize <= 0)
    {
        return null;
    }

    return new Item()
    {
        StackSize = stackSize,
        Id = worldReader.ReadInt32(),
        Prefix = worldReader.ReadByte(),
    };
}

Do you see the issue?

The issue is that the Chest data section has grown after the merge, since the newly added items each need a stack size, an id, and a prefix written for them. This made my plan of simply overwriting the Chest data section in place impossible, because the larger section would overwrite the data in the next section and, consequently, make the stored section positions incorrect. I was able to come up with only two ways to solve this:

  1. Parse Entire WorldFile
  2. Partition WorldFile

Parse Entire WorldFile

The first solution was to add support for parsing the entire WorldFile save into the object being manipulated. This would prevent any data from getting overwritten, but it would still require the section data to be updated, either by mathematically calculating the size of each section from the size of each piece of data it contains, or by updating the section positions after each section is written.

While this is probably a more robust approach in the long run, it would require a lot of effort on my part to build all of the representations for the data in each section and the corresponding code to read it.
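
Had I gone this route, updating the positions could have looked something like the sketch below. WriteSection is a hypothetical per-section writer; WriteWorldSections is the helper shown later that writes the offset table:

// Write placeholder offsets, write every section while recording where each
// one ends, then go back and rewrite the offset table with the real values.
long tableOffset = worldWriter.BaseStream.Position;
WriteWorldSections(worldWriter, world);

for (int i = 0; i < world.SectionCount; i++)
{
    WriteSection(worldWriter, world, i);
    world.SectionSize[i] = (int)worldWriter.BaseStream.Position;
}

worldWriter.BaseStream.Position = tableOffset;
WriteWorldSections(worldWriter, world);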

Partition WorldFile

The second solution segments the file into roughly five partitions: three partitions for the data that is not changing, and two partitions for the data that needs to be updated (the section locations and the Chest data). In this way, all of the data is accounted for without defining a representation for every piece of it, as the previous solution would have required. Keep in mind that this approach is only worthwhile because the changes being made are local to the Chest data section; if they were not, up to 2n+1 partitions (where n is the number of modified sections) would need to be tracked:

Partition Name        Partition Type
MetaData              Original
Section Location      Modified
2x Sections           Original
Chest Data            Modified
Remaining Sections    Original

Solution Implementation

I chose the second solution as it seemed like the easier of the two for what I was trying to accomplish. If this wasn’t a one-off project, I would suggest the first solution. To implement the second solution, all of the data from the file needs to be read. Fortunately, the data that I am not interested in can be read into a byte array. Then during the save process, the original data can be preserved by writing the contents of these byte arrays. The GetWorld function then gets updated to look like this:

World GetWorld(BinaryReader worldReader)
{
    World world = new World();

    GetWorldMetaData(worldReader, world);
    GetWorldSections(worldReader, world);

    int bytesToRead = world.SectionSize[2]
        - (int)worldReader.BaseStream.Position;

    world.SkippedSectionsBeforeChestData =
        worldReader.ReadBytes(bytesToRead);

    GetChestData(worldReader, world);

    bytesToRead = (int)worldReader.BaseStream.Length
        - (int)worldReader.BaseStream.Position;

    world.SkippedSectionsAfterChestData =
        worldReader.ReadBytes(bytesToRead);

    return world;
}

Now that I had all of the file data in a mechanism that prevents data loss, I could write the function to save the world file and call it from Main after the Merge.

void Main()
{
    World GoodWorld = GetWorld(@"201708310826.wld");
    Console.WriteLine(
        "{0} Empty Chests in Good World out of {1}",
        GoodWorld.ChestCollection
            .Where(chest =>
                chest.ItemCollection
                    .All(item =>
                        item == null))
            .Count(),
        GoodWorld.ChestCollection.Length);

    World BadWorld = GetWorld(@"201709010835.wld");
    Console.WriteLine(
        "{0} Empty Chests in Bad World out of {1}",
        BadWorld.ChestCollection
            .Where(chest =>
                chest.ItemCollection
                    .All(item =>
                        item == null))
            .Count(),
        BadWorld.ChestCollection.Length);

    MergeEmptyChests(
        BadWorld.ChestCollection
            .Where(chest =>
                chest.ItemCollection
                    .All(item =>
                        item == null))
            .ToArray(),
        GoodWorld.ChestCollection);

    Console.WriteLine();
    Console.WriteLine("After Merge:");
    Console.WriteLine(
        "{0} Empty Chests in Bad World out of {1}",
        BadWorld.ChestCollection
            .Where(chest =>
                chest.ItemCollection
                    .All(item =>
                        item == null))
            .Count(),
        BadWorld.ChestCollection.Length);

    SaveWorld(BadWorld, @"201709010835-modified.wld");
}

void SaveWorld(World world, string worldFileSave)
{
    using (BinaryWriter binaryWriter = new BinaryWriter(
        File.Create(worldFileSave)))
    {
        SaveWorld(binaryWriter, world);
    }
}

Just like before, this is a utility method around the SaveWorld overload that actually does the work. The worldFileSave parameter is the location to save the data; I recommend making it different from the original file to prevent further data loss!

void SaveWorld(BinaryWriter worldWriter, World world)
{
    WriteWorldMetaData(worldWriter, world);

    long sectionPosition = worldWriter.BaseStream.Position;

    WriteWorldSections(worldWriter, world);

    worldWriter.Write(world.SkippedSectionsBeforeChestData);

    WriteChestData(worldWriter, world);

    long newChestDataSectionPosition =
        worldWriter.BaseStream.Position;

    worldWriter.Write(world.SkippedSectionsAfterChestData);

    int sectionSizeAdjustment =
        (int)newChestDataSectionPosition - world.SectionSize[3];
    for (int i = 3; i < world.SectionCount - 1; i++)
    {
        world.SectionSize[i] += sectionSizeAdjustment;
    }

    worldWriter.BaseStream.Position = sectionPosition;
    WriteWorldSections(worldWriter, world);
}

This method is essentially the inverse of the GetWorld method, with a few extra modifications. sectionPosition records where the section offset table begins so that it can be rewritten at the end, once the new size of the Chest data section is known. newChestDataSectionPosition captures where the Chest data now ends; the difference between it and the original offset is applied to every later section offset. For example, if the chest section originally ended at byte 2,000,000 and now ends at byte 2,000,500, every subsequent offset shifts by 500 bytes. The writer then moves back to sectionPosition and writes the corrected section table. Each write method used here is the inverse of the corresponding get method. One caveat: BinaryWriter chooses how many bytes to emit based on the overload it resolves to, so each value is explicitly cast to the type the file format expects before writing. This could also be addressed by making the properties on the corresponding objects exactly the types needed:

void WriteWorldMetaData(BinaryWriter worldWriter, World world)
{
    worldWriter.Write((Int32)world.Version);
    worldWriter.Write((Int64)world.TypeCheck);
    worldWriter.Write((Int32)world.Revision);
    worldWriter.Write((Int64)world.UnknownMetaData);
}

No scary code here.

void WriteWorldSections(BinaryWriter worldWriter, World world)
{
    worldWriter.Write((Int16)world.SectionCount);

    for (int section = 0; section < world.SectionCount; section++)
    {
        worldWriter.Write((Int32)world.SectionSize[section]);
    }
}

A little more complex, but still nothing to be concerned with.

void WriteChestData(BinaryWriter worldWriter, World world)
{
    worldWriter.Write((Int16)world.TotalChests);
    worldWriter.Write((Int16)world.MaxItems);

    for (int i = 0; i < world.TotalChests; i++)
    {
        WriteChestData(worldWriter, world.ChestCollection[i]);
    }
}

Similar to the WriteWorldSections but with an extra value written.

void WriteChestData(BinaryWriter worldWriter, Chest chest)
{
    worldWriter.Write((Int32)chest.X);
    worldWriter.Write((Int32)chest.Y);
    worldWriter.Write(chest.Name ?? string.Empty);

    for (int i = 0; i < chest.ItemCollection.Length; i++)
    {
        WriteItemData(worldWriter, chest.ItemCollection[i]);
    }
}

The complexity is starting to build. The empty string value may not be necessary, but I did not want to chance a NullReferenceException.

void WriteItemData(BinaryWriter worldWriter, Item item)
{
    if (item == null)
    {
        worldWriter.Write((Int16)0);
        return;
    }

    worldWriter.Write((Int16)item.StackSize);

    if (item.StackSize == 0)
    {
        return;
    }

    worldWriter.Write((Int32)item.Id);
    worldWriter.Write((byte)item.Prefix);
}

The 0 literal needs to be explicitly cast down to Int16 because it is an Int32 by default. At first this was causing my file to be larger than it needed to be, which prevented it from being loaded.
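
To illustrate the overload behavior with a small sketch (the file name is only an example):

// BinaryWriter picks the byte count from the overload it resolves to.
using (BinaryWriter writer = new BinaryWriter(File.Create("overload-demo.bin")))
{
    writer.Write(0);        // resolves to Write(int) and emits 4 bytes
    writer.Write((Int16)0); // resolves to Write(short) and emits 2 bytes
}
// The resulting file is 6 bytes long.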

Results

Once the file has been written to disk, it can be loaded in TEdit or Terraria to see if it can be parsed.

Alternate Solutions

Here are a couple of alternative approaches to the one outlined above.

TEdit

It would have been faster to reference the TEdit executable and reuse the code it already contains to parse the world file data for each save. This would have been like the first solution, except I would be relying on TEdit's implementation instead of rolling my own. The only code that would have needed to be written is MergeEmptyChests.

Terraria

Similar to the previous alternative, it may be possible to reference the Terraria executable and reuse its code to load and save worlds. This would have made the first solution redundant and, in hindsight, is probably what I should have done in the first place. The save functionality probably does not have a validation check like TEdit does, though.

Summary

In the end, was it worth it?

Probably not. Honestly, it would have been faster to go through the sand pit again than to research and code a solution like this. Granted, at the time I had no idea I had lost anything. At least I was able to recover from my mistake and learn something in the process.

Moral of the story? Back up your data.

Appendix A

void Main()
{
World GoodWorld = GetWorld(@"201708310826.wld");
Console.WriteLine(
"{0} Empty Chests in Good World out of {1}",
GoodWorld.ChestCollection
.Where(chest =>
chest.ItemCollection
.All(item =>
item == null))
.Count(),
GoodWorld.ChestCollection.Length);

World BadWorld = GetWorld(@"201709010835.wld");
Console.WriteLine(
"{0} Empty Chests in Bad World out of {1}",
BadWorld.ChestCollection
.Where(chest =>
chest.ItemCollection
.All(item =>
item == null))
.Count(),
BadWorld.ChestCollection.Length);

MergeEmptyChests(
BadWorld.ChestCollection
.Where(chest =>
chest.ItemCollection
.All(item =>
item == null))
.ToArray(),
GoodWorld.ChestCollection);

Console.WriteLine();
Console.WriteLine("After Merge:");
Console.WriteLine(
"{0} Empty Chests in Bad World out of {1}",
BadWorld.ChestCollection
.Where(chest =>
chest.ItemCollection
.All(item =>
item == null))
.Count(),
BadWorld.ChestCollection.Length);

SaveWorld(BadWorld, @"201709010835-modified.wld");
}

World GetWorld(string worldPath)
{
using (BinaryReader worldReader = new BinaryReader(
File.OpenRead(worldPath)))
{
return GetWorld(worldReader);
}
}

World GetWorld(BinaryReader worldReader)
{
World world = new World();

GetWorldMetaData(worldReader, world);
GetWorldSections(worldReader, world);

worldReader.BaseStream.Position = world.SectionSize[2];

GetChestData(worldReader, world);

return world;
}

void GetWorldMetaData(BinaryReader worldReader, World world)
{
world.Version = worldReader.ReadInt32();
world.TypeCheck = worldReader.ReadInt64();
world.Revision = worldReader.ReadInt32();
// Unknown
world.UnknownMetaData = worldReader.ReadInt64();
}

void GetWorldSections(BinaryReader worldReader, World world)
{
world.SectionCount = worldReader.ReadInt16();

for(int section = 0; section < world.SectionCount; section++)
{
world.SectionSize[section] = worldReader.ReadInt32();
}
}

void GetChestData(BinaryReader worldReader, World world)
{
world.TotalChests = worldReader.ReadInt16();
world.MaxItems = worldReader.ReadInt16();

int itemsPerChest = world.MaxItems;
int overflowItems = 0;

if (world.MaxItems > 40)
{
itemsPerChest = 40;
overflowItems = world.MaxItems - 40;
}

for (int i = 0; i < world.TotalChests; i++)
{
world.ChestCollection[i] =
GetChestData(
worldReader,
itemsPerChest,
overflowItems);
}
}

Chest GetChestData(
BinaryReader worldReader,
int itemsPerChest,
int overflowItems)
{
Chest chest = new Chest(itemsPerChest)
{
X = worldReader.ReadInt32(),
Y = worldReader.ReadInt32(),
Name = worldReader.ReadString(),
};

for (int i = 0; i < itemsPerChest; i++)
{
chest.ItemCollection[i] = GetItemData(worldReader);
}

for (int i = 0; i < overflowItems; i++)
{
GetItemData(worldReader);
}

return chest;
}

Item GetItemData(BinaryReader worldReader)
{
short stackSize = worldReader.ReadInt16();

if (stackSize <= 0)
{
return null;
}

return new Item()
{
StackSize = stackSize,
Id = worldReader.ReadInt32(),
Prefix = worldReader.ReadByte(),
};
}

void MergeEmptyChests(
Chest[] destinationChests,
Chest[] sourceChests)
{
foreach (Chest destinationChest in destinationChests)
{
foreach (Chest sourceChest in sourceChests)
{
if (destinationChest.X != sourceChest.X
|| destinationChest.Y != sourceChest.Y)
{
continue;
}

int numberOfItems =
sourceChest.ItemCollection.Length;
for (int i = 0; i < numberOfItems; i++)
{
destinationChest.ItemCollection[i] =
sourceChest.ItemCollection[i];
}
}
}
}

void SaveWorld(World world, string worldFileSave)
{
using (BinaryWriter binaryWriter = new BinaryWriter(
File.Create(worldFileSave)))
{
SaveWorld(binaryWriter, world);
}
}

void SaveWorld(BinaryWriter worldWriter, World world)
{
WriteWorldMetaData(worldWriter, world);

long sectionPosition = worldWriter.BaseStream.Position;

WriteWorldSections(worldWriter, world);

worldWriter.Write(world.SkippedSectionsBeforeChestData);

WriteChestData(worldWriter, world);

long newChestDataSectionPosition =
worldWriter.BaseStream.Position;

worldWriter.Write(world.SkippedSectionsAfterChestData);

int sectionSizeAdjustment =
(int)newChestDataSectionPosition - world.SectionSize[3];

for(int i = 3; i < world.SectionCount - 1; i++)
{
world.SectionSize[i] += sectionSizeAdjustment;
}

worldWriter.BaseStream.Position = sectionPosition;

WriteWorldSections(worldWriter, world);
}

void WriteWorldMetaData(BinaryWriter worldWriter, World world)
{
worldWriter.Write((Int32)world.Version);
worldWriter.Write((Int64)world.TypeCheck);
worldWriter.Write((Int32)world.Revision);
worldWriter.Write((Int64)world.UnknownMetaData);
}

void WriteWorldSections(BinaryWriter worldWriter, World world)
{
worldWriter.Write((Int16)world.SectionCount);

for(int section = 0; section < world.SectionCount; section++)
{
worldWriter.Write((Int32)world.SectionSize[section]);
}
}

void WriteChestData(BinaryWriter worldWriter, World world)
{
worldWriter.Write((Int16)world.TotalChests);
worldWriter.Write((Int16)world.MaxItems);

for (int i = 0; i < world.TotalChests; i++)
{
WriteChestData(worldWriter, world.ChestCollection[i]);
}
}

void WriteChestData(BinaryWriter worldWriter, Chest chest)
{
worldWriter.Write((Int32)chest.X);
worldWriter.Write((Int32)chest.Y);
worldWriter.Write(chest.Name ?? string.Empty);

for (int i = 0; i < chest.ItemCollection.Length; i++)
{
WriteItemData(worldWriter, chest.ItemCollection[i]);
}
}

void WriteItemData(BinaryWriter worldWriter, Item item)
{
if (item == null)
{
worldWriter.Write((Int16)0);
return;
}

worldWriter.Write((Int16)item.StackSize);

if (item.StackSize == 0)
{
return;
}

worldWriter.Write((Int32)item.Id);
worldWriter.Write((byte)item.Prefix);
}

class World
{
private short sectionCount = 0;

private short totalChests = 0;

public int Version { get; set; }

public long TypeCheck { get; set; }

public int Revision { get; set; }

public long UnknownMetaData { get; set; }

public short SectionCount
{
get
{
return this.sectionCount;
}

set
{
this.sectionCount = value;
this.SectionSize = new int[this.sectionCount];
}
}

public int[] SectionSize { get; private set; }

public short TotalChests
{
get
{
return this.totalChests;
}

set
{
this.totalChests = value;
this.ChestCollection = new Chest[this.totalChests];
}
}

public Chest[] ChestCollection { get; private set; }

public short MaxItems { get; set; }
}

class Chest
{
public Chest(int numberOfItemsPerChest)
{
this.ItemCollection = new Item[numberOfItemsPerChest];
}

public int X { get; set; }

public int Y { get; set; }

public string Name { get; set; }

public Item[] ItemCollection { get; private set; }
}

class Item
{
public short StackSize { get; set; }

public int Id { get; set; }

public byte Prefix { get; set; }
}