If you’re just looking for an overview and takeaways, you can skip to the TLDR section below.

One of the final technical challenges standing in the way of releasing a closed alpha build is realtime terrain generation, and more specifically, generating terrain using a coroutine or Task. Basically, you need to be able to walk around on the map and not even notice that the game is generating terrain around you. Since this is a top-down game, I don’t have to worry about the complexities of first-person games, which involve needing to generate terrain in the direction that the player is looking. Here I can just generate terrain around the player and get away with it.

Terrain Generation using Bolt

For a quick reminder here is what the terrain generator looked like 2 weeks ago:


This is completely built using Unity’s Bolt plugin which was originally developed by Ludiq. Basically I just started a Bolt macro using a custom event which then did all of the heavy lifting of doing the actual generation and I would just read from a list when the generation was complete. You can see an overview of how I did this in this blog post.

You can probably think of a swath of problems that I ran into getting this to work, and it was a lot of work, but this approach had a huge advantage for fast development: I didn’t need to let Unity recompile my code after I made changes. This allowed me to change core functionality of the shaders and see the changes in realtime in the editor.

However, whatever points this approach gained in ease of use and fast iteration, it lost them all when it came to performance. Generating just a couple of terrain tiles was relatively fast, but generating more was unbearably slow. In my testing I decided that I needed a space of at least 4×4 chunks to have a nice “play area” for people to test in (16 tiles total). Anything smaller than this was not really playable because you would run out of space really quickly. What I found was that I couldn’t generate a 4×4 area in a reasonable amount of time:

Blue Lines: The amount of time it took to generate all chunks
Red Lines: The average amount of time per chunk for that run

The Y axis is in seconds, and I don’t want the average player waiting 35-40 seconds to start a new game. Realistically, I wanted this to be under 5 seconds. After trying many optimizations, I couldn’t get around the fact that Bolt needs to run on the main thread and cannot be used in a multitasking environment (like a coroutine or Task). This means the player would be stuck waiting for 35-40 seconds at the start of the game without being able to move, which wasn’t good enough. So I decided it was time to migrate over to the new C# Job System.

Before moving forward, I just want to say that moving away from Bolt was actually a hard decision because I lose the fast iteration: now every time I want to change core functionality of the terrain generator, I have to wait for Unity to recompile my code. I will find uses for the Bolt plugin in the future, but for now I can’t use it for terrain generation.

C# Job System

My first step in converting my Terrain Generation Macros to C# was to convert each Macro into a static C# function and try to match the functionality, which was easy enough. This step only took me about 30 minutes to get parity with the Bolt plugin. Basically each instance of a terrain generation macro got converted to a TerrainGenerationLayer which just holds the data values for that layer (and the visual texture that gets added to the terrain). The performance was comparable to just running the Bolt plugin and I still had all of the problems of only running on 1 thread and blocking the main thread. Then I took my complicated terrain generation function and converted it to a C# Job.

This is what the top-most job function call looks like:

My terrain generator call from within a C# Job.

Ok, so now each TerrainGenerationLayer just calls this function to generate its alphamap. This is basically the same thing that the Bolt macro instances were doing, except we’re doing it in parallel (they all call the same function, just with different inputs).

The loop that generates each terrain layer.

The only complicated thing here is just the spooky line where I add a coroutine to a list:

coroutines.Add(StartCoroutine(terrainGenerators[idx]
.RunGeneration(chunkSettings, terrainSettings, quickAlphamaps[idx])));

This is actually not all that complicated. Basically I’m taking the terrain generation layer at the current index (idx) and I’m calling the RunGeneration function, which just starts the C# job that I posted above. This function actually returns IEnumerator, which is how Unity wants you to specify a coroutine. Then I start this coroutine using StartCoroutine and then I add the coroutine handle to the list of coroutines. If you want to learn more about coroutines, the Unity documentation is a great start.
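To make the pattern a bit more concrete, here is a minimal sketch in plain C# (no Unity required) that imitates it. Everything in it is an assumption for illustration: FakeJob is a hypothetical stand-in for a real C# job, RunAll plays the role of Unity's coroutine scheduler, and this RunGeneration only shares its name and IEnumerator return type with my real function.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class CoroutineSketch
{
    // Hypothetical stand-in for a background job: it "finishes" after a fixed number of frames.
    public sealed class FakeJob
    {
        private int framesLeft;
        public FakeJob(int frames) { framesLeft = frames; }
        public bool IsCompleted => framesLeft <= 0;
        public void Tick() { if (framesLeft > 0) framesLeft--; }
    }

    // Shaped like a Unity coroutine: returns IEnumerator and yields once per frame
    // until the job is done, then records that this layer finished.
    public static IEnumerator RunGeneration(string layerName, FakeJob job, List<string> finished)
    {
        while (!job.IsCompleted)
        {
            yield return null; // hand control back to the caller until the next frame
        }
        finished.Add(layerName);
    }

    // Starts one coroutine per layer, then steps all of them once per simulated frame.
    // Returns the layer names in the order they finished.
    public static List<string> RunAll(int[] framesPerLayer)
    {
        var finished = new List<string>();
        var jobs = new List<FakeJob>();
        var coroutines = new List<IEnumerator>();

        for (int idx = 0; idx < framesPerLayer.Length; idx++)
        {
            var job = new FakeJob(framesPerLayer[idx]);
            jobs.Add(job);
            coroutines.Add(RunGeneration("layer" + idx, job, finished));
        }

        while (coroutines.Count > 0)
        {
            foreach (var job in jobs) job.Tick();      // background work makes progress
            coroutines.RemoveAll(c => !c.MoveNext());  // resume each coroutine, drop finished ones
        }
        return finished;
    }
}
```

The point of the sketch is that no layer blocks the others: every coroutine gets a turn each frame, and a layer whose job finishes early is done early, regardless of where it sits in the list.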

If you’re a more advanced Unity user, you might be wondering why I didn’t use a Task-based approach here. That mostly has to do with the fact that, in my opinion, Tasks are harder to debug than coroutines, even though you get some nicer language features with Tasks.

Another huge gain from this approach is that I can generate multiple terrain layers at the same time. This wasn’t feasible with the old approach because you can’t really force Bolt to run on multiple threads, and from what I can tell you can’t run Bolt macros from within the C# Job System. Now that I’ve talked about some of the changes, let’s talk quickly about the performance improvements and why this isn’t quite a fair comparison.

Performance

It took me about 2 full days to make a full transition to the C# Job System, and at the end of those 2 days I thought to myself, how did I get here? Well, I had almost totally forgotten how bad the performance was before I implemented the C# Job System changes. The new performance is amazing in comparison:

Blue Lines: The amount of time it took to generate all chunks
Red Lines: The average amount of time per chunk for that run

An interesting note here is that generating 1 chunk of terrain is not actually that much faster than the Bolt implementation. The Bolt implementation took just over 2 seconds to generate 1 chunk, while the C# Job System generated it in just over 1 second. I was expecting it to be 10-100x faster than the Bolt implementation; instead it’s just 2x as fast. Why is this? Even though the generation itself is much faster with the C# Job System, a large portion of the total cost is waiting for jobs to complete and then copying the results of each job into the final destination. That copy has to be done on the main thread because the results need to be converted into a format that Unity can understand (which is a 3-dimensional float array).
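To give a rough idea of what that copy step involves, here is a small plain C# sketch. It assumes each layer's job wrote its result into a flat, row-major buffer of width × height floats, which is a guess at a typical layout and not necessarily how my jobs store their data. The [y, x, layer] index order is the one Unity's TerrainData.SetAlphamaps expects; the class and method names are made up for illustration.

```csharp
using System;

public static class AlphamapCopy
{
    // layers[l] is a hypothetical flat, row-major buffer (width * height floats) for layer l.
    // The result is indexed [y, x, layer].
    public static float[,,] ToAlphamaps(float[][] layers, int width, int height)
    {
        var result = new float[height, width, layers.Length];
        for (int l = 0; l < layers.Length; l++)
        {
            if (layers[l].Length != width * height)
                throw new ArgumentException($"Layer {l} has the wrong size.");

            for (int y = 0; y < height; y++)
            {
                for (int x = 0; x < width; x++)
                {
                    // Row-major lookup in the flat buffer.
                    result[y, x, l] = layers[l][y * width + x];
                }
            }
        }
        return result;
    }
}
```

Every cell of every layer gets touched once here, so the cost grows with width × height × layer count, and none of it can be handed off to a worker thread if the destination has to be a managed array on the main thread.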

To further this point, the first iteration of this implementation did not generate chunks in parallel, and it only roughly halved the time compared to Bolt (19.5 seconds vs 38 seconds). This was extremely frustrating because after 2 days of work I expected to have more to show than a ~20 second improvement, and 19.5 seconds is still nowhere near good enough. Then I figured it must be an issue with waiting for jobs to complete. Basically, the jobs all start and the main thread has nothing to do but wait for them to complete. This is where generating the chunks in parallel has a huge effect, because the main thread can start queuing up more jobs while the current jobs are still running. This juggling act allows me to generate terrain in realtime while the player is walking around, which is really cool to see:

The player can’t even tell the terrain is generating while they are running, it just looks like the terrain is seamlessly endless.

Now, as the player moves towards the edges of the map, more terrain is generated dynamically, and because the terrain generator is mostly running on other threads, there is almost no performance impact from generating terrain while the player is playing.

Comparing to Bolt

As I said before, it’s not really fair to compare the performance of these two solutions because Bolt is not designed to be used for terrain generation. Bolt is meant to be a way to describe game logic visually, because in a lot of ways it is simpler and makes more sense to describe game logic using graphs and nodes. Unreal Engine has the Blueprint system, which is a massive feature that I’ve missed since coming to Unity. After playing around with Bolt, I have high hopes for what they will be able to do in Bolt 2, and I’m hoping that Unity takes Bolt in a direction similar to Unreal Engine’s Blueprint system.

If Unity completely integrates Bolt into the editor and makes it a first-class feature, I think it would be extremely useful for all Unity developers. Imagine not having to write a small script for every single UI behaviour and just being able to embed some logic directly into a GameObject. Most UI behaviours are extremely simple and aren’t used anywhere else in the codebase, so why do they need to live in a script? Why can’t that logic just live in that scene for that specific object? Well, if you use Bolt, you can actually just embed a macro into a Component on a GameObject and call it a day.

Conclusion (TLDR)

Some major takeaways that I think are good starting points for other people to get started with terrain generation are:

  1. As much as possible make sure that your terrain layers are not dependent on each other so that they can be generated at the same time.
  2. Try to minimize processing outside of C# jobs. Even if it’s a simple operation that you want to perform on a terrain layer, you should throw it into a job so that the main thread can do other things.
  3. Use the profiler to see which parts of your codebase are actually eating up the most CPU time; don’t spend time optimizing things that don’t have a significant impact.
  4. Inside of C# jobs use references to structs inside of function definitions to save jobs from copying a ton of memory. If you don’t use ref, a struct is copied on every iteration of that job (see the first code above).
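
For the last point, here is a tiny plain C# sketch of the difference between passing a struct by value and by ref. ChunkSettings here is a hypothetical struct with made-up fields (it only borrows the name of the variable in my code), and the TimesSampled counter exists only to make the copy visible.

```csharp
using System;

// Hypothetical settings struct; the fields are made up for illustration.
public struct ChunkSettings
{
    public int Seed;
    public int Width;
    public int Height;
    public float Scale;
    public int TimesSampled;
}

public static class StructPassing
{
    // By value: the callee gets its own copy of the whole struct,
    // so the increment is lost when the method returns.
    public static void SampleByValue(ChunkSettings settings)
    {
        settings.TimesSampled++;
    }

    // By ref: the callee works on the caller's struct directly, with no copy,
    // so the increment is visible to the caller.
    public static void SampleByRef(ref ChunkSettings settings)
    {
        settings.TimesSampled++;
    }
}
```

If the callee only needs to read the struct, C# 7.2 and later also offer the in modifier, which passes by reference but keeps the parameter read-only. For small structs the copy is cheap, so this is worth checking in the profiler before restructuring code around it.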

If anyone wants more details about a specific thing I talked about, or even stuff I didn’t talk about, a great place to chat with me is in the Discord!