Step by step to dense vegetation and acceptable FPS
Since the last post I’ve been working on the terrain’s vegetation. I mainly focused on generating grass that bends in the wind and some fern-like plants, but what follows can be used for all kinds of meshes. The big problem was (and in some cases still is):
HOW CAN WE GENERATE DENSE VEGETATION AND PLACE IT FAST, WITHOUT TOO MUCH FPS DROP AFTER THE GENERATION PROCESS?
I have to admit that it was frustrating at times to work with a machine that cannot handle many heavy calculations at once (my MacBook from 2011). On the other hand, this forced me to start optimizing the game very early, which might become a good base for adding complexity in the near future.
Draw call batching
First off: I learned a lot about Unity’s built-in batching system. Batching means combining mesh objects that share the same material; for static batching, the objects must also be marked as static in the Unity inspector.
The whole point is that it reduces the number of draw calls.
A draw call is a request sent to the graphics API to draw a mesh with one material. Without batching, every material on every visible object costs at least one draw call per frame.
- 100 unbatched objects all having the same material = 100 draw calls
- 100 batched objects with the same material = 1 draw call
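The two numbers above can be reproduced with a small, engine-independent sketch. It is plain Python rather than Unity C#, so it runs outside the engine, and it assumes an idealised batcher that merges every object sharing a material into a single call. The function name `count_draw_calls` and the material names are hypothetical. Real batching has further limits (for example on vertex count per batch), so actual numbers are usually higher.

```python
from collections import Counter


def count_draw_calls(object_materials, batched):
    """Estimate draw calls for objects that each use one material.

    object_materials: list of material names, one entry per object.
    batched: if True, assume an idealised batcher that merges all
             objects sharing a material into one draw call.
    """
    if not batched:
        # Without batching, every object is submitted on its own.
        return len(object_materials)
    # With idealised batching, one call per distinct material.
    return len(Counter(object_materials))


# Hypothetical scene: 100 grass objects sharing one material.
grass = ["grass_mat"] * 100
unbatched_calls = count_draw_calls(grass, batched=False)
batched_calls = count_draw_calls(grass, batched=True)
```

With a second material in the mix (say 60 grass and 40 fern objects), the idealised batched count would be 2, which is why sharing materials matters as much as batching itself.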
The more objects you have on screen at the same time, the more sense batching makes. So, if the goal of this chapter is dense-looking vegetation on the landscape’s hills, batching is definitely the way to go. Please note that how well your computer handles draw calls, and how much they affect the FPS (frames per second), depends on your hardware. More details can be found on Unity’s official documentation page for Draw Call Batching:
Draw calls are often resource-intensive, with the graphics API doing significant work for every draw call, causing performance overhead on the CPU side. This is mostly caused by the state changes done between the draw calls (such as switching to a different Material), which causes resource-intensive validation and translation steps in the graphics driver.
Unity uses two techniques to address this:
- Dynamic batching: for small enough Meshes, this transforms their vertices on the CPU, groups many similar vertices together, and draws them all in one go.
- Static batching: combines static (not moving) GameObjects into big Meshes, and renders them in a faster way.
Watch the clip in this link if you’re not familiar with, or just interested in understanding, the difference between CPU and GPU. In short: a CPU has just a few large calculation units (cores) that handle sequential work very well but parallel work poorly, while a GPU has thousands of small cores that handle parallel work well but sequential work poorly.
My core experience with Unity’s batching system: it works. But there are much better solutions available for little money. In my case I had terrible FPS with just a few thousand mesh instances.
Batching at runtime
Another critical point was that I want to batch the vegetation meshes and some other meshes at runtime, while the game is being played. Batching before the game starts would mean generating all the content that needs to be batched up front, which is no solution for a theoretically “endless procedurally generated” terrain. Therefore I bought a product from Unity’s Asset Store called Runtime Mesh Batcher (which no longer seems to be available). It was easy to set up and inexpensive. The creator, Mr. Georges Dimitrov from Concordia University, was a huge help: I emailed him about some trouble I had when using the asset with Unity version 2017.3.1f1, and he updated the asset one day later, which I consider great support. The main problem that remained, and that I discovered using the Profiler, was that batching the meshes at runtime generates extra vertices and triangles. These are not physically present in the scene, but the memory allocated for them during the combining process is NEVER completely freed by Unity until the application quits.
You can see the amount of vertices go up while the batches go rapidly down in the green framed areas. This was recorded during the batching process.
If you only need to batch once or twice at runtime, even with a high number of objects (one of his demo scenes uses 10k mesh objects), the FPS will increase immensely with the code provided by Mr. Dimitrov. If you batch over and over again, I recommend MeshCombineStudio. It is slightly more expensive (it was on sale when I got it. Yehaw!), but you can batch whenever you want without adding extra verts and tris, which can be checked in the stats window in the editor. You can even delete triangles that are never visible to the player and delete vertices under certain layers. The only thing I didn’t understand immediately, and which in my opinion isn’t marked very clearly, is that all meshes you want to batch need to be child objects of a defined parent. Unlike RuntimeMeshBatcher, which does a better job on this point, MeshCombineStudio only searches for batchable objects inside this one parent. As a workaround, you can have multiple MeshCombineStudio instances.
I also spent some money on the Advanced Foliage Shaders v.5, simply to get around the annoying fact that I couldn’t handle the grass geometry shader I wrote about in the last post, due to my poor CG programming knowledge. This asset contains great foliage shaders with lots of options and, finally, some nice wind bending for the leaves and even touch bending. When batched, however, the animations from the Advanced Foliage Shader go a bit crazy and make the grass float around. I fixed this by toggling the “Baked Pivots” option in the shader to ON. This causes the grass to lie flat on the ground until the batching process is completed, but that isn’t visible from a distance anyway, and it eliminates the floating. The fern I made needed a bit more fine-tuning. It simply depends on the mesh and your individual setup.
Remodel 3D meshes (in Cinema4D)
Something that seems obvious but that I cannot stress enough: KEEP YOUR MESH VERTS LOW! I figured out that I was trying to spawn 60k meshes with an average of around 1000 vertices each. Draw calls without batching were around 80 million. No wonder I had less than 0.6 FPS on my machine. So I started to remodel some meshes in Cinema4D, my preferred 3D modelling software.
Some very basic tips:
- Decide beforehand how many polygons you need and keep the count as low as possible
- Use bump or normal maps as well as displacement maps instead of modelling every detail
- The polygon reduction object from Cinema4D does not reduce the polygon count effectively
- Subdividing the mesh in Cinema4D’s sculpting mode and baking a displacement map works great
For example, this is how a regular grass mesh looks when exported as FBX from Cinema4D into Unity. It has just twelve vertices. The foliage shader has an alpha cutout value and uses a texture with culling set to off (so the planes are visible from both sides).
Of course, the fern mesh has many more vertices than the grass model, but just keep the count as low as possible.
Just draw the meshes instanced - fast as hell but limited
Now that the batching works, I decided to come back to a script I got from World Of Zero’s YouTube tutorial, which is actually an old version of the current grass generator script. It uses the Graphics.DrawMeshInstanced() function to draw up to 1023 meshes at a time (the maximum number a single call can handle) with GPU instancing.
- GPU instancing: Use GPU Instancing to draw (or render) multiple copies of the same Mesh at once, using a small number of draw calls. It is useful for drawing objects such as buildings, trees and grass, or other things that appear repeatedly in a Scene. (read more)
The solution to this limitation was to have several game objects, each with 1023 instances of the mesh, and to add them to one parent that acts as a spawnable prefab at each chunk position, just like in the post I made about the placement of objects. All these techniques combined already give an impression of living, dense nature.
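The splitting step behind this workaround can be sketched independently of the engine. The snippet below is plain Python, not the Unity script from the tutorial. It only shows how a long list of per-instance positions could be cut into groups of at most 1023, where each group would then correspond to one child game object (and one Graphics.DrawMeshInstanced() call). The names `split_into_batches` and `MAX_INSTANCES_PER_CALL` are made up for this example.

```python
# Limit of a single instanced draw call, as stated in the post.
MAX_INSTANCES_PER_CALL = 1023


def split_into_batches(transforms, batch_size=MAX_INSTANCES_PER_CALL):
    """Split per-instance transforms into groups of at most batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [
        transforms[start:start + batch_size]
        for start in range(0, len(transforms), batch_size)
    ]


# Hypothetical chunk: 16 * 1023 grass positions laid out along the x axis.
positions = [(float(x), 0.0, 0.0) for x in range(16 * 1023)]
batches = split_into_batches(positions)

# A count that is not a multiple of 1023 leaves a smaller last batch.
uneven = split_into_batches(list(range(2500)))
uneven_sizes = [len(batch) for batch in uneven]
```

Keeping the last batch smaller instead of padding it avoids drawing instances that were never placed.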
De/activate vegetation and meshes on chunks
To optimize the generation process, it makes sense to batch first and then deactivate assets that are too far away to be noticeable to the player. To do so, one can measure the distance between each chunk and the player and compare it against threshold values that define when activation or deactivation should happen through gameObject.SetActive(true/false).
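A minimal sketch of such a distance check follows, again in plain Python rather than Unity C#. It assumes two separate thresholds (one for activation, a larger one for deactivation) so that a chunk near the border does not flicker on and off. That two-threshold setup, the function name `chunk_should_be_active` and the distances 150 and 200 are assumptions for illustration, not values from the project. In Unity, the returned boolean would be what gets passed to gameObject.SetActive().

```python
import math


def chunk_should_be_active(chunk_pos, player_pos, currently_active,
                           activate_dist=150.0, deactivate_dist=200.0):
    """Return the new active state of a chunk.

    chunk_pos, player_pos: (x, z) positions on the ground plane.
    currently_active: the chunk's current state, kept while the
                      distance lies between the two thresholds.
    """
    distance = math.hypot(chunk_pos[0] - player_pos[0],
                          chunk_pos[1] - player_pos[1])
    if distance <= activate_dist:
        return True
    if distance >= deactivate_dist:
        return False
    # Between the thresholds: keep the current state to avoid flickering.
    return currently_active


player = (0.0, 0.0)
near_state = chunk_should_be_active((100.0, 0.0), player, currently_active=False)
far_state = chunk_should_be_active((250.0, 0.0), player, currently_active=True)
```

Using a single threshold works too. The gap between the two values simply trades a little extra rendering for fewer state changes when the player walks along a chunk border.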
Summing up, I can run the game with:
(2 x 60k grass instances by grass geometry shader) + (16 x 1023 mesh instances by the Graphics.DrawMeshInstanced() function) + (2 x 60k grass meshes placed with raycasting) + (more than 5000 x more complex fern meshes placed with raycasting) + (2 x 10 stone instances through physical placement) + (I guess around 300 trees with a vertex height check) = around 250k meshes PER CHUNK!
Which is quite nice, especially because the FPS stays at an acceptable level, even on older hardware.
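For reference, the per-chunk tally can be added up in a few lines of Python. The dictionary keys are just labels for this example. The fern count uses 5000 as a lower bound ("more than 5000") and the tree count is the estimate given above, so the result is approximate and lands close to the rounded figure.

```python
# Per-chunk mesh counts taken from the list above.
per_chunk = {
    "grass (geometry shader)": 2 * 60_000,
    "instanced meshes": 16 * 1023,
    "grass meshes (raycast placement)": 2 * 60_000,
    "fern meshes (raycast placement)": 5_000,  # lower bound: "more than 5000"
    "stones (physical placement)": 2 * 10,
    "trees (height check)": 300,               # estimate from the post
}

total = sum(per_chunk.values())
print(f"meshes per chunk: {total}")  # prints "meshes per chunk: 261688"
```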
Next time I will dig more into the design part. The topics will be the creation of creative content and how coincidence can help to generate fresh ideas. I will illustrate characters and present techniques, along with my own attempts at them, using just a pen. The post will be much more about art and illustration in general, and it may contain some storyline ideas for the game. See you!