From 8c241c170fd7a114721c2b9126bb687d32e21bd1 Mon Sep 17 00:00:00 2001
From: Xtarsia <69606701+Xtarsia@users.noreply.github.com>
Date: Sat, 4 Jan 2025 22:44:17 +0000
Subject: [PATCH] Update docs

---
 doc/docs/shader_design.md | 124 +++++++++++++++++++++++---------------
 doc/docs/tips.md          | 104 ++++++++++++++++++++++----------
 2 files changed, 150 insertions(+), 78 deletions(-)

diff --git a/doc/docs/shader_design.md b/doc/docs/shader_design.md
index 4732e6c43..df6d9a6c0 100644
--- a/doc/docs/shader_design.md
+++ b/doc/docs/shader_design.md
@@ -5,102 +5,132 @@ Our shader combines a lot of ideas and code from [cdxntchou's IndexMapTerrain](h

 In the material, you can enable `shader_override_enabled` with an empty `shader overide` slot and it will generate the default shader code so you can follow along with this document. You can also find the minimum shader needed to enable the terrain height functionality without texturing in `addons/terrain_3D/extras/minimum.gdshader`.

-At its core, the current texture painting and rendering system is a vertex painter, not a pixel painter. We paint codes at each vertex, 1m apart by default, represented as a pixel on the [Control map](controlmap_format.md). Then the shader uses its many parameters to control how each pixel between the vertices blend together. For an artist, it's not as nice to use as a multi-layer, pixel based painter you might find in photoshop, but the dynamic nature of the system does afford other benefits.
+At its core, the current texture painting and rendering system is a vertex painter, not a pixel painter. We paint codes at each vertex, 1m apart by default, represented as a pixel on the [Control map](controlmap_format.md). The shader uses its many parameters to control how each pixel between the vertices blends together. For an artist, it's not as nice to use as a multi-layer, pixel based painter you might find in Photoshop, but the dynamic nature of the system does afford other benefits.
+
+The following describes the various elements of the shader in a linear fashion to help you understand how it works.
+
+
+## Texture Lookup Methods
+
+First some terminology and notes about the various methods used to retrieve a texture value.
+
+A `pixel` is a colored dot on your screen (aka `picture element`). A `texel` is a colored dot on a texture in memory (aka a `texture pixel`). When a grey value is read from a rock texture, it's a texel. When it is projected on a rock mesh with lighting and rendered on your screen, it's a pixel.
+
+The GPU does a lot of work for gamedevs when using the standard lookup function `texture()`, such as calculating which mipmap level to use and automatically interpolating surrounding texels. A lot of this work we don't want done automatically and instead do it ourselves so we can optimize or reuse some of the process.
+
+Here's a quick summary of potential operations we might use to retrieve a texel:
+
+* `texture()` - We provide UVs. The GPU calculates UV derivatives and mipmap LOD, then returns an interpolated value.
+* `textureGrad()` - We provide UVs and UV derivatives. The GPU calculates mipmap LOD and returns an interpolated value.
+* `textureLod()` - We provide UVs and mipmap LOD. The GPU returns an interpolated value.
+* `texelFetch()` - We provide integer texel coordinates and mipmap LOD. The GPU returns the texel.
+
+`texture*()` functions interpolate from multiple samples of the texture map if linear filtering is enabled. Using either nearest filtering or `texelFetch()` disables interpolation.

-The following describes the various elements of the shader in a linear fashion to help you understand how the various elements are used.

 ## Uniforms

-[Terrain3DMaterial](../api/class_terrain3dmaterial.rst) exposes uniforms found in the shader, whether we put them there or you do with your own custom shader. Uniforms that begin with `_` are considered private and are not exposed. However you can access them via code. See [Tips](tips.md).
You can create your own private uniforms.
+[Terrain3DMaterial](../api/class_terrain3dmaterial.rst) exposes uniforms found in the shader, including any you have added. Uniforms that begin with `_` are considered private and are hidden, but you can still access them via code. See [Tips](tips.md#accessing-private-shader-variables).

-These notable [Terrain3DData](../api/class_terrain3ddata.rst) variables are passed in as uniforms. There are getter functions for each.
-* `_region_map`, `_region_locations` define the location and IDs of regions (sculpted areas)
-* `_height_maps`, `_control_maps`, and `_color_maps` texture arrays define the elevation, textures, and colors of the terrain, indexed by region ID
-* `_texture_array_albedo`, `_texture_array_normal` are the texture arrays that combine all of the individual textures, indexed by texture ID
+These notable [Terrain3DData](../api/class_terrain3ddata.rst) arrays are passed in as uniforms. The API has more information on each.
+* [_region_map](../api/class_terrain3ddata.rst#class-terrain3ddata-method-get-region-map), [_region_locations](../api/class_terrain3ddata.rst#class-terrain3ddata-property-region-locations) store the location and ID of each region
+* [_height_maps](../api/class_terrain3ddata.rst#class-terrain3ddata-property-height-maps), [_control_maps](../api/class_terrain3ddata.rst#class-terrain3ddata-property-control-maps), and [_color_maps](../api/class_terrain3ddata.rst#class-terrain3ddata-property-color-maps) store the elevation, texture layout, and colors of the terrain, indexed by region ID
+* [_texture_array_albedo](../api/class_terrain3dassets.rst#class-terrain3dassets-method-get-albedo-array-rid), [_texture_array_normal](../api/class_terrain3dassets.rst#class-terrain3dassets-method-get-normal-array-rid) store the ground textures, indexed by texture ID

-## Vertex() & Supporting functions
-`vertex()` is run per terrain mesh vertex drawn on the screen.
+## Vertex() & Supporting Functions

-First are `get_region_uv/_uv2()` which take in UV coordinates and return region coordinates, either absolute or normalized. It also returns the region ID, which is used in the map texture arrays above.
+The CPU has already created flat mesh components that make up the clipmap mesh, and collision shapes with heights. The vertex shader adjusts the mesh to match the collision shape defined by the heightmap. `vertex()` is run for every vertex on these mesh components.

-Optionally, world noise is inserted here, which generates fractal brownian noise to be used for background hills outside of your regions. It's an expensive visual gimmick only and does not generate collision.
+Noteworthy supporting functions include `get_region_uv/_uv2()` which take in UV coordinates and return region coordinates, either real or normalized. They also return the region ID, which indexes into the texture arrays.

-`get_height()` returns the value of the heightmap at the given location. If world noise is enabled, it is blended into the height here.
+Within `vertex()`, the control map is read to determine if a vertex is a hole, the heightmap is read if the vertex is valid, and a check determines if world noise should be calculated. The values are accumulated to determine the final height and vertex normal.

-Finally `vertex()` sets the UV and UV2 coordinates, and the height of the mesh vertex. Elsewhere the CPU creates flat mesh components and a collision mesh with heights. Here is where the flat mesh vertices have their heights set to match the collision mesh.
+If the optional world noise is enabled, it generates fractal Brownian noise which can be used as background hills outside of your regions. It's a visual-only effect, can be costly at high octaves, and does not generate collision.

-## Fragment()
+As `render_mode skip_vertex_transform` is used, we apply the necessary matrix transforms to set the final `VERTEX` position, matching the collision mesh.
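+
+The vertex stage described above can be sketched roughly as follows. This is a minimal illustration, not the Terrain3D shader: it assumes a single hypothetical `height_map` with one texel per meter, whereas Terrain3D uses texture arrays indexed by region ID and also handles holes and world noise.
+
+```glsl
+shader_type spatial;
+render_mode skip_vertex_transform;
+
+// Hypothetical single heightmap, for illustration only.
+uniform sampler2D height_map : filter_nearest, repeat_disable;
+
+void vertex() {
+    // World-space position of the flat mesh vertex.
+    vec3 world = (MODEL_MATRIX * vec4(VERTEX, 1.0)).xyz;
+
+    // Read the exact texel under the vertex, without interpolation.
+    ivec2 size = textureSize(height_map, 0);
+    ivec2 texel = clamp(ivec2(floor(world.xz)), ivec2(0), size - ivec2(1));
+    world.y = texelFetch(height_map, texel, 0).r;
+
+    // With skip_vertex_transform, the shader must output view-space values itself.
+    VERTEX = (VIEW_MATRIX * vec4(world, 1.0)).xyz;
+    NORMAL = normalize((MODELVIEW_MATRIX * vec4(NORMAL, 0.0)).xyz);
+}
+```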
-`fragment()` is run per terrain pixel drawn on the screen.

-### Normal calculation
+## Fragment()

-The first step is calculating the terrain normals. This shared between the `vertex()` and `fragment()` functions. Clipmap terrain vertices spread out at lower LODs causing certain things like normals look strange when you look in the distance as the vertices used for calculation suddenly separate at further LODs. So we switch from normals calculated per vertex to per pixel when the pixel is farther than `vertex_normal_distance`.
+`fragment()` is run for every screen pixel in which the terrain mesh appears. This is many more times than the number of vertices.

-The exact distance that the transition from per vertex to per pixel normal calculations occurs can be adjusted from the default of 192m via the `vertex_normals_distance` uniform.

-Generating normals in the shader works fine and modern GPUs can handle the load of 2 - 3 additional height lookups and the on-the-fly calculations. Doing this saves 3-4MB VRAM per region.
+### Grid Offsets, Weights and Derivatives

-### Grid creation
+Features like UV rotation, UV scale, detiling, and projection break the continuity of texture UVs. So we must use `textureGrad()` and provide the derivatives for it. We take 1 set of `dFdx(uv)` and `dFdy(uv)` saved in `base_derivatives` and then scale them as needed.

-We create a grid 1 unit wide using the `mirror` and `index00UV`-`index11UV` variables. This defines 4 fixed points around the current pixel. On LOD0 this grid aligns with both the mesh vertices and the control map pixels. However, they don't align further out on lower LODs or beyond the regions (sculpted areas) where the vertices are spread out. Pixel processing out there still occurs based on this 1-unit grid.
+The lookup grid and blend weights are initially calculated here, as they are used for both normals and material lookups.
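+
+A minimal sketch of the derivative handling described above, not the actual Terrain3D code: the screen-space derivatives are taken once from the continuous UVs and then scaled to match a modified lookup UV. `albedo_tex`, `uv_scale`, and `sample_scaled()` are hypothetical names used only for illustration.
+
+```glsl
+shader_type spatial;
+
+// Hypothetical uniforms, for illustration only.
+uniform sampler2D albedo_tex : source_color, filter_linear_mipmap, repeat_enable;
+uniform float uv_scale = 0.5;
+
+// Sample with explicit derivatives so that discontinuities in the lookup UVs
+// do not cause the GPU to pick the wrong mipmap level.
+vec4 sample_scaled(vec2 uv, vec4 base_derivatives) {
+    // Scaling the UVs scales their screen-space derivatives by the same factor.
+    vec4 dd = base_derivatives * uv_scale;
+    return textureGrad(albedo_tex, uv * uv_scale, dd.xy, dd.zw);
+}
+
+void fragment() {
+    // One set of derivatives, taken from the continuous, unmodified UVs.
+    vec4 base_derivatives = vec4(dFdx(UV), dFdy(UV));
+    ALBEDO = sample_scaled(UV, base_derivatives).rgb;
+}
+```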
-This is `ALBEDO = vec3(mirror.xy, 0.)`, showing horizontal stripes in red, and vertical stripes in green, with the vertices highlighted. The inverse is stored in `mirror.zw`, so where horizontal stripes alternate red, black, red, they are now black, red, black.
+To see the grid, add this at the end of the shader: `ALBEDO *= vec3(round(weight), 0.0) + .5;`. It shows horizontal stripes in red, and vertical stripes in green. The inverse is stored in `invert`, so where horizontal stripes alternate `red, black, red`, this has `black, red, black`. You can see the vertices if you enable `Debug Views / Vertex Grid`, as shown.

 ```{image} images/sh_mirror.png
 :target: ../_images/sh_mirror.png
 ```

-Next, the control maps are queried for each of the 4 grid points and stored in `control00`-`control11`. The control map bit packed format is defined in [Controlmap Format](controlmap_format.md).
+The base derivatives are used to determine whether it is reasonable to skip all additional lookups required to do the bilinear blend. Skipping this can save a significant amount of bandwidth and processing for the GPU depending on how much of the screen is occupied by distant terrain. It's worth noting that as this is calculated from screen space derivatives, it is independent of screen resolution.

-The textures at each point are looked up and stored in an array of structs. If there is an overlay texture, the two are height blended here. Then the pixel position within the 4 grid points is bilinear interpolated to get the weighting for the final pixel values.

-_Side note_: Interestingly, since the grid aligns with vertices on LOD0, there is potential for optimizations with control/texture lookups and normal calculations. For LOD0 only, all three could be looked up in `vertex()` instead of per pixel. Sadly, this breaks down on lower LODs due to the vertices spreading apart.
It would be interesting to experiment with a very large LOD0 and see if the savings on lookups per pixel outweighs the more complex mesh, since the pixel shader is a lot slower than drawing vertices. Perhaps that efficiency would also be worth replacing the entire clipmap nature of the terrain system with another method to overcome other issues.
+### Normal Calculation

-### Texture Sampling - Splat map vs Index map
+The next step is calculating the terrain normals. Clipmap terrain vertices are farther apart at lower LODs, causing certain things like normals to look strange when viewed in the distance. Because of this, we calculate normals by taking derivatives from the heightmap in `fragment()`.

-It is trivial to store one texture value for any given location on a control map. However, what if we want to blend two or more textures at that point? How do we identify up to 16 or 32 textures? How do we blend them?
+We use `texelFetch()` to read the height values on each vertex without any automatic interpolation. These values are used to generate a set of normals per-index, and an interpolated value for smooth normals. Using `texture()` here would not only trigger many additional lookups of adjacent vertices for interpolation, but also create artifacts when interpolating across region boundaries.

-This analysis compares the *splat map* method used by many other terrain systems with an *index map* method, used by Terrain3D and [cdxntchou's IndexMapTerrain](https://github.com/cdxntchou/IndexMapTerrain).
+Generating normals in the shader works fine, and modern GPUs can easily handle the load of the additional height lookups and the on-the-fly calculations. Doing this saves 3-4MB VRAM per region (sized at 1024) instead of pre-generating a normal map and passing it to the shader.
+
+
+### Material Creation
+
+Next, the control maps are queried for each of the 4 grid points and stored in `control[0]`-`control[3]`.
The control map bit packed format is defined in [Controlmap Format](controlmap_format.md).
+
+The textures at each point are looked up and stored in an array of structs. If there is an overlay texture, the two are height-blended here. Then the pixel position within the 4 grid points is bilinearly interpolated to get the weighting for the final pixel values.

-At their core, all height map based terrain tools are just fancy painting applications. For texturing, the reasonable approach is to paint each texture on the control map and blend with opacity exactly as one would in Photoshop or Krita. This is how a splat map works, but instead of painting with just RGBA, it paints with RGBACDEFHIJKLMNO (for 16 textures). The unreasonable approach would be to use an entirely different methodology in order to reduce memory or increase speed.
+Where possible, texture lookups are branched and in some cases only 2 samples are required, bringing VRAM bandwidth requirements to a minimum.

-**Splat map approach** specifies 4 textures at each terrain vertex per splat map, each vertex consuming 32-bits as RGBA. Each RGBA value represents a texture strength of 0-255. For 16 textures this method stores 4 splat maps. This means each vertex in the terrain uses 4 bytes per 4 splat maps, for a total of 16 bytes for texture values. Double for 32 textures.

-All splat maps are sampled, and of the 16-32 values, the 4 strongest are blended together, per terrain pixel. The blending of textures for pixels drawn between vertices is handled by the GPU's linear interpolated texture filter during texture lookups.
+### Texture Sampling - Splat Map vs Index Map

-**Index map approach** samples a control map at 4 fixed grid points surrounding the current terrain pixel. The 4 surrounding samples have a base texture, an overlay texture, and a blending value. The base and overlay texture values range from 0-31, each stored in 5 bits. The blend values are stored in 8-bits.
+This analysis compares the *splat map* method used by many other terrain systems with an *index map* method, used by Terrain3D and [cdxntchou's IndexMapTerrain](https://github.com/cdxntchou/IndexMapTerrain).
+
+At their core, all height map based terrain tools are just fancy painting applications. For texturing, the "reasonable" approach would be to define a strength value for each texture ID at each pixel and blend them together as occurs when painting in Photoshop. Subtly brushing with red gradually increases the R in the RGB value of the pixel. This is how a splat map works, but instead of painting with just RGBA, it paints with RGBACDEFHIJKLMNO (for 16 textures). The "unreasonable" approach would be to use an entirely different methodology in order to reduce memory or increase speed.
+
+The **Splat map approach** specifies an 8-bit strength value for each texture. 16 textures fit into 4 splat maps each made up of 32-bit RGBA values, for a total of 16 bytes. Double for 32 textures.

-*Side note:* Storing blend values in 3-bits is possible, where each of the 8 numbers represents an index in a array of 0-1 values: `{ 0.0f, .125f, .25f, .334f, .5f, .667f, .8f, 1.0f }`. In the future, this may be baked at runtime. However, editing using a 3-bit array of fixed values was exceedingly difficult and unsuccessful.
+When rendering, all splat maps are sampled, and of the 16-32 values, the 4 strongest are blended together, per terrain pixel. The blending of textures for pixels drawn between vertices is handled by the GPU's linear interpolated texture filter during texture lookups.

-The position of the pixel within its grid square is used to bilinear interpolate the values of the 4 surrounding samples. We disable the default GPU interpolation on texture lookups and interpolate ourselves here.
+The **Index map approach** samples a control map at 4 fixed grid points surrounding the current terrain pixel.
The 4 surrounding samples have a base texture, an overlay texture, and a blending value. The base and overlay texture values range from 0-31, each stored in 5 bits. The blend values are stored in 8 bits.
+
+The position of the pixel within its grid square is used to bilinearly interpolate the values of the 4 surrounding samples. We disable the default GPU interpolation on texture lookups and interpolate ourselves here. At distances where the bilinear blend would occur across only 1 pixel in screen space, the blend is skipped, requiring only 1/4 of the normal samples.

 **Comparing the two methods:**

 * **Texture lookups** - Considering only lookups for which ground texture to use and loading the texture data:
   * Splat maps use 12-16 lookups per pixel depending on 16 or 32 textures:
-    * 4 for the 4 splat maps for 16 textures. 8 for 32 textures. This gets the texture for the closest vertex point
+    * 4-8 to get the 4 strongest texture IDs. 4 for 16 textures, 8 for 32. This retrieves the texture ID for the closest vertex point.
     * 8 for the strongest 4 albedo_height textures and the 4 normal_rough textures
-  * Terrain3D uses 12-20 lookups per pixel depending on if an area has an overlay texture:
-    * 4 for the surrounding 4 grid points on the control map
-    * 8-16 for the 2-4 albedo_height & normal_rough for the base and overlay textures, for each of the 4 grid points
+  * Terrain3D uses 5-20 lookups per pixel depending on terrain distance:
+    * 1-4 for the surrounding 4 grid points on the control map.
+    * 4-16 for the 2-4 albedo_height & normal_rough for the base and overlay textures, for each of the 4 grid points.
+
+* **VRAM consumed**
+  * Splat maps store 16 texture strength values in 16 bytes per pixel, or 32 in 32 bytes per pixel. On a 4096 x 4096 terrain with 16M pixels, splat maps consume 256MB for 16 textures, 512MB for 32.
+  * Terrain3D stores 32 texture strengths in 18 bits. 5-bit base ID, 5-bit overlay ID, 8-bit blend value.
We can store texture layout for 32 textures on a 4k terrain in only 36MB, for a 93% reduction in VRAM.
+

-* **VRAM consumed** - Splat maps store 16 texture strength values in 16 bytes per pixel (16 * 8 bits = 128 bits). We could store that in 16 bits. Splat maps with 32 textures would require 32 bytes per pixel. We store that in 18 bits (5 base, 5 overlay, 8 blend value). On a 4096 x 4096 terrain with 16M pixels, splat maps consume 256MB for 16 textures, 512MB for 32. We can specify 32 textures in only 36MB for a 93% reduction in VRAM. This calculation considers only the portion of the maps that define where to place the textures on the terrain. Tools use up a lot more VRAM for other things.
+The calculations above consider only the portion of lookups and VRAM used by the data that defines where textures are placed on the terrain. In practical use there are many other features that greatly adjust both.

-**In practice**
+As for usage of the two techniques:

-* Splat maps - 4 textures can be blended intuitively as one would paint in photoshop. Some systems might introduce artifacts when 3-4 textures are blended in an area.
+* Splat maps - 4 textures can be blended intuitively as one would paint in Photoshop. Some systems might introduce artifacts when 3-4 textures are blended in an area.

-* Terrain3D - Getting 3 or 4 textures in an area is feasible as long as only 2 textures are blending per grid point (vertex). It's possible to achieve a natural looking result with [the right technique](texture_painting.md#manual-painting-technique).
+* Terrain3D - Only 2 textures can be stored in a vertex. However, pixels are interpolated between the 4 adjacent vertices, so the shader can easily blend between up to 4 textures based on painted blend value, height textures, and material settings. Thus getting a natural looking blend is easily doable if textures are properly set up with heights, using [the right technique](texture_painting.md#manual-painting-technique).
-### Calculating weights

-Since this pixel exists within four points on a grid, we can use bilinear interpolation to calculate weights based on how close we are to the grid points. e.g. The current pixel is 75% to the next X and 33% to the next Y. We combine these values with the blended texture height value to calculate our final weights.
+### Calculating Weights & Applying PBR

-### Applying PBR
+Since each terrain pixel exists within four points on a grid, we can use bilinear interpolation to calculate weights based on how close we are to each grid point. For example, if the current pixel is 75% of the way to the next X and 33% of the way to the next Y, that gives us a weighted strength for the texture values from each adjacent point. We look up the 4 adjacent textures, take the weighted average, and apply height blending to calculate our final value.

-Lastly, we calculate our final PBR values by using a weighted average of the four surrounding grid points.

-The color map is looked up. Then all PBR values are sent to the GPU.
+The color map and macro variation are multiplied onto the albedo channel. Then all PBR values are sent to the GPU.
diff --git a/doc/docs/tips.md b/doc/docs/tips.md
index 74dac14e5..82db7215e 100644
--- a/doc/docs/tips.md
+++ b/doc/docs/tips.md
@@ -97,58 +97,100 @@ This also works with the control and color maps.

 Here's an example of using a custom texture map for one texture, such as adding an emissive texture for lava. Add in this code and add an emissive texture, then adjust the emissive ID to match the lava texture, and adjust the strength.
-Add the uniforms at the top of the file:
+Add these uniforms at the top of the file with the other uniforms:

 ```glsl
 uniform int emissive_id : hint_range(0, 31) = 0;
 uniform float emissive_strength = 1.0;
-uniform sampler2D emissive_tex : source_color, filter_linear_mipmap_anisotropic;
+uniform sampler2D emissive_tex : source_color, filter_linear_mipmap_anisotropic, repeat_enable;
 ```

-Modify the return struct to house the emissive texture.
+Add a variable to store the emissive value in the `Material` struct.

 ```glsl
-struct Material {
+// struct Material {
     ...
     vec3 emissive;
-};
+// };
 ```

-Modify `get_material()` to read the emissive texture.
+Modify `get_material()` to read the emissive texture with the next several steps.
+
+Add the initial value for emissive by adding a `vec3` at the end:

 ```glsl
-// Add the initial value for emissive, adding the last vec3
-out_mat = Material(vec4(0.), vec4(0.), 0, 0, 0.0, vec3(0.));
-
-// Immediately after albedo_ht and normal_rg get assigned:
-// albedo_ht = ...
-// normal_rg = ...
-vec4 emissive = vec4(0.);
-if(out_mat.base == emissive_id) {
-    emissive = texture(emissive_tex, matUV);
-}
+// void get_material(vec2 base_uv, ...
+    out_mat = Material(vec4(0.), vec4(0.), 0, 0, 0.0, vec3(0.));
+```
+
+Look for this conditional:
+```glsl
+    if (out_mat.blend > 0.) {
+```
+
+Right before that, add:
+```glsl
+    vec4 emissive = vec4(0.);
+    if(out_mat.base == emissive_id) {
+        emissive = textureGrad(emissive_tex, matUV, dd1.xy, dd1.zw);
+    }

-// Immediately after albedo_ht2 and normal_rg2 get assigned:
-// albedo_ht2 = ...
-// normal_rg2 = ...
-vec4 emissive2 = vec4(0.);
-emissive2 = texture(emissive_tex, matUV2) * float(out_mat.over == emissive_id);
+// if (out_mat.blend > 0.) {
+```

-// Immediately after the calls to height_blend:
-// albedo_ht = height_blend(...
-// normal_rg = height_blend(...
-emissive = height_blend(emissive, albedo_ht.a, emissive2, albedo_ht2.a, out_mat.blend);
+At the end of that block, before the `}`, add:
+```glsl
+    vec4 emissive2 = vec4(0.);
+    emissive2 = textureGrad(emissive_tex, matUV2, dd2.xy, dd2.zw) * float(out_mat.over == emissive_id);
+    emissive = height_blend(emissive, albedo_ht.a, emissive2, albedo_ht2.a, out_mat.blend);

-// At the bottom of the function, just before `return`.
-out_mat.emissive = emissive.rgb;
+// }
 ```

-// Then at the very bottom of `fragment()`, before the final }, apply the weighting and send it to the GPU.
+At the end of the `get_material()` function, add the emissive value to the material

 ```glsl
-vec3 emissive = weight_inv * (
+// out_mat.alb_ht = albedo_ht;
+// out_mat.nrm_rg = normal_rg;
+    out_mat.emissive = emissive.rgb;
+// return;
+// }
+```
+
+At the very bottom of `fragment()`, before the final `}`, apply the weighting and send it to the GPU.
+```glsl
+vec3 emissive =
     mat[0].emissive * weights.x +
     mat[1].emissive * weights.y +
     mat[2].emissive * weights.z +
-    mat[3].emissive * weights.w );
+    mat[3].emissive * weights.w;
 EMISSION = emissive * emissive_strength;
+
+// }
+```
+
+Next, add your emissive texture to the texture sampler and adjust the values on the newly exposed uniforms.
+
+
+### Avoid sub branches
+
+Avoid placing an if statement within an if statement. Enable your FPS counter so you can test as you build your code. Some branch configurations may be free, some may be very expensive, or even more performant than you expect. Always test.
+
+Sometimes it's faster to always calculate than it is to branch.
+
+Sometimes you can do tricks like this to avoid sub branching:
+
+```glsl
+uniform bool auto_shader;
+if (height > 256) {
+    if (auto_shader) {
+        albedo = snow_color;
+    }
+}
+```
+
+```glsl
+uniform bool auto_shader;
+if (height > 256) {
+    albedo = float(!auto_shader)*albedo + float(auto_shader)*snow_color;
+}
 ```

-Note: Avoid sub branches: an if statement within an if statement, and enable your FPS counter so you can test as you build your code. Some branch configurations may be free, some may be very expensive, or even more performant than you expect.
+These two are equivalent; the second avoids the sub branch by always calculating. If `auto_shader` is true, the line evaluates as `albedo = 0.*albedo + 1.*snow_color`.
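+
+The same selection can also be written with the built-in `mix()`, shown here as a minimal sketch rather than Terrain3D's own code. `albedo`, `snow_color`, and `height` are the illustrative names from the example above, wrapped in a hypothetical helper function `apply_snow()`:
+
+```glsl
+uniform bool auto_shader;
+
+// Hypothetical helper, for illustration only.
+vec3 apply_snow(vec3 albedo, vec3 snow_color, float height) {
+    if (height > 256.0) {
+        // mix(a, b, t) returns a * (1.0 - t) + b * t,
+        // so t = 1.0 selects snow_color and t = 0.0 keeps albedo.
+        albedo = mix(albedo, snow_color, float(auto_shader));
+    }
+    return albedo;
+}
+```
+
+As with any branch change, measure it: whether this is faster depends on the GPU and the surrounding code.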