FreshRSS

Normální zobrazení

Jsou dostupné nové články, klikněte pro obnovení stránky.
PředevčíremHlavní kanál
  • ✇Recent Questions - Game Development Stack Exchange
  • Octree Query - Frustum Search and Recursive Vector InsertsSharpie
    Brief I have spent probably the last year thinking about implementing an Octree data structure into my C++ game engine project for scene management and frustum culling of lights and meshes. Right now my goal is to beat the performance of my current iterative brute force approach to frustum testing every single light and mesh in the scene. I finally decided to attack this head on and have over the past week implemented a templated Octree class which allows me to store data within my octree such a
     

Octree Query - Frustum Search and Recursive Vector Inserts

Brief

I have spent probably the last year thinking about implementing an Octree data structure into my C++ game engine project for scene management and frustum culling of lights and meshes. Right now my goal is to beat the performance of my current iterative brute force approach to frustum testing every single light and mesh in the scene.

I finally decided to attack this head on and have over the past week implemented a templated Octree class which allows me to store data within my octree such as UUID (uint32_t in my case). I also plan to be able to repurpose this structure for other features in the game engine, but for now, frustum culling is my primary goal for this system.

Now down to brass tacks, I have a performance issue with std::vector::insert() and the recursive nature of my current design.

Structure

  1. Octree<typename DataType>, this is the base class which manages all API calls from the user such as insert, remove, update, query (AABB, Sphere, or Frustum), etc. When I create the Octree, the constructor takes an OctreeConfig struct which holds basic information on what properties the Octree should take, e.g., MinNodeSize, PreferredMaxDataSourcesPerNode, etc.
  2. OctreeDataSource<typename DataType>, this is a simple struct that holds an AABB bounding box that represents the data in 3D space, and the value of the DataType, e.g., a UUID. I plan to also extend this so I can have bounding spheres or points for the data types aswell.
  3. OctreeNode<typename DataType>, this is a private struct within the Octree class, as I do not want the user to access the nodes directly; however, each node has a std::array<OctreeNode<DataType>, 8> for its children, and it also holds a std::vector<std::shared_ptr<OctreeDataSource<DataType>>> which holds a vector of smart pointers to the data source.

Problem

My current issue is the performance impact of std::vector::insert() that is called recursively through the OctreeNode's when I call my Octree::Query(CameraFrustum) method.

As seen above in my structure, each OctreeNode holds an std::vector of data sources and when I query the Octree, it range inserts all of these vectors into a single pre-allocated vector that is passed down the Octree by reference.

When I query the Octree, it takes the following basic steps:

Query Method

  1. Octree::Query
    1. Create a static std::vector and ensure that on creation it has reserved space for the query (currently I am just hard coding this to 1024 as this sufficiently holds all the mesh objects in my current octree test scene, so there are no reallocations when performing an std::vector range insert).
    2. Clear the static vector.
    3. Call OctreeNode::Query and pass the vector as reference.
  2. OctreeNode::Query
    1. Check Count of data sources in current node and children, if we have no data sources in this node and it's children, we return - simples :)
    2. Conduct a frustum check on the current node AABB bounds. Result is either Contains, Intersects, or DoesNotContain.
      • Contains: (PERFORMANCE IMPACT HERE) If the current node is fully contained within the frustum, we will simply include all DataSources into the query from the current and all child nodes recursively. We call OctreeNode::GatherAllDataSources, and pass the static vector created in Octree::Query() by reference.
      • Intersects: We individually frustum check each OctreeDataSource::AABB within this node's data source vector, then we recursively call OctreeNode::Query on each of the children to perform this function recursively.

OctreeNode::GatherAllDataSources (the problem child)

I have used profiling macros to measure the accumulated amount of time this function takes each frame. If I call Query once in my main engine game loop, the GatherAllDataSources() takes roughly 60% if not more of the entire Query method time.

Octree Profile

You can also see from these profile results that the Octree Query is taking double the time as "Forward Plus - Frustum Culling (MESHES)" which is the brute force approach to frustum checking every mesh within the scene (the scene has 948 meshes with AABBs).

I've narrowed the issue down to the line of code with the comment below:

void GatherAllDataSources(std::vector<OctreeData>& out_data) {
    
    L_PROFILE_SCOPE_ACCUMULATIVE_TIMER("Octree Query - GatherAllDataSources"); // Accumulates a profile timer results each time this method is called. Profiler starts time on construction and stops timer and accumulates result within a ProfilerResults class.
    if (Count() == 0) {
        CheckShouldDeleteNode();
        return;
    }

    if (!m_DataSources.empty()) {
        // This is the line of code which is taking most of the queries search time
        // As you can see below aswell, the time complexity increases because 
        // I am calling this function recursively for all children, practically, 
        // gathering all data sources within this node and all children
        out_data.insert(out_data.end(), m_DataSources.begin(), m_DataSources.end()); 
    }               

    if (!IsNodeSplit()) 
        return;
        
    // Recursively gather data from child nodes
    for (const auto& child : m_ChildrenNodes) {
        if (child) {
            child->GatherAllDataSources(out_data); // Pass the same vector to avoid memory allocations
        }
    }       
}

Question Time

How can I significantly improve the efficiency of Gathering data sources recursively from my child nodes?

I am open to entirely changing the approach of how data sources are stored within the Octree, and how the overall structure of the Octree is designed, but this is where I get stuck.

I'm very inexperienced when it comes to algorithm optimisation or C++ optimisation, and as this is a new algorithm I have attempted to implement, I'm finding it very difficult to find a solution to this problem.

Any tips/tricks are welcome!

You can find the full version of my current Octree implementation code here (please note I am not finished yet with other functionality, and I will probably be back if I can't find solutions for Insert and Remove optimisation!).

Here are some resources I have reviewed:

If you're also interested in the rest of my code base it can be found on GitHub through this link. I mostly operate in the Development branch. These changes haven't been pushed yet, but I've faced a lot of challenges during this project's journey so if you have any further insights to my code or have any questions about how I've implemented different features, please give me a shout!

  • ✇Recent Questions - Game Development Stack Exchange
  • Adjusting texture generation inside a quadtreez3nth10n
    Well, I'm trying to adjust the position of generated textures using this approach. I replicated this in Unity3D for a complete texture of 4096x4096: The point is that I need to generate the same texture on a quadtree using different level of details / zoom per each node. I have almost the code finished, but I cannot calibrate textures inside the grid/nodes: With grid: The point is that I'm unsure at which point of the code execution, I have two main parts that I doubt about. The first one is
     

Adjusting texture generation inside a quadtree

Well, I'm trying to adjust the position of generated textures using this approach.

I replicated this in Unity3D for a complete texture of 4096x4096:

texture generated

The point is that I need to generate the same texture on a quadtree using different level of details / zoom per each node.

I have almost the code finished, but I cannot calibrate textures inside the grid/nodes:

enter image description here

With grid:

biomes with grid

The point is that I'm unsure at which point of the code execution, I have two main parts that I doubt about.

The first one is on the OnGUI() call, read comments FMI.

private void OnGUI()
{
    // Skipped logic
    var totalTimeExecution = DrawBiomeTextureLogic();
    // Skipped logic
}

private long DrawBiomeTextureLogic()
{
    var octreeKey = world.worldSettings.TargetOctreeKey;
    var regions = world.worldState.ChunkData.Get_Regions();

    if (!regions.ContainsKey(octreeKey))
        return -1;

    var region = regions[octreeKey];

    // will draw the smaller ones first (Depth = 0, 2^0=1 * textureSize)
    // but we draw everything as 16x16 because we are doing a level of detail with textures
    var map = region.NodeMap.AsEnumerable().OrderByDescending(x => x.Value.Depth);

    foreach (var (key, octree) in map)
    {
        // fields from the octree node
        var position = octree.Position;
        var depth = octree.Depth;
        var mult = 1 << depth;

        // first intPosition, needs to refactgor
        var size = world.worldSettings.ChunkSize * mult;
        var intPosition = new Vector2Int(position.x - size / 2, position.y - size / 2);

        DrawTexture(octree);
    }

    return totalTimeExecution;
}

private void DrawTexture(OctreeNode octree)
{
    // fields from the octree node
    // note: I say quadtree on the question, but because I'm skipping z axis values with a HashSet, 
    // I will create another question to create another algorithm for it
    var position = octree.Position;
    var key = octree.LocCode;
    var depth = octree.Depth;
    var mult = 1 << depth;

    var size = world.worldSettings.ChunkSize * mult;

    // use Y position because we need a bird view of the map (edit: but i need)
    // the problem could be here
    var intPosition = new Vector2Int(position.x - size / 2, position.y - size / 2);
    var targetPosition = new long2(intPosition.x, intPosition.y);

    // or maybe, but this is for adjusting inside scroll view
    var texRect = new Rect(intPosition.x, -intPosition.y - MAP_SIZE_HEIGHT, size, size);

    // pass targetPosition into second stage
    TextureSettings.InitializeTextureParams(targetPosition, depth, key);

    // Get the texture from the biomeTextureController
    var texture = biomeTextureController.GetTexture(TextureSettings, out var textureTask);

    if (texture != null)
    {
        Textures.TryAdd(new Vector2Int(intPosition.x, -intPosition.y), (texture, size));
    }

    var color = texture != null ? Color.green : Color.red;

    if (texture != null)
    {
        GUI.DrawTexture(texRect, texture);

        // not relevant
        UIUtils.DrawOutlinedRect(texRect, color, 1);
    }
}

The second stage mentioned in code (line 59), is the Job (Unity Jobs) that I use, I did it parallel with bounds checking for index and the mentioned document from Cuberite above, the call is explained here:

[BurstCompile(OptimizeFor = OptimizeFor.Performance)]
public struct BiomeTextureParallelJob : IJobParallelFor, IDisposable
{
    // colors of the texture
    public NativeArray<Color32> data;

    // used to get Biome color depending on its index (enumeration)
    [ReadOnly]
    public NativeParallelHashMap<short, Color> colorsCollection;

    public BiomeGenenerationSettings settings;
    public BiomeTextureJobSettings jobSettings;

    public long2 TargetPosition => jobSettings.TargetPosition;
    public int Depth => jobSettings.Depth;
    public int TextureSize => jobSettings.textureSize;

    public void Execute(int index)
    {
        OnStart();

        To2D(index, TextureSize, out var x, out var y);

        if (showPercentage)
            counter[threadIndex]++;

        // same octree fields calculated again
        var d = 1 << (Depth + 4);
        var p = d / TextureSize; // example: 64 / 16 = 4
        var s = TextureSize * (1 << Depth); // textureSize multiplied by its 2^n node size

        // maybe the problem start here
        // this is from the first stage
        var tx = TargetPosition.x;
        var ty = TargetPosition.y;

        // I tried some operations, but I'm unable to adjust it
        var xx = x * p; // - p / 2; // * (tx < 0 ? -1 : 1); // + p / 2;
        var yy = y * p; // - p / 2; // * (ty < 0 ? -1 : 1); // + p / 2;

        long wx = tx + xx - s / 2; // * (tx < 0 ? -1 : 1);
        long wy = ty + yy - s / 2; // * (ty < 0 ? -1 : 1);


        if (!IsInBounds(index, data.Length))
            return;

        data[index] =
            BiomeTextureUtils.CalcPixelColor(settings, colorsCollection, wx, wy, 1,
            false, AlphaChannel);
    }

    [BurstDiscard]
    public void Dispose()
    {
        try
        {
            if (data.IsCreated)
                data.Dispose();
        }
        catch { }
    }
}

As you can see the problem could be found in two places. It could be some simple, as you can see on UI, more zoomed:

UI view

I'd like to share a demo with the minimum code, I'll prepare one if needed.

Note: I'd like to add a spoiler to hide such big images. I'm sorry.

  • ✇Recent Questions - Game Development Stack Exchange
  • Octree as struct not classL Chougrani
    I was wondering (as I could not find any implementation online), if anyone here would have an idea of how to create an octree as a struct and not a class. The problem is that Octrees are usually recursive, and structs cannot have auto-inheritance. So, the code: public struct MyOctree { public MyNode RootNode; } public struct MyNode { public MyNode[] SubNodes; } would cause a cyclic reference error (nodes contain nodes). I'm trying to implement such a structure for GPU usage, and any Nu
     

Octree as struct not class

I was wondering (as I could not find any implementation online), if anyone here would have an idea of how to create an octree as a struct and not a class.

The problem is that Octrees are usually recursive, and structs cannot have auto-inheritance.

So, the code:

public struct MyOctree
{
    public MyNode RootNode;
}

public struct MyNode
{
    public MyNode[] SubNodes;
}

would cause a cyclic reference error (nodes contain nodes).

I'm trying to implement such a structure for GPU usage, and any Nullable data (e.g. a class) throw a not-supported exception on the GPU initializer.

Thanks.

❌
❌