FreshRSS

Normální zobrazení

Jsou dostupné nové články, klikněte pro obnovení stránky.
PředevčíremHlavní kanál
  • ✇Recent Questions - Game Development Stack Exchange
  • Octree Query - Frustum Search and Recursive Vector InsertsSharpie
    Brief I have spent probably the last year thinking about implementing an Octree data structure into my C++ game engine project for scene management and frustum culling of lights and meshes. Right now my goal is to beat the performance of my current iterative brute force approach to frustum testing every single light and mesh in the scene. I finally decided to attack this head on and have over the past week implemented a templated Octree class which allows me to store data within my octree such a
     

Octree Query - Frustum Search and Recursive Vector Inserts

Brief

I have spent probably the last year thinking about implementing an Octree data structure into my C++ game engine project for scene management and frustum culling of lights and meshes. Right now my goal is to beat the performance of my current iterative brute force approach to frustum testing every single light and mesh in the scene.

I finally decided to attack this head on and have over the past week implemented a templated Octree class which allows me to store data within my octree such as UUID (uint32_t in my case). I also plan to be able to repurpose this structure for other features in the game engine, but for now, frustum culling is my primary goal for this system.

Now down to brass tacks, I have a performance issue with std::vector::insert() and the recursive nature of my current design.

Structure

  1. Octree<typename DataType>, this is the base class which manages all API calls from the user such as insert, remove, update, query (AABB, Sphere, or Frustum), etc. When I create the Octree, the constructor takes an OctreeConfig struct which holds basic information on what properties the Octree should take, e.g., MinNodeSize, PreferredMaxDataSourcesPerNode, etc.
  2. OctreeDataSource<typename DataType>, this is a simple struct that holds an AABB bounding box that represents the data in 3D space, and the value of the DataType, e.g., a UUID. I plan to also extend this so I can have bounding spheres or points for the data types aswell.
  3. OctreeNode<typename DataType>, this is a private struct within the Octree class, as I do not want the user to access the nodes directly; however, each node has a std::array<OctreeNode<DataType>, 8> for its children, and it also holds a std::vector<std::shared_ptr<OctreeDataSource<DataType>>> which holds a vector of smart pointers to the data source.

Problem

My current issue is the performance impact of std::vector::insert() that is called recursively through the OctreeNode's when I call my Octree::Query(CameraFrustum) method.

As seen above in my structure, each OctreeNode holds an std::vector of data sources and when I query the Octree, it range inserts all of these vectors into a single pre-allocated vector that is passed down the Octree by reference.

When I query the Octree, it takes the following basic steps:

Query Method

  1. Octree::Query
    1. Create a static std::vector and ensure that on creation it has reserved space for the query (currently I am just hard coding this to 1024 as this sufficiently holds all the mesh objects in my current octree test scene, so there are no reallocations when performing an std::vector range insert).
    2. Clear the static vector.
    3. Call OctreeNode::Query and pass the vector as reference.
  2. OctreeNode::Query
    1. Check Count of data sources in current node and children, if we have no data sources in this node and it's children, we return - simples :)
    2. Conduct a frustum check on the current node AABB bounds. Result is either Contains, Intersects, or DoesNotContain.
      • Contains: (PERFORMANCE IMPACT HERE) If the current node is fully contained within the frustum, we will simply include all DataSources into the query from the current and all child nodes recursively. We call OctreeNode::GatherAllDataSources, and pass the static vector created in Octree::Query() by reference.
      • Intersects: We individually frustum check each OctreeDataSource::AABB within this node's data source vector, then we recursively call OctreeNode::Query on each of the children to perform this function recursively.

OctreeNode::GatherAllDataSources (the problem child)

I have used profiling macros to measure the accumulated amount of time this function takes each frame. If I call Query once in my main engine game loop, the GatherAllDataSources() takes roughly 60% if not more of the entire Query method time.

Octree Profile

You can also see from these profile results that the Octree Query is taking double the time as "Forward Plus - Frustum Culling (MESHES)" which is the brute force approach to frustum checking every mesh within the scene (the scene has 948 meshes with AABBs).

I've narrowed the issue down to the line of code with the comment below:

void GatherAllDataSources(std::vector<OctreeData>& out_data) {
    
    L_PROFILE_SCOPE_ACCUMULATIVE_TIMER("Octree Query - GatherAllDataSources"); // Accumulates a profile timer results each time this method is called. Profiler starts time on construction and stops timer and accumulates result within a ProfilerResults class.
    if (Count() == 0) {
        CheckShouldDeleteNode();
        return;
    }

    if (!m_DataSources.empty()) {
        // This is the line of code which is taking most of the queries search time
        // As you can see below aswell, the time complexity increases because 
        // I am calling this function recursively for all children, practically, 
        // gathering all data sources within this node and all children
        out_data.insert(out_data.end(), m_DataSources.begin(), m_DataSources.end()); 
    }               

    if (!IsNodeSplit()) 
        return;
        
    // Recursively gather data from child nodes
    for (const auto& child : m_ChildrenNodes) {
        if (child) {
            child->GatherAllDataSources(out_data); // Pass the same vector to avoid memory allocations
        }
    }       
}

Question Time

How can I significantly improve the efficiency of Gathering data sources recursively from my child nodes?

I am open to entirely changing the approach of how data sources are stored within the Octree, and how the overall structure of the Octree is designed, but this is where I get stuck.

I'm very inexperienced when it comes to algorithm optimisation or C++ optimisation, and as this is a new algorithm I have attempted to implement, I'm finding it very difficult to find a solution to this problem.

Any tips/tricks are welcome!

You can find the full version of my current Octree implementation code here (please note I am not finished yet with other functionality, and I will probably be back if I can't find solutions for Insert and Remove optimisation!).

Here are some resources I have reviewed:

If you're also interested in the rest of my code base it can be found on GitHub through this link. I mostly operate in the Development branch. These changes haven't been pushed yet, but I've faced a lot of challenges during this project's journey so if you have any further insights to my code or have any questions about how I've implemented different features, please give me a shout!

❌
❌