From Recent Questions - Game Development Stack Exchange, by Sharpie

Octree Query - Frustum Search and Recursive Vector Inserts

Brief

I have spent probably the last year thinking about implementing an Octree data structure into my C++ game engine project for scene management and frustum culling of lights and meshes. Right now my goal is to beat the performance of my current iterative brute force approach to frustum testing every single light and mesh in the scene.

I finally decided to attack this head on and have over the past week implemented a templated Octree class which allows me to store data within my octree such as UUID (uint32_t in my case). I also plan to be able to repurpose this structure for other features in the game engine, but for now, frustum culling is my primary goal for this system.

Now down to brass tacks, I have a performance issue with std::vector::insert() and the recursive nature of my current design.

Structure

  1. Octree<typename DataType>, this is the base class which manages all API calls from the user such as insert, remove, update, query (AABB, Sphere, or Frustum), etc. When I create the Octree, the constructor takes an OctreeConfig struct which holds basic information on what properties the Octree should take, e.g., MinNodeSize, PreferredMaxDataSourcesPerNode, etc.
  2. OctreeDataSource<typename DataType>, this is a simple struct that holds an AABB bounding box representing the data in 3D space, and the value of the DataType, e.g., a UUID. I plan to also extend this so I can have bounding spheres or points for the data types as well.
  3. OctreeNode<typename DataType>, this is a private struct within the Octree class, as I do not want the user to access the nodes directly; however, each node has a std::array<OctreeNode<DataType>, 8> for its children, and it also holds a std::vector<std::shared_ptr<OctreeDataSource<DataType>>> which holds a vector of smart pointers to the data source.
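Under the description above, the three pieces might look roughly like the sketch below. The AABB layout and the use of unique_ptr for the children (needed to break the recursive type) are my assumptions; the real classes carry more state than this.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Minimal sketch of the three structures described above.
struct AABB { float min[3]; float max[3]; };

struct OctreeConfig {
    float MinNodeSize = 1.0f;
    std::size_t PreferredMaxDataSourcesPerNode = 8;
};

template <typename DataType>
struct OctreeDataSource {
    AABB Bounds;    // where the data sits in 3-D space
    DataType Data;  // e.g. a uint32_t UUID
};

template <typename DataType>
class Octree {
public:
    explicit Octree(const OctreeConfig& config) : m_Config(config) {}

private:
    // Private on purpose: users never touch nodes directly.
    struct OctreeNode {
        AABB Bounds;
        std::vector<std::shared_ptr<OctreeDataSource<DataType>>> m_DataSources;
        std::array<std::unique_ptr<OctreeNode>, 8> m_ChildrenNodes; // null until split
    };

    OctreeConfig m_Config;
    OctreeNode m_Root{};
};
```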

Problem

My current issue is the performance impact of std::vector::insert(), which is called recursively through the OctreeNodes when I call my Octree::Query(CameraFrustum) method.

As seen above in my structure, each OctreeNode holds an std::vector of data sources, and when I query the Octree, it range-inserts all of these vectors into a single pre-allocated vector that is passed down the Octree by reference.

When I query the Octree, it takes the following basic steps:

Query Method

  1. Octree::Query
    1. Create a static std::vector and ensure that on creation it has reserved space for the query (currently I am just hard coding this to 1024 as this sufficiently holds all the mesh objects in my current octree test scene, so there are no reallocations when performing an std::vector range insert).
    2. Clear the static vector.
    3. Call OctreeNode::Query and pass the vector as reference.
  2. OctreeNode::Query
    1. Check the count of data sources in the current node and its children; if we have no data sources in this node and its children, we return - simples :)
    2. Conduct a frustum check on the current node's AABB bounds. The result is either Contains, Intersects, or DoesNotContain.
      • Contains: (PERFORMANCE IMPACT HERE) If the current node is fully contained within the frustum, we simply include all DataSources from the current node and all child nodes recursively. We call OctreeNode::GatherAllDataSources and pass the static vector created in Octree::Query() by reference.
      • Intersects: We individually frustum check each OctreeDataSource::AABB within this node's data source vector, then we recursively call OctreeNode::Query on each of the children.
      • DoesNotContain: The node and its entire subtree lie outside the frustum, so we simply return.
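The steps above can be sketched as a runnable 1-D miniature, with intervals standing in for AABBs and the camera frustum, and ints for UUIDs. All names here are illustrative, not the question's real types.

```cpp
#include <memory>
#include <vector>

enum class FrustumResult { Contains, Intersects, DoesNotContain };

struct Interval { float lo, hi; };  // stand-in for an AABB / frustum

FrustumResult Test(const Interval& view, const Interval& box) {
    if (box.lo >= view.lo && box.hi <= view.hi) return FrustumResult::Contains;
    if (box.hi <= view.lo || box.lo >= view.hi) return FrustumResult::DoesNotContain;
    return FrustumResult::Intersects;
}

struct DataSource { Interval bounds; int id; };

struct Node {
    Interval bounds;
    std::vector<DataSource> sources;
    std::vector<std::unique_ptr<Node>> children;

    void GatherAll(std::vector<int>& out) const {   // the "Contains" path
        for (const auto& s : sources) out.push_back(s.id);
        for (const auto& c : children) c->GatherAll(out);
    }

    void Query(const Interval& view, std::vector<int>& out) const {
        switch (Test(view, bounds)) {               // step 2: test node bounds
        case FrustumResult::Contains:
            GatherAll(out);                         // everything below is visible
            break;
        case FrustumResult::Intersects:
            for (const auto& s : sources)           // test each source on its own
                if (Test(view, s.bounds) != FrustumResult::DoesNotContain)
                    out.push_back(s.id);
            for (const auto& c : children)          // then recurse into children
                c->Query(view, out);
            break;
        case FrustumResult::DoesNotContain:
            break;                                  // prune the whole subtree
        }
    }
};

// Step 1: a static, pre-reserved vector reused on every call.
const std::vector<int>& Query(const Node& root, const Interval& view) {
    static std::vector<int> results;
    results.reserve(1024);  // hard-coded capacity, as in the question
    results.clear();        // keeps the capacity, drops old contents
    root.Query(view, results);
    return results;
}
```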

OctreeNode::GatherAllDataSources (the problem child)

I have used profiling macros to measure the accumulated amount of time this function takes each frame. If I call Query once in my main engine game loop, GatherAllDataSources() takes roughly 60%, if not more, of the entire Query method's time.

Octree Profile

You can also see from these profile results that the Octree Query takes double the time of "Forward Plus - Frustum Culling (MESHES)", which is the brute-force approach of frustum checking every mesh within the scene (the scene has 948 meshes with AABBs).

I've narrowed the issue down to the line of code with the comment below:

void GatherAllDataSources(std::vector<OctreeData>& out_data) {
    
    L_PROFILE_SCOPE_ACCUMULATIVE_TIMER("Octree Query - GatherAllDataSources"); // Accumulates a profile timer results each time this method is called. Profiler starts time on construction and stops timer and accumulates result within a ProfilerResults class.
    if (Count() == 0) {
        CheckShouldDeleteNode();
        return;
    }

    if (!m_DataSources.empty()) {
    // This is the line of code which takes most of the query's search time.
    // As you can see below as well, the cost grows because I am calling
    // this function recursively for all children, effectively gathering
    // all data sources within this node and all of its children.
        out_data.insert(out_data.end(), m_DataSources.begin(), m_DataSources.end()); 
    }               

    if (!IsNodeSplit()) 
        return;
        
    // Recursively gather data from child nodes
    for (const auto& child : m_ChildrenNodes) {
        if (child) {
            child->GatherAllDataSources(out_data); // Pass the same vector to avoid memory allocations
        }
    }       
}

Question Time

How can I significantly improve the efficiency of Gathering data sources recursively from my child nodes?

I am open to entirely changing the approach of how data sources are stored within the Octree, and how the overall structure of the Octree is designed, but this is where I get stuck.

I'm very inexperienced when it comes to algorithm optimisation or C++ optimisation, and as this is a new algorithm I have attempted to implement, I'm finding it very difficult to find a solution to this problem.

Any tips/tricks are welcome!

You can find the full version of my current Octree implementation code here (please note I am not finished yet with other functionality, and I will probably be back if I can't find solutions for Insert and Remove optimisation!).

Here are some resources I have reviewed:

If you're also interested in the rest of my code base, it can be found on GitHub through this link. I mostly operate in the Development branch. These changes haven't been pushed yet, but I've faced a lot of challenges during this project's journey, so if you have any further insights into my code or have any questions about how I've implemented different features, please give me a shout!

From IEEE Spectrum, by Willie D. Jones

Nasir Ahmed: An Unsung Hero of Digital Media

19 August 2024, 14:00


Stop for a second and think about the Internet without digital images or video. There would be no faces on Facebook. Instagram and TikTok probably wouldn’t exist. Those Zoom meetings that took the place of in-person gatherings for school or work during the height of the COVID-19 pandemic? Not an option.

Digital audio’s place in our Internet-connected world is just as important as still images and video. It has changed the music business—from production to distribution to the way fans buy, collect, and store their favorite songs.

What do those millions of profiles on LinkedIn, dating apps, and social media platforms (and the inexhaustible selection of music available for download online) have in common? They rely on a compression algorithm called the discrete cosine transform, or DCT, which played a major role in allowing digital files to be transmitted across computer networks.

“DCT has been one of the key components of many past image- and video-coding algorithms for more than three decades,” says Touradj Ebrahimi, a professor at Ecole Polytechnique Fédérale de Lausanne, in Switzerland, who currently serves as chairman of the JPEG standardization committee. “Only a few image-compression standards not using DCT exist today,” he adds.

The Internet applications people use every day but largely take for granted were made possible by scientists and engineers who, for the most part, toiled in anonymity. One such “hidden figure” is Nasir Ahmed, the Indian-American engineer who figured out an elegant way to cut down the size of digital image files without sacrificing their most critical visual details.

Ahmed published his seminal paper about the discrete cosine transform compression algorithm he invented in 1974, a time when the fledgling Internet was exclusively dial-up and text-based. There were no pictures accompanying the words, nor could there have been, because Internet data was transmitted over standard copper telephone landlines, which was a major limitation on speed and bandwidth.

“Only a few image-compression standards not using DCT exist today.” –Touradj Ebrahimi, EPFL

These days, with the benefit of superfast chips and optical-fiber networks, data download speeds for a laptop with a fiber connection reach 1 gigabit per second. So, a music lover can download a 4-minute song to their laptop (or more likely a smartphone) in a second or two. In the dial-up era, when Internet users’ download speeds topped out at 56 kilobits per second (and were usually only half that fast), pulling down the same song from a server would have taken nearly all day. Getting a picture to appear on a computer’s screen was a process akin to watching grass grow.

Ahmed was convinced there had to be a way to cut down the size of digital files and speed up the process. He set off on a quest to represent with ones and zeros what is critical to an image being legible, while tossing aside the bits that are less important. The answer, which built on the earlier work of mathematician and information-theory pioneer Claude Shannon, took a while to come into focus. But because of Ahmed’s determination and unwavering belief in the value of what he was doing, he persevered even after others told him that it was not worth the effort.

Raised to Love Technology

It seemed almost preordained that Ahmed would have a career in one of the STEM fields. Nasir, who was born in Bengaluru, India, in 1940, was raised by his maternal grandparents. Ahmed’s grandfather was an electrical engineer who told him that he had been sent to the United States in 1919 to work at General Electric‘s location in Schenectady, N.Y. He shared tales of his time in the United States with his grandson and encouraged young Nasir to emigrate there. In 1961, after earning a bachelor’s degree in electrical engineering at the University of Visvesvaraya College of Engineering, in Bengaluru, Ahmed did just that, leaving India that fall for graduate school at the University of New Mexico, in Albuquerque. Ahmed earned a master’s degree and a Ph.D. in electrical engineering in 1963 and 1966, respectively.

During his first year in Albuquerque, he met Esther Parente, a graduate student from Argentina. They soon became inseparable and were married while he was working toward his doctorate. Sixty years later, they are still together.

The Seed of an Idea

In 1966, Ahmed, fresh out of grad school with his Ph.D., was hired as a principal research engineer at Honeywell’s newly created computer division. While there, Ahmed was first exposed to Walsh functions, a technique for analyzing digital representations of analog signals. The fast algorithms that could be created based on Walsh functions had many potential applications. Ahmed focused on using these signal-processing and analysis techniques to reduce the file size of a digital image without losing too much of the visual detail in the uncompressed version.

That research focus remained his primary interest when he returned to academia, taking a job as a professor in the electrical and computer engineering department at Kansas State University, in 1968.

Ahmed, like dozens of other researchers around the globe, was obsessed with finding the answer to a single question: How do you create a mathematical formula for deciphering which of the ones and zeros that represent a digital image need to be kept and which can be thrown away? The things he’d learned at Honeywell gave him a framework for understanding the elements of the problem and how to attack it. But the majority of the credit for the eventual breakthrough has to go to Ahmed’s steely determination and willingness to take a gamble on himself.

In 1972, he sought grant funding that would let him afford to spend the months between Kansas State’s spring and fall semesters furthering his ideas. He applied for a U.S. National Science Foundation grant, but was denied. Ahmed recalls the moment: “I had a strong intuition that I could find an efficient way to compress digital signal data. But to my surprise, the reviewers said the idea was too simple, so they rejected the proposal.”

Undaunted, Ahmed and his wife worked to make the salary he earned during the nine-month school year last through the summer so he could focus on his research. Money was tight, the couple recalls, but that moment of financial belt-tightening only seemed to heighten Ahmed’s industriousness. They persevered, and Ahmed’s long days and late nights in the lab eventually yielded the desired result.

DCT Compression Comes Together

Ahmed took a technique for turning the array of image-processing data representing an image's pixels into a waveform, effectively rendering it as a series of waves with oscillating frequencies, and combined it with cosine functions that were already being used to model phenomena such as light waves, sound waves, and electric current. The result was a long string of numbers with values bounded by 1 and –1. Ahmed realized that by quantizing this string of values and performing a Fourier transformation to break the function into its constituent frequencies, each pixel's data could be represented in a way that was helpful for deciding what data points must be kept and what could be omitted.

Ahmed observed that the lower-frequency waves corresponded to the necessary or "high information" regions of the image, while the higher-frequency waves represented the bits that were less important and could therefore be approximated. The compressed-image files he and his team produced were one-tenth the size of the originals. What's more, the process could be reversed, and a shrunken data file would yield an image that was sufficiently similar to the original.
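The transform at the heart of this can be made concrete with a minimal 1-D DCT-II sketch. This is illustrative only, not Ahmed's original code: JPEG applies a 2-D version to 8x8 pixel blocks, and the function name and structure here are mine. Each output coefficient X[k] measures how strongly the cosine wave of frequency k is present in the signal; compression keeps the low-k coefficients and quantizes the high-k ones away.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Unnormalized 1-D DCT-II of a signal x of length N.
std::vector<double> dct2(const std::vector<double>& x) {
    const std::size_t N = x.size();
    const double pi = std::acos(-1.0);
    std::vector<double> X(N, 0.0);
    for (std::size_t k = 0; k < N; ++k)
        for (std::size_t n = 0; n < N; ++n)
            X[k] += x[n] * std::cos(pi / N * (n + 0.5) * k);
    return X;
}
```

A flat signal, for instance, concentrates all its energy in X[0]; the higher coefficients come out zero, which is exactly what makes smooth image regions so compressible.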

After another two years of laborious testing, with him and his two collaborators running computer programs written on decks of punch cards, the trio published a paper in IEEE Transactions on Computers titled "Discrete Cosine Transform" in January 1974. Though the paper's publication did not make it immediately clear, the worldwide search for a reliable method of doing the lossy compression that Claude Shannon had postulated in the 1940s was over.

JPEGs, MPEGs, and More

It wasn’t until 1983 that the International Organization for Standardization (ISO) began working on the technology that would allow photo-quality images to accompany text on the screens of computer terminals. To that end, ISO established the Joint Photographic Experts Group, better known by the ubiquitous acronym JPEG. By the time the first JPEG standard was published in 1992, DCT and advances made by a cadre of other researchers had come to be recognized by the group as basic elements of their method for the digital compression and coding of still images. “This is the beauty of standardization, where several dozen bright minds are behind the success of advances such as JPEG,” says Ebrahimi.

And because video can be described as a succession of still images, Ahmed's technique was also well suited to making video files smaller. DCT was the compression technique of choice when ISO and the International Electrotechnical Commission (IEC) established the Moving Picture Experts Group, or MPEG, for the compression and coding of audio, video, graphics, and genomic data in 1988. When the first MPEG standard was published in 1993, the World Wide Web that now includes Google Maps, dating apps, and e-commerce businesses was just four years old.

The ramping up of computer speeds and network bandwidth during that decade—along with the ability to transmit pictures and video via much smaller files—quickly transformed the Internet before anyone knew that Amazon would eventually let readers judge millions of books by their covers.

Having solved the problem that had monopolized his time and attention for several years, Ahmed resumed his career in academia. In 1993, the year the first MPEG standard went on the books, Ahmed left Kansas State and returned to the University of New Mexico. There he was a presidential professor of electrical and computer engineering until 1989, when he was promoted to chair of the ECE department. Five years after that, he became dean of UNM's school of engineering. Ahmed held that post for two years until he was named associate provost for research and dean of graduate studies. He stayed in that job until he retired from the university in 2001 and was named professor emeritus.

From Recent Questions - Game Development Stack Exchange, by user41258

How to implement D&D 4e's line of sight algorithm?

D&D 4th Edition (the tabletop game) has combat on a 2D map with square tiles. A creature occupies an entire single tile.

The attacker has clear sight on the defender if lines can be drawn from one corner of the attacker's square to all four corners of the defender's square and none of these lines are blocked.

The rules are as follows:

To determine if a target has cover, choose a corner of your square and trace imaginary lines from that corner to every corner of the target's square. If one or two of those lines are blocked by an obstacle, the target has cover. (A line isn’t blocked if it runs along the edge of an obstacle’s or an enemy’s square.) If three or four of those lines are blocked but you have line of effect, the target has superior cover.

So, in the following situation:

Map of a D&D situation

  • A can fully see B, but C has superior cover from A (the unblocked line is from the top-right corner of A to the top-right corner of C), and A cannot see D at all.
  • B can fully see A, C and D.

How can I implement this?

Over the years, I have tried several solutions: some forms of Bresenham's line, testing for walls pixel by pixel, giving some tolerance around corners, and even dividing the map into line segments and comparing rays from the attacker to these line segments using a line-intersection formula. But everything either wasn't sufficiently rules-accurate or was too computationally expensive.

Can this line-of-sight algorithm be implemented efficiently (enough so that hundreds of checks may be performed for maps of 100x100 tiles per second) and accurately, and if so, how?
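One way to implement the rule as written, sketched here under stated assumptions rather than as a definitive answer: treat each tile as a unit square on the grid, and count a sight line as blocked only if it passes through the open interior of an obstacle cell, so that edge-grazing lines stay clear exactly as the rules require. The function names and the obstacle representation are mine.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Does segment (x0,y0)-(x1,y1) pass through the open interior of the
// unit grid cell whose lower-left corner is (cx, cy)? Lines that only
// touch a cell's edge or corner do not count, matching the rule that a
// line running along the edge of an obstacle's square is not blocked.
bool crossesCellInterior(double x0, double y0, double x1, double y1,
                         int cx, int cy) {
    double tmin = 0.0, tmax = 1.0;            // parametric clip interval
    const double p[2]  = {x0, y0};
    const double d[2]  = {x1 - x0, y1 - y0};
    const double lo[2] = {double(cx), double(cy)};
    const double hi[2] = {cx + 1.0, cy + 1.0};
    for (int i = 0; i < 2; ++i) {
        if (d[i] == 0.0) {
            // Parallel to this slab: blocked only if strictly inside it.
            if (p[i] <= lo[i] || p[i] >= hi[i]) return false;
        } else {
            double t0 = (lo[i] - p[i]) / d[i];
            double t1 = (hi[i] - p[i]) / d[i];
            if (t0 > t1) std::swap(t0, t1);
            tmin = std::max(tmin, t0);
            tmax = std::min(tmax, t1);
        }
    }
    return tmin < tmax;  // positive-length overlap => crosses the interior
}

// Count how many of the four lines from one attacker corner (ax, ay) to
// the corners of the target cell (tx, ty) are blocked by obstacle cells.
// Per the rules: 1-2 blocked => cover, 3-4 blocked => superior cover.
int blockedLines(double ax, double ay, int tx, int ty,
                 const std::vector<std::pair<int, int>>& obstacles) {
    const double corners[4][2] = {{double(tx), double(ty)},
                                  {tx + 1.0, double(ty)},
                                  {double(tx), ty + 1.0},
                                  {tx + 1.0, ty + 1.0}};
    int blocked = 0;
    for (const auto& c : corners)
        for (const auto& o : obstacles)
            if (crossesCellInterior(ax, ay, c[0], c[1], o.first, o.second)) {
                ++blocked;
                break;
            }
    return blocked;
}
```

Since the rules let the attacker pick the best corner, a full check would take the minimum of blockedLines over the attacker's four corners, and would only test obstacle cells near the line (e.g. via a grid walk) rather than all of them, which keeps hundreds of checks per second on a 100x100 map well within budget.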

How to decompose sprite sheet

I have a lot of poorly formatted spritesheets that I want to decompose, i.e., split into many small images, one for each sprite. If I can do that, I can use my custom texture packer tool to build my game assets.

My development tools are XNA and C#, targeting Windows. How can I decompose the images?
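The asker's stack is XNA/C#, but the standard technique is language-neutral and easy to port: flood-fill each connected group of non-transparent pixels and emit its bounding box as one sprite. A hedged sketch, in C++ to match the other code in this digest; findSprites and the plain 0/1 alpha mask are my own inventions (in XNA the mask could be built from Texture2D.GetData alpha values).

```cpp
#include <algorithm>
#include <utility>
#include <vector>

struct Rect { int x0, y0, x1, y1; };  // inclusive pixel bounds of one sprite

// Flood-fill each 4-connected group of opaque pixels and record its
// bounding box. `alpha` is a mask: 1 = opaque, 0 = transparent.
std::vector<Rect> findSprites(const std::vector<std::vector<int>>& alpha) {
    const int h = int(alpha.size());
    const int w = int(alpha[0].size());
    std::vector<std::vector<bool>> seen(h, std::vector<bool>(w, false));
    std::vector<Rect> rects;
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            if (alpha[y][x] == 0 || seen[y][x]) continue;
            Rect r{x, y, x, y};
            std::vector<std::pair<int, int>> stack{{x, y}};
            seen[y][x] = true;
            while (!stack.empty()) {
                auto [cx, cy] = stack.back();
                stack.pop_back();
                r.x0 = std::min(r.x0, cx); r.x1 = std::max(r.x1, cx);
                r.y0 = std::min(r.y0, cy); r.y1 = std::max(r.y1, cy);
                const int dx[4] = {1, -1, 0, 0}, dy[4] = {0, 0, 1, -1};
                for (int i = 0; i < 4; ++i) {
                    int nx = cx + dx[i], ny = cy + dy[i];
                    if (nx >= 0 && nx < w && ny >= 0 && ny < h &&
                        alpha[ny][nx] != 0 && !seen[ny][nx]) {
                        seen[ny][nx] = true;
                        stack.push_back({nx, ny});
                    }
                }
            }
            rects.push_back(r);
        }
    }
    return rects;
}
```

Each resulting Rect can then be copied out into its own image and fed to the texture packer.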
