FreshRSS


Relation between an object and its renderer

Recently I decided to start optimizing code in the game development field.

I found this Pac-Man project on GitHub to be a good starting point: https://github.com/LucaFeggi/PacMan_SDL/tree/main

I found a lot of things that deserve optimization. One of the things that caught my attention most is that many classes have their Draw (render) method inside them, such as the Pac, Ghost, and Fruit classes...

So I decided to make a Renderer class and move into it the Draw method and the fields that method works on.

To clarify things, let us take the Pac class inside PacRenderer.hpp as an example:

class Pac : public Entity{
    public:
        Pac();
        ~Pac();
        void UpdatePos(std::vector<unsigned char> &mover, unsigned char ActualMap[]);
        unsigned char FoodCollision(unsigned char ActualMap[]);
        bool IsEnergized();
        void ChangeEnergyStatus(bool NewEnergyStatus);
        void SetFacing(unsigned char mover);
        bool IsDeadAnimationEnded();
        void ModDeadAnimationStatement(bool NewDeadAnimationStatement);
        void UpdateCurrentLivingPacFrame();
        void ResetCurrentLivingFrame();
        void WallCollisionFrame();
        void Draw();
    private:
        LTexture LivingPac;
        LTexture DeathPac;
        SDL_Rect LivingPacSpriteClips[LivingPacFrames];
        SDL_Rect DeathPacSpriteClips[DeathPacFrames];
        unsigned char CurrLivingPacFrame;
        unsigned char CurrDeathPacFrame;
        bool EnergyStatus;
        bool DeadAnimationStatement;
};

So according to what I described above, the PacRenderer class will be:

class PacRenderer {
public:
        PacRenderer(std::shared_ptr<Pac> pac);
        void Draw();
private:
        std::shared_ptr<Pac> pac; //the entity being rendered (assigned in the constructor below)
        LTexture LivingPac;
        LTexture DeathPac;
        SDL_Rect LivingPacSpriteClips[LivingPacFrames];
        SDL_Rect DeathPacSpriteClips[DeathPacFrames];
};

PacRenderer::PacRenderer(std::shared_ptr<Pac> pac)
{
    this->pac = pac;

    //loading textures here instead of in the Pac class
    LivingPac.loadFromFile("Textures/PacMan32.png");
    DeathPac.loadFromFile("Textures/GameOver32.png");
    InitFrames(LivingPacFrames, LivingPacSpriteClips);
    InitFrames(DeathPacFrames, DeathPacSpriteClips);
}

This is after moving Draw() and the four related fields out of the Pac class.

At first, I thought this was a very good optimization and a clean separation between game logic and graphics rendering.

That was until I read this question and its accepted answer: How should a renderer class relate to the class it renders?

I found that what the accepted answer describes as

you're just taking a baby step away from "objects render themselves" rather than moving all the way to an approach where something else renders the objects

applies exactly to my solution.

So I want to verify whether this description really applies to my solution.

Is my solution good? Did I actually separate game logic from graphics rendering?

If not, what is the best thing to do in this case?
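For comparison, here is a minimal sketch of the "something else renders the objects" approach that the linked answer argues for: the renderer owns only the rendering resources and reads the Pac's state each frame through accessors, so no game-logic fields have to be duplicated into the renderer. The accessor names (GetX, GetY, GetFacing, GetCurrentLivingFrame) and the LTexture::render signature used here are assumptions for illustration, not the project's actual API.

class PacRenderer {
    public:
        explicit PacRenderer(std::shared_ptr<Pac> pac) : pac(std::move(pac)) {
            //rendering resources live only here; Pac knows nothing about SDL or textures
            LivingPac.loadFromFile("Textures/PacMan32.png");
            DeathPac.loadFromFile("Textures/GameOver32.png");
            InitFrames(LivingPacFrames, LivingPacSpriteClips);
            InitFrames(DeathPacFrames, DeathPacSpriteClips);
        }

        void Draw() {
            //pull only the state needed for drawing; these accessor names are hypothetical
            SDL_Rect &clip = LivingPacSpriteClips[pac->GetCurrentLivingFrame()];
            LivingPac.render(pac->GetX(), pac->GetY(), &clip, pac->GetFacing() * 90.0);
        }

    private:
        std::shared_ptr<Pac> pac; //read-only view of the game-logic object
        LTexture LivingPac;
        LTexture DeathPac;
        SDL_Rect LivingPacSpriteClips[LivingPacFrames];
        SDL_Rect DeathPacSpriteClips[DeathPacFrames];
};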

GDC 2024 – GPU Work Graphs: Welcome to the Future of GPU Programming – YouTube link

By: Tom Lewis
20 June 2024 at 16:49

AMD GPUOpen - Graphics and game developer resources

This talk by Matthäus Chajdas gives a first look at this brand new API and shows how it's going to transform the GPU programming landscape going forward.

The post GDC 2024 – GPU Work Graphs: Welcome to the Future of GPU Programming – YouTube link appeared first on AMD GPUOpen.


Hybrid Bonding Plays Starring Role in 3D Chips

11 August 2024 at 15:00


Chipmakers continue to claw for every spare nanometer to continue scaling down circuits, but a technology involving things that are much bigger—hundreds or thousands of nanometers across—could be just as significant over the next five years.

Called hybrid bonding, that technology stacks two or more chips atop one another in the same package. That allows chipmakers to increase the number of transistors in their processors and memories despite a general slowdown in the shrinking of transistors, which once drove Moore’s Law. At the IEEE Electronic Components and Technology Conference (ECTC) this past May in Denver, research groups from around the world unveiled a variety of hard-fought improvements to the technology, with a few showing results that could lead to a record density of connections between 3D stacked chips: some 7 million links per square millimeter of silicon.
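As a rough consistency check (assuming the bond pads sit on a square grid, which the article does not state explicitly), that density follows directly from the bond pitch $p$:

$$\frac{1}{p^{2}} = \frac{1}{\left(3.8\times10^{-4}\ \text{mm}\right)^{2}} \approx 6.9\times10^{6}\ \text{connections per mm}^{2}$$

for a pitch of about 380 nm, in line with the roughly 7 million links per square millimeter cited above and the 360-to-500-nanometer pitches described later in the article.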

All those connections are needed because of the new nature of progress in semiconductors, Intel’s Yi Shi told engineers at ECTC. Moore’s Law is now governed by a concept called system technology co-optimization, or STCO, whereby a chip’s functions, such as cache memory, input/output, and logic, are fabricated separately using the best manufacturing technology for each. Hybrid bonding and other advanced packaging tech can then be used to assemble these subsystems so that they work every bit as well as a single piece of silicon. But that can happen only when there’s a high density of connections that can shuttle bits between the separate pieces of silicon with little delay or energy consumption.

Out of all the advanced-packaging technologies, hybrid bonding provides the highest density of vertical connections. Consequently, it is the fastest growing segment of the advanced-packaging industry, says Gabriella Pereira, technology and market analyst at Yole Group. The overall market is set to more than triple to US $38 billion by 2029, according to Yole, which projects that hybrid bonding will make up about half the market by then, although today it’s just a small portion.

In hybrid bonding, copper pads are built on the top face of each chip. The copper is surrounded by insulation, usually silicon oxide, and the pads themselves are slightly recessed from the surface of the insulation. After the oxide is chemically modified, the two chips are then pressed together face-to-face, so that the recessed pads on each align. This sandwich is then slowly heated, causing the copper to expand across the gap and fuse, connecting the two chips.

Making Hybrid Bonding Better


[Illustration: how hybrid bonding works and how to make it better]

The process:
  1. Hybrid bonding starts with two wafers, or a chip and a wafer, facing each other. The mating surfaces are covered in oxide insulation and slightly recessed copper pads connected to the chips’ interconnect layers.
  2. The wafers are pressed together to form an initial bond between the oxides.
  3. The stacked wafers are then heated slowly, strongly linking the oxides and expanding the copper to form an electrical connection.

The improvements being pursued:
  1. To form more secure bonds, engineers are flattening the last few nanometers of oxide. Even slight bulges or warping can break dense connections.
  2. The copper must be recessed from the surface of the oxide by just the right amount. Too much and it will fail to form a connection. Too little and it will push the wafers apart. Researchers are working on ways to control the level of copper down to single atomic layers.
  3. The initial links between the wafers are weak hydrogen bonds. After annealing, the links are strong covalent bonds. Researchers expect that using different types of surfaces, such as silicon carbonitride, which has more locations to form chemical bonds, will lead to stronger links between the wafers.
  4. The final step in hybrid bonding can take hours and require high temperatures. Researchers hope to lower the temperature and shorten the process time.
  5. Although the copper from both wafers presses together to form an electrical connection, the metal’s grain boundaries generally do not cross from one side to the other. Researchers are trying to cause large single grains of copper to form across the boundary to improve conductance and stability.

Hybrid bonding can either attach individual chips of one size to a wafer full of chips of a larger size or bond two full wafers of chips of the same size. Thanks in part to its use in camera chips, the latter process is more mature than the former, Pereira says. For example, engineers at the European microelectronics-research institute Imec have created some of the most dense wafer-on-wafer bonds ever, with a bond-to-bond distance (or pitch) of just 400 nanometers. But Imec managed only a 2-micrometer pitch for chip-on-wafer bonding.

The latter is a huge improvement over the advanced 3D chips in production today, which have connections about 9 μm apart. And it’s an even bigger leap over the predecessor technology: “microbumps” of solder, which have pitches in the tens of micrometers.

“With the equipment available, it’s easier to align wafer to wafer than chip to wafer. Most processes for microelectronics are made for [full] wafers,” says Jean-Charles Souriau, scientific leader in integration and packaging at the French research organization CEA Leti. But it’s chip-on-wafer (or die-to-wafer) that’s making a splash in high-end processors such as those from AMD, where the technique is used to assemble compute cores and cache memory in its advanced CPUs and AI accelerators.

In pushing for tighter and tighter pitches for both scenarios, researchers are focused on making surfaces flatter, getting bound wafers to stick together better, and cutting the time and complexity of the whole process. Getting it right could revolutionize how chips are designed.

WoW, Those Are Some Tight Pitches

The recent wafer-on-wafer (WoW) research that achieved the tightest pitches—from 360 nm to 500 nm—involved a lot of effort on one thing: flatness. To bond two wafers together with 100-nm-level accuracy, the whole wafer has to be nearly perfectly flat. If it’s bowed or warped to the slightest degree, whole sections won’t connect.

Flattening wafers is the job of a process called chemical mechanical planarization, or CMP. It’s essential to chipmaking generally, especially for producing the layers of interconnects above the transistors.

“CMP is a key parameter we have to control for hybrid bonding,” says Souriau. The results presented at ECTC show CMP being taken to another level, not just flattening across the wafer but reducing mere nanometers of roundness on the insulation between the copper pads to ensure better connections.

“It’s difficult to say what the limit will be. Things are moving very fast.” —Jean-Charles Souriau, CEA Leti

Other researchers focused on ensuring those flattened parts stick together strongly enough. They did so by experimenting with different surface materials such as silicon carbonitride instead of silicon oxide and by using different schemes to chemically activate the surface. Initially, when wafers or dies are pressed together, they are held in place with relatively weak hydrogen bonds, and the concern is whether everything will stay in place during further processing steps. After attachment, wafers and chips are then heated slowly, in a process called annealing, to form stronger chemical bonds. Just how strong these bonds are—and even how to figure that out—was the subject of much of the research presented at ECTC.

Part of that final bond strength comes from the copper connections. The annealing step expands the copper across the gap to form a conductive bridge. Controlling the size of that gap is key, explains Samsung’s Seung Ho Hahn. Too little expansion, and the copper won’t fuse. Too much, and the wafers will be pushed apart. It’s a matter of nanometers, and Hahn reported research on a new chemical process that he hopes to use to get it just right by etching away the copper a single atomic layer at a time.

The quality of the connection counts, too. The metals in chip interconnects are not a single crystal; instead they’re made up of many grains, crystals oriented in different directions. Even after the copper expands, the metal’s grain boundaries often don’t cross from one side to another. Such a crossing should reduce a connection’s electrical resistance and boost its reliability. Researchers at Tohoku University in Japan reported a new metallurgical scheme that could finally generate large, single grains of copper that cross the boundary. “This is a drastic change,” says Takafumi Fukushima, an associate professor at Tohoku. “We are now analyzing what underlies it.”

Other experiments discussed at ECTC focused on streamlining the bonding process. Several sought to reduce the annealing temperature needed to form bonds—typically around 300 °C—to minimize any risk of damage to the chips from the prolonged heating. Researchers from Applied Materials presented progress on a method to radically reduce the time needed for annealing—from hours to just 5 minutes.

CoWs That Are Outstanding in the Field

[Image: a series of gray-scale images of the corner of a chip at increasing magnification.] Imec used plasma etching to dice up chips and give them chamfered corners. The technique relieves mechanical stress that could interfere with bonding. (Image: Imec)

Chip-on-wafer (CoW) hybrid bonding is more useful to makers of advanced CPUs and GPUs at the moment: It allows chipmakers to stack chiplets of different sizes and to test each chip before it’s bound to another, ensuring that they aren’t dooming an expensive CPU with a single flawed part.

But CoW comes with all of the difficulties of WoW and fewer of the options to alleviate them. For example, CMP is designed to flatten wafers, not individual dies. Once dies have been cut from their source wafer and tested, there’s less that can be done to improve their readiness for bonding.

Nevertheless, researchers at Intel reported CoW hybrid bonds with a 3-μm pitch, and, as mentioned, a team at Imec managed 2 μm, largely by making the transferred dies very flat while they were still attached to the wafer and keeping them extra clean throughout the process. Both groups used plasma etching to dice up the dies instead of the usual method, which uses a specialized blade. Unlike a blade, plasma etching doesn’t lead to chipping at the edges, which creates debris that could interfere with connections. It also allowed the Imec group to shape the die, making chamfered corners that relieve mechanical stress that could break connections.

CoW hybrid bonding is going to be critical to the future of high-bandwidth memory (HBM), according to several researchers at ECTC. HBM is a stack of DRAM dies—currently 8 to 12 dies high—atop a control-logic chip. Often placed within the same package as high-end GPUs, HBM is crucial to handling the tsunami of data needed to run large language models like ChatGPT. Today, HBM dies are stacked using microbump technology, so there are tiny balls of solder surrounded by an organic filler between each layer.

But with AI pushing memory demand even higher, DRAM makers want to stack 20 layers or more in HBM chips. The volume that microbumps take up means that these stacks will soon be too tall to fit properly in the package with GPUs. Hybrid bonding would shrink the height of HBMs and also make it easier to remove excess heat from the package, because there would be less thermal resistance between its layers.

“I think it’s possible to make a more-than-20-layer stack using this technology.” —Hyeonmin Lee, Samsung

At ECTC, Samsung engineers showed that hybrid bonding could yield a 16-layer HBM stack. “I think it’s possible to make a more-than-20-layer stack using this technology,” says Hyeonmin Lee, a senior engineer at Samsung. Other new CoW technology could also help bring hybrid bonding to high-bandwidth memory. Researchers at CEA Leti are exploring what’s known as self-alignment technology, says Souriau. That would help ensure good CoW connections using just chemical processes. Some parts of each surface would be made hydrophobic and some hydrophilic, resulting in surfaces that would slide into place automatically.

At ECTC, researchers from Tohoku University and Yamaha Robotics reported work on a similar scheme, using the surface tension of water to align 5-μm pads on experimental DRAM chips with better than 50-nm accuracy.

The Bounds of Hybrid Bonding

Researchers will almost certainly keep reducing the pitch of hybrid-bonding connections. A 200-nm WoW pitch is not just possible but desirable, Han-Jong Chia, a project manager for pathfinding systems at Taiwan Semiconductor Manufacturing Co., told engineers at ECTC. Within two years, TSMC plans to introduce a technology called backside power delivery. (Intel plans the same for the end of this year.) That’s a technology that puts the chip’s chunky power-delivery interconnects below the surface of the silicon instead of above it. With those power conduits out of the way, the uppermost levels can connect better to smaller hybrid-bonding bond pads, TSMC researchers calculate. Backside power delivery with 200-nm bond pads would cut down the capacitance of 3D connections so much that a measure of energy efficiency and signal speed would be as much as eight times better than what can be achieved with 400-nm bond pads.

[Image: black squares dot most of the top of an orange metallic disc.] Chip-on-wafer hybrid bonding is more useful than wafer-on-wafer bonding, in that it can place dies of one size onto a wafer of larger dies. However, the density of connections that can be achieved is lower than for wafer-on-wafer bonding. (Image: Imec)

At some point in the future, if bond pitches narrow even further, Chia suggests, it might become practical to “fold” blocks of circuitry so they are built across two wafers. That way some of what are now long connections within the block might be able to take a vertical shortcut, potentially speeding computations and lowering power consumption.

And hybrid bonding may not be limited to silicon. “Today there is a lot of development in silicon-to-silicon wafers, but we are also looking to do hybrid bonding between gallium nitride and silicon wafers and glass wafers…everything on everything,” says CEA Leti’s Souriau. His organization even presented research on hybrid bonding for quantum-computing chips, which involves aligning and bonding superconducting niobium instead of copper.

“It’s difficult to say what the limit will be,” Souriau says. “Things are moving very fast.”

This article was updated on 11 August 2024.

This article appears in the September 2024 print issue as “The Copper Connection.”

Must-Have Free Software for Optimal PC Performance

11 May 2024 at 09:52
PC Software

There are many different PC software programs available now. Some you have to pay for, but others are free or have free versions. You can ...

The post Must-Have Free Software for Optimal PC Performance appeared first on Gizchina.com.

Merging Power and Arithmetic Optimization Via Datapath Rewriting (Intel, Imperial College London)

A new technical paper titled “Combining Power and Arithmetic Optimization via Datapath Rewriting” was published by researchers at Intel Corporation and Imperial College London.

Abstract:
“Industrial datapath designers consider dynamic power consumption to be a key metric. Arithmetic circuits contribute a major component of total chip power consumption and are therefore a common target for power optimization. While arithmetic circuit area and dynamic power consumption are often correlated, there is also a tradeoff to consider, as additional gates can be added to explicitly reduce arithmetic circuit activity and hence reduce power consumption. In this work, we consider two forms of power optimization and their interaction: circuit area reduction via arithmetic optimization, and the elimination of redundant computations using both data and clock gating. By encoding both these classes of optimization as local rewrites of expressions, our tool flow can simultaneously explore them, uncovering new opportunities for power saving through arithmetic rewrites using the e-graph data structure. Since power consumption is highly dependent upon the workload performed by the circuit, our tool flow facilitates a data dependent design paradigm, where an implementation is automatically tailored to particular contexts of data activity. We develop an automated RTL to RTL optimization framework, ROVER, that takes circuit input stimuli and generates power-efficient architectures. We evaluate the effectiveness on both open-source arithmetic benchmarks and benchmarks derived from Intel production examples. The tool is able to reduce the total power consumption by up to 33.9%.”
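To make the idea of a power-oriented local rewrite concrete, here is a generic operand-isolation (data-gating) example written as a behavioral C++ model rather than RTL; it illustrates the general technique the abstract refers to and is not a rewrite rule taken from ROVER. Both functions compute the same value, but in the gated form the multiplier's operands are held at zero whenever the product is discarded, which in a hardware implementation stops the multiplier's internal nodes from toggling and so cuts dynamic power.

#include <cstdint>

// Before: the multiplier evaluates a*b every cycle, even when sel is false
// and the product is thrown away.
uint32_t datapath_before(bool sel, uint32_t a, uint32_t b, uint32_t c) {
    return sel ? a * b : c;
}

// After (data-gated / operand-isolated form): the operands feeding the
// multiplier are forced to zero when the result is unused. Functionally
// identical, but with less switching activity once mapped to hardware.
// (Illustrative only; not one of ROVER's actual rewrite rules.)
uint32_t datapath_after(bool sel, uint32_t a, uint32_t b, uint32_t c) {
    const uint32_t ga = sel ? a : 0u;  // gated operand a
    const uint32_t gb = sel ? b : 0u;  // gated operand b
    return sel ? ga * gb : c;
}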

Find the technical paper here. Published April 2024.

Samuel Coward, Theo Drane, Emiliano Morini, George Constantinides; arXiv:2404.12336v1.

The post Merging Power and Arithmetic Optimization Via Datapath Rewriting (Intel, Imperial College London) appeared first on Semiconductor Engineering.

GDC 2024: We reveal incredible Work Graphs perf, AMD FSR 3.1, GI with Brixelizer, and so much more

22 March 2024 at 17:00

AMD GPUOpen - Graphics and game developer resources

Learn about our GDC 2024 activities, including AMD FSR 3.1, AMD FidelityFX Brixelizer, work graphs, mesh shaders, tools, CPU, and more.

The post GDC 2024: We reveal incredible Work Graphs perf, AMD FSR 3.1, GI with Brixelizer, and so much more appeared first on AMD GPUOpen.

Google Chrome begins testing shared dictionary compression technology

By: Efe Udin
8 March 2024 at 20:41
Chrome extensions

Google Chrome, a leading web browser, is at the forefront of testing innovative technologies to enhance web performance. One such advancement is the implementation of ...

The post Google Chrome begins testing shared dictionary compression technology appeared first on Gizchina.com.

Make The Impossible Possible: Use Variable-Shaped Beam Mask Writers And Curvilinear Full-Chip Inverse Lithography Technology For 193i Contacts/Vias With Mask-Wafer Co-Optimization

22 February 2024 at 09:01

Abstract:

“Full-chip curvilinear inverse lithography technology (ILT) requires mask writers to write full reticle curvilinear mask patterns in a reasonable write time. We jointly study and present the benefits of a full-chip, curvilinear, stitchless ILT with mask-wafer co-optimization (MWCO) for variable-shaped beam (VSB) mask writers and validate its benefits on mask and wafer at Micron Technology. The full-chip ILT technology employed, first demonstrated in a paper presented at the 2019 SPIE Photomask Technology Conference, produces curvilinear ILT mask patterns without stitching errors, and with process windows enlarged by over 100% compared to the OPC process of record, while the mask was written by multibeam mask writer. At the 2020 SPIE Advanced Lithography Conference, a method was introduced in which MWCO is performed during ILT optimization. This approach enables curvilinear ILT for 193i masks to be written on VSB mask writers within a practical, 12-h time frame, while also producing the largest process windows. We first review MWCO technology, then curvilinear ILT mask patterns written by VSB mask writer, and then show the corresponding 193i process wafer prints. Evaluations of mask write times and mask quality in terms of critical dimension uniformity and process windows are also presented.”

Find the technical paper here. Published February 2024.

Pang, Linyong, Sha Lu, Ezequiel Vidal Russell, Yang Lu, Michael Lee, Jennefir Digaum, Ming-Chuan Yang et al. “Make the impossible possible: use variable-shaped beam mask writers and curvilinear full-chip inverse lithography technology for 193i contacts/vias with mask-wafer co-optimization.” Journal of Micro/Nanopatterning, Materials, and Metrology 23, no. 1 (2024): 011207-011207.

Author Affiliations: D2S Inc. and Micron Technology Inc.

The post Make The Impossible Possible: Use Variable-Shaped Beam Mask Writers And Curvilinear Full-Chip Inverse Lithography Technology For 193i Contacts/Vias With Mask-Wafer Co-Optimization appeared first on Semiconductor Engineering.


Optimization and best practices – Mesh shaders on RDNA™ graphics cards

16 January 2024 at 11:29

AMD GPUOpen - Graphics and game developer resources

The second post in this series on mesh shaders covers best practices for writing mesh and amplification shaders, as well as how to use the AMD Radeon™ Developer Tool Suite to profile and optimize mesh shaders.

The post Optimization and best practices – Mesh shaders on RDNA™ graphics cards appeared first on AMD GPUOpen.


CPU profiling for Unity

By: GPUOpen
5 January 2024 at 16:56

AMD GPUOpen - Graphics and game developer resources

This is a general guide focusing on CPU profiling for Unity, including which tools are useful for profiling and how to use these tools to find hotspots in your code.

The post CPU profiling for Unity appeared first on AMD GPUOpen.

Unreal Engine performance guide

By: GPUOpen

AMD GPUOpen - Graphics and game developer resources

Our one-stop guide to performance with Unreal Engine.

The post Unreal Engine performance guide appeared first on AMD GPUOpen.