
What It Took to Finally Stream Assets Remotely in the Untold Engine

One of the features I wanted to add to the Untold Engine for a long time was remote asset streaming.

Streaming a Jet using the Untold Engine

The challenge wasn’t the networking part. The issue was that the engine assumed everything was loaded, available, and resident in memory at all times. That model doesn’t work once you introduce streaming.

So instead of jumping straight into it, I focused on building the systems that would make it possible.

This is a breakdown of the systems I had to implement before remote streaming finally worked — including on Vision Pro.


Batching System

The first thing I had to fix was draw calls.

Large scenes (especially architectural ones) quickly became CPU-bound because every mesh resulted in a draw call. The GPU was fine, but the CPU couldn’t keep up.

Batching helped reduce the number of draw calls by grouping meshes together.

But this introduced a constraint: once meshes are batched, you lose flexibility. You can’t easily move or unload individual pieces anymore.

This forced me to think more about how meshes are grouped, not just rendered.
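
To make the tradeoff concrete, here is a minimal sketch of static batching, using hypothetical Mesh and Batch types rather than the engine's actual API. Meshes that share a material are merged into one vertex/index range so they can be submitted together:

```swift
// A sketch of static batching: merge meshes that share a material into
// one vertex/index range. Mesh and Batch are illustrative types, not
// the engine's actual API.
struct Mesh {
    let materialID: Int
    let vertices: [Float]   // position-only here: 3 floats per vertex
    let indices: [UInt32]
}

struct Batch {
    let materialID: Int
    var vertices: [Float] = []
    var indices: [UInt32] = []
}

func buildBatches(from meshes: [Mesh]) -> [Batch] {
    var batches: [Int: Batch] = [:]
    for mesh in meshes {
        var batch = batches[mesh.materialID] ?? Batch(materialID: mesh.materialID)
        // Re-base indices so they point into the merged vertex array.
        let baseVertex = UInt32(batch.vertices.count / 3)
        batch.vertices.append(contentsOf: mesh.vertices)
        batch.indices.append(contentsOf: mesh.indices.map { $0 + baseVertex })
        batches[mesh.materialID] = batch
    }
    return Array(batches.values)   // one draw call per material, not per mesh
}
```

The constraint above falls straight out of this structure: once a mesh's data is appended into a shared buffer, moving or unloading it means rebuilding the entire batch.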


LOD System

After batching, the next issue was rendering too much detail.

Even if draw calls were under control, I was still pushing too many vertices and fragments.

The LOD system allowed the engine to swap meshes based on distance. That helped performance, but more importantly, it introduced the idea that not everything needs to be at full quality all the time.

This was the first step toward selective rendering.
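
As a rough illustration, the selection itself can be very simple. The sketch below assumes levels sorted from most to least detailed, with made-up distance thresholds rather than the engine's actual values:

```swift
// A sketch of distance-based LOD selection. Thresholds and types are
// illustrative; levels are assumed sorted from most to least detailed.
import simd

struct LODLevel {
    let maxDistance: Float   // use this level while the camera is closer than this
    let meshIndex: Int
}

func selectLOD(levels: [LODLevel],
               entityPosition: SIMD3<Float>,
               cameraPosition: SIMD3<Float>) -> Int {
    let distance = simd_distance(entityPosition, cameraPosition)
    for level in levels where distance < level.maxDistance {
        return level.meshIndex
    }
    return levels.last?.meshIndex ?? 0   // fall back to the coarsest level
}
```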


Geometry Streaming

Up to this point, everything was still loaded at startup.

That doesn’t scale.

The Geometry Streaming system allowed the engine to load and unload meshes dynamically. This changed several assumptions:

  • Meshes might not exist when requested
  • Systems need to handle missing data
  • Rendering depends on availability

This is where the engine stopped being static.
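
A sketch of what that change of assumptions looks like in code, with an illustrative residency enum rather than the engine's real types. The renderer simply skips anything that isn't resident yet:

```swift
// Every mesh is now in one of several residency states, and the render
// path tolerates the ones that aren't ready. Illustrative types only.
struct MeshGPUData { /* vertex/index buffers, etc. */ }

enum MeshResidency {
    case unloaded                 // known, but not in memory
    case loading                  // request in flight
    case resident(MeshGPUData)    // safe to render
}

func draw(meshes: [MeshResidency]) {
    for mesh in meshes {
        switch mesh {
        case .resident(let gpuData):
            submitDrawCall(gpuData)
        case .loading, .unloaded:
            continue              // rendering no longer assumes availability
        }
    }
}

func submitDrawCall(_ data: MeshGPUData) { /* encode into the command buffer */ }
```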


Mesh Resource Manager

Once streaming was introduced, I needed a system to manage it.

The Mesh Resource Manager became responsible for tracking loaded meshes, handling GPU buffers, and making sure the same asset isn’t loaded multiple times.

Without this, things get messy fast. You end up duplicating data or unloading things that are still in use.

This system made ownership clear.
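
A minimal sketch of that ownership model, with illustrative names: one cache entry per asset plus a reference count, so shared meshes are reused instead of duplicated and are only freed when nothing holds them.

```swift
// Deduplication + reference counting. Illustrative, single-threaded sketch;
// a real manager would also need to be safe to call across threads.
struct MeshGPUData { /* vertex/index buffers, etc. */ }

final class MeshResourceManager {
    private struct Entry {
        var gpuData: MeshGPUData
        var refCount: Int
    }
    private var cache: [String: Entry] = [:]   // keyed by asset ID/path

    func acquire(_ assetID: String) -> MeshGPUData {
        if var entry = cache[assetID] {
            entry.refCount += 1                // already loaded: reuse it
            cache[assetID] = entry
            return entry.gpuData
        }
        let gpuData = loadAndUpload(assetID)   // parse + create GPU buffers
        cache[assetID] = Entry(gpuData: gpuData, refCount: 1)
        return gpuData
    }

    func release(_ assetID: String) {
        guard var entry = cache[assetID] else { return }
        entry.refCount -= 1
        if entry.refCount == 0 {
            cache[assetID] = nil               // no users left: safe to unload
        } else {
            cache[assetID] = entry
        }
    }

    private func loadAndUpload(_ assetID: String) -> MeshGPUData {
        MeshGPUData()
    }
}
```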


Streaming a Formula 1 car using the Untold Engine.

Memory Budget Manager

Streaming only works if you enforce limits.

The Memory Budget Manager sets a fixed budget and ensures the engine stays within it. When memory usage gets too high, assets need to be evicted.

This introduced a new kind of problem: deciding what to remove.

The engine now has to constantly answer:

What is safe to unload right now?

This is especially important for Vision Pro, where memory constraints are much tighter.
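
As an illustration, here is a sketch of budget-driven eviction assuming a least-recently-used policy. The engine's actual heuristic may weigh other factors, such as distance to the camera or whether an asset is referenced this frame:

```swift
// Evict the least-recently-used assets until projected usage fits the
// budget. Illustrative sketch; assumes callers skip assets still in use.
final class MemoryBudgetManager {
    private let budgetBytes: Int
    private var usedBytes = 0
    private var lastUsedFrame: [String: Int] = [:]   // assetID -> frame index
    private var sizeOf: [String: Int] = [:]

    init(budgetBytes: Int) { self.budgetBytes = budgetBytes }

    func didLoad(_ assetID: String, bytes: Int, frame: Int) {
        usedBytes += bytes
        sizeOf[assetID] = bytes
        lastUsedFrame[assetID] = frame
    }

    func touch(_ assetID: String, frame: Int) {
        lastUsedFrame[assetID] = frame           // mark as recently used
    }

    func didUnload(_ assetID: String) {
        usedBytes -= sizeOf[assetID] ?? 0
        sizeOf[assetID] = nil
        lastUsedFrame[assetID] = nil
    }

    // Candidates to evict, oldest first, until we're back under budget.
    func assetsToEvict() -> [String] {
        var evictions: [String] = []
        var projected = usedBytes
        for (assetID, _) in lastUsedFrame.sorted(by: { $0.value < $1.value }) {
            if projected <= budgetBytes { break }
            projected -= sizeOf[assetID] ?? 0
            evictions.append(assetID)
        }
        return evictions
    }
}
```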


Tile Streaming

Even with streaming in place, I ran into another issue.

Some meshes were just too big.

For example, a single mesh could represent a large portion of a building. That makes it difficult to stream efficiently, because you either load the whole thing or nothing.

The solution was to break the scene into tiles.

I used a Blender pipeline to partition scenes (eventually using a quadtree). Each tile represents a localized part of the world.

Now the engine can:

  • Load only what’s near the camera
  • Avoid loading interiors when outside
  • Stream data in smaller chunks

This made a big difference for large scenes.
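
A sketch of the selection step, with a flat loop standing in for the quadtree so the idea stays visible. Load tiles whose bounds fall within a radius of the camera and unload the rest:

```swift
// Pick tiles near the camera on the ground plane. Illustrative types;
// a quadtree would prune this search instead of testing every tile.
import simd

struct Tile {
    let id: Int
    let center: SIMD2<Float>   // tile center on the ground plane (x, z)
    let halfExtent: Float
}

func tilesToLoad(tiles: [Tile], cameraXZ: SIMD2<Float>, radius: Float) -> [Int] {
    tiles.filter { tile in
        // Distance from the camera to the tile's nearest edge.
        let d = simd_distance(tile.center, cameraXZ) - tile.halfExtent
        return d < radius
    }.map(\.id)
}
```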


Native Asset Format (.untold)

At this point, most of the systems were in place, but performance still wasn’t where it needed to be.

The main issue was parsing.

Using USDZ at runtime introduced overhead:

  • CPU parsing cost
  • Memory spikes
  • Indirect data layouts

So I introduced a native format: .untold.

This format is built for runtime:

  • Data is preprocessed
  • GPU upload is direct
  • Layout is streaming-friendly

USDZ is still useful as an input format, but it’s not ideal for real-time streaming.
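
To illustrate what "built for runtime" means, here is a sketch of the kind of fixed binary header such a format can use. This layout is hypothetical, not the actual .untold specification:

```swift
// A fixed header with byte offsets means loading is: read the header,
// then copy the vertex/index ranges straight into GPU buffers. No string
// parsing, no intermediate scene objects. Hypothetical layout.
struct UntoldHeader {
    var magic: UInt32             // file identifier
    var version: UInt32
    var vertexCount: UInt32
    var indexCount: UInt32
    var vertexDataOffset: UInt64  // byte offset into the file
    var indexDataOffset: UInt64
}
```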


Remote Streaming

Once all of the above systems were working, remote streaming became much simpler.

The engine already knew how to:

  • Load assets on demand
  • Stay within memory limits
  • Stream tiles based on camera position

At that point, the only change was the source of the data.

Instead of reading from disk, the engine now fetches assets over the network.
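
A minimal sketch of that swap, assuming a completion-based loader shape that isn't necessarily the engine's actual API. The decode and upload path stays the same; only the byte source changes:

```swift
// Fetch an asset over HTTP and hand the bytes to the same decode path
// that disk loads use. URL layout and signature are assumptions.
import Foundation

func fetchAsset(named name: String,
                from server: URL,
                completion: @escaping (Data?) -> Void) {
    let url = server.appendingPathComponent(name)
    URLSession.shared.dataTask(with: url) { data, _, error in
        guard let data = data, error == nil else {
            completion(nil)    // streaming already tolerates missing data
            return
        }
        completion(data)       // same decode/upload path as a disk read
    }.resume()
}
```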

And it works — including on Vision Pro.

Below is a short clip of the Streaming System in action.


Compression (LZ4 + ASTC)

After getting remote streaming working, another issue showed up quickly: memory usage, especially from textures.

Some scenes would crash on Vision Pro due to high texture memory consumption. Even if geometry was under control, textures alone could push the system over the limit.

To address this, I integrated two forms of compression into the pipeline.

For asset streaming, I added LZ4 compression. This helps reduce the size of data being transferred and improves load times when streaming assets from a remote source. Since LZ4 is fast to decompress, it fits well into a real-time pipeline.
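
For illustration, here is a sketch of the decode side using Apple's Compression framework. Two assumptions: the data was encoded with the LZ4 framing that COMPRESSION_LZ4 expects, and the decompressed size is known up front (for example, stored alongside the asset):

```swift
// Decompress an LZ4 payload into a preallocated buffer. Returns nil if
// the output doesn't match the expected size.
import Compression
import Foundation

func decompressLZ4(_ compressed: Data, decompressedSize: Int) -> Data? {
    var output = Data(count: decompressedSize)
    let written = output.withUnsafeMutableBytes { dst in
        compressed.withUnsafeBytes { src in
            compression_decode_buffer(
                dst.bindMemory(to: UInt8.self).baseAddress!, decompressedSize,
                src.bindMemory(to: UInt8.self).baseAddress!, compressed.count,
                nil, COMPRESSION_LZ4)
        }
    }
    return written == decompressedSize ? output : nil
}
```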

For textures, I integrated ASTC compression.

ASTC significantly reduces GPU memory usage while maintaining good visual quality. This made a noticeable difference on Vision Pro, where memory constraints are tighter and large uncompressed textures can quickly become a problem.
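
A sketch of what consuming precompressed ASTC data looks like in Metal, assuming a 4x4 LDR block size (an assumption; the pipeline may pick other block sizes). The GPU samples the compressed blocks directly, so the resident footprint stays at the compressed size:

```swift
// Create a texture with an ASTC pixel format and upload precompressed
// blocks. ASTC 4x4: each 16-byte block covers a 4x4 pixel footprint.
import Metal
import Foundation

func makeASTCTexture(device: MTLDevice, width: Int, height: Int, data: Data) -> MTLTexture? {
    let desc = MTLTextureDescriptor.texture2DDescriptor(
        pixelFormat: .astc_4x4_ldr, width: width, height: height, mipmapped: false)
    guard let texture = device.makeTexture(descriptor: desc) else { return nil }
    let blocksPerRow = (width + 3) / 4
    data.withUnsafeBytes { bytes in
        texture.replace(region: MTLRegionMake2D(0, 0, width, height),
                        mipmapLevel: 0,
                        withBytes: bytes.baseAddress!,
                        bytesPerRow: blocksPerRow * 16)
    }
    return texture
}
```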

With ASTC in place:

  • Texture memory footprint is much lower
  • Scenes that previously crashed can now load
  • Streaming becomes more stable overall

At this point, compression is no longer optional. It’s part of making the system work reliably on constrained devices.


Final Thoughts

What started as a goal to stream assets remotely ended up requiring changes across the entire engine.

The biggest shift was this:

Before:

  • Everything is loaded
  • Everything is available

After:

  • Only what’s needed is loaded
  • Everything else is optional

Streaming isn’t something you add at the end. The engine has to be designed around it.

Thanks again to everyone supporting the Untold Engine on GitHub. This wouldn’t have been possible without that support.

Lessons Learned: When the Drawable Leaks Into Your Render Pipeline

This week, while rendering scenes on Vision Pro using the Untold Engine, I noticed that scenes were being rendered with the incorrect color space. At least, that was my initial read; something told me this was more than just a color space problem.

After analyzing my render graph and verifying the color targets I was using in the lighting pass and tone mapping pass, I realized that I had made a crucial mistake in the engine.

See, my lighting pass was doing all calculations in linear space, which is correct. However, the internal render targets were being created using the drawable's pixel format. Doing so meant that every platform could change the precision, dynamic range, and even encoding behavior of my internal buffers.

In other words, my lighting results were being stored in formats dictated by the drawable’s target format. That is wrong. The renderer should own its internal formats — not the presentation layer.

Because the drawable format differs per platform (for example, .bgra8Unorm_srgb on Vision Pro), my internal render targets were sometimes:

  • 8-bit
  • sRGB-encoded
  • Not HDR-capable

Even though my lighting calculations were done in linear space, the storage format altered how those results were preserved and interpreted.

So yes — the math was linear, but the buffers holding the results were not consistent across platforms.

That is where the mismatch came from.

To fix this, I explicitly set the color target used in the lighting pass to rgba16Float. By doing this, I ensured:

  • Stable precision
  • HDR-capable storage
  • Linear behavior
  • Platform-independent results

Now, my lighting calculations are identical regardless of the platform, because the internal render targets are explicitly defined by the engine — not by the drawable.
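
A sketch of what that looks like at texture-creation time. The helper is illustrative, but the key line is the explicit .rgba16Float format:

```swift
// The lighting target's format is pinned by the renderer, independent
// of whatever pixel format the drawable happens to use on this platform.
import Metal

func makeLightingTarget(device: MTLDevice, width: Int, height: Int) -> MTLTexture? {
    let desc = MTLTextureDescriptor.texture2DDescriptor(
        pixelFormat: .rgba16Float,   // engine-owned: linear, HDR-capable
        width: width, height: height, mipmapped: false)
    desc.usage = [.renderTarget, .shaderRead]
    desc.storageMode = .private      // lives on the GPU; never read back
    return device.makeTexture(descriptor: desc)
}
// Only the final present pass should care about the drawable's format.
```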


The Second Issue: Tone Mapping Is Not Output Encoding

The other issue was more subtle and made me realize that I still have a lot more to learn about tone mapping.

My pipeline originally followed this path:

  • Lighting Pass
  • Post Processing
  • Tone Mapping
  • Write to Drawable

The problem with this flow was the assumption that, after tone mapping, the image was ready for the screen.

But that is not true.

Different platforms expect different things:

  • Different pixel formats (RGBA vs BGRA)
  • Different encoding (linear vs sRGB)
  • Different gamuts (sRGB vs Display-P3)
  • Different dynamic range behavior (SDR vs EDR)

My pipeline above implicitly assumed that the tone-mapped result already matched whatever the drawable expected.

But tone mapping does not mean “ready for any screen.”

Tone mapping only compresses HDR → display-referred brightness range. It does not:

  • Encode to sRGB automatically
  • Convert color gamut
  • Match the drawable’s storage format
  • Handle EDR behavior

So when I wrote directly to the drawable after tone mapping, I was essentially letting the platform decide how the final color should be interpreted.

And since platforms differ, my final image differed.


What Was I Missing?

I needed to separate responsibilities more clearly.

I needed a pass that owned the creative look — fully defined and controlled by the engine:

  • Exposure
  • White balance
  • Contrast
  • Tone mapping curve

This defines how the image should look artistically.

And I needed a separate pass that is platform-aware — an Output Transform pass — that defines how the display expects pixels to be formatted:

  • Encode to sRGB or not
  • Convert to P3 or not
  • Clamp or preserve HDR
  • BGRA vs RGBA channel order
  • EDR behavior

In my original pipeline, I had collapsed Look + Output Transform into one step. I wasn’t explicitly controlling the final encoding, so the platform’s defaults influenced the final image.

With the extra passes and modifications I made, the Look pass now defines the artistic look of the image. The Output Transform defines how that look is encoded for a specific display.

Previously, I was conflating the two — which allowed the platform’s drawable format to influence the final result.
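
In Metal terms, the split shows up at pipeline-creation time. This is a sketch with illustrative shader names: the Look pass is built against an engine-owned linear HDR format, and only the Output Transform pass is built against the drawable's format:

```swift
// Two pipelines, two owners: the engine owns the Look pass format; the
// platform's drawable only dictates the Output Transform pass format.
import Metal
import MetalKit

func makePipelines(device: MTLDevice, library: MTLLibrary, view: MTKView) throws
    -> (look: MTLRenderPipelineState, output: MTLRenderPipelineState) {

    // Look pass: exposure, white balance, contrast, tone mapping curve.
    let lookDesc = MTLRenderPipelineDescriptor()
    lookDesc.vertexFunction = library.makeFunction(name: "fullscreenVertex")
    lookDesc.fragmentFunction = library.makeFunction(name: "lookFragment")
    lookDesc.colorAttachments[0].pixelFormat = .rgba16Float          // engine-owned

    // Output Transform pass: encode for this specific display.
    let outDesc = MTLRenderPipelineDescriptor()
    outDesc.vertexFunction = library.makeFunction(name: "fullscreenVertex")
    outDesc.fragmentFunction = library.makeFunction(name: "outputTransformFragment")
    outDesc.colorAttachments[0].pixelFormat = view.colorPixelFormat  // drawable-owned

    return (try device.makeRenderPipelineState(descriptor: lookDesc),
            try device.makeRenderPipelineState(descriptor: outDesc))
}
```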

Here is the image after the fix.


Now, the renderer owns the working color space and internal formats, and the drawable only affects the final presentation step.

Thanks for reading.

Lessons Learned: Vision Pro, Large Scenes, and Threading

This week I came across an unexpected issue. Loading a large scene on the Vision Pro would result in a run-time error. But loading the same scene on macOS did not. Both macOS and visionOS use essentially the same loading and rendering path. So what could be causing the issue?

To make things worse, the run-time errors were cryptic and pointed to different functions each time the application crashed. Sometimes it looked like a type issue. Other times it looked like a Foundation error. Nothing clearly indicated what the real problem was.

However, during debugging, I started to see a pattern. Most of the time, the error pointed to scene or component data access. That’s when I began wondering:

What if, during loading, the system is trying to access data that is not yet stable?

In other words, what if one part of the engine is writing to components while another part is reading from them?

Then another thought came to mind. What if this issue is also present on macOS, but because macOS does not use a dedicated render thread in the same way as Vision Pro, the race condition is simply not exposed?

After a few more debugging sessions, I realized I may have been onto something.


The Real Issue

On Vision Pro, rendering runs on a dedicated render thread. When we load a large scene, the Untold Engine performs loading work on a separate thread so that we don’t block execution.

That means we had two things happening at the same time:

  • The render thread traversing the scene graph, iterating component storage, performing culling, and building draw calls.
  • The loading thread creating entities, attaching components, recursively tagging entities for static batching, rebuilding batch data, and updating spatial structures such as the octree.

In other words, the render thread was reading from scene/component data while the loading thread was writing to that same data.

This read/write overlap caused race conditions and eventually corrupted state.


Why This Did Not Happen on macOS

The reason this did not happen on macOS is mostly due to timing and threading differences.

On macOS:

  • The renderer and update loop are more tightly coupled.
  • The mutation window during loading is smaller.
  • The render traversal is less likely to intersect with scene mutation at the exact wrong moment.

On Vision Pro:

  • Rendering runs independently on a dedicated thread.
  • Frame submission follows its own cadence.
  • The renderer can traverse the scene while it is still being mutated.

Large scenes amplify this issue because static batching and recursive hierarchy processing take longer, increasing the window where the world is in a partially mutated state.


The Solution

The solution was to add a gating mechanism to prevent any read/write collision while loading was taking place.

The idea is simple:

  • When a major scene mutation phase begins (for example, during large scene loading or static batch generation), increment a shared counter.
  • When the mutation phase finishes, decrement it.
  • The render thread checks this counter before traversing the scene.
  • If a mutation is in progress, the render thread continues to submit frames but avoids traversing scene or component data that may still be unstable.

It’s important to note that I do not block the render thread on visionOS. I let it continue running, but I prevent it from accessing critical scene data while the loading phase is still mutating that data.
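
Here is a minimal sketch of that gate, assuming a lock-protected counter; the engine's actual primitive may differ, but the shape is the same:

```swift
// Loading threads bracket mutation phases; the render thread polls the
// gate and skips scene traversal (without blocking) while it's closed.
import Foundation

final class SceneMutationGate {
    private let lock = NSLock()
    private var activeMutations = 0

    func beginMutation() {
        lock.lock(); activeMutations += 1; lock.unlock()
    }

    func endMutation() {
        lock.lock(); activeMutations -= 1; lock.unlock()
    }

    var isMutating: Bool {
        lock.lock(); defer { lock.unlock() }
        return activeMutations > 0
    }
}

// Loading thread:
//   gate.beginMutation(); loadLargeScene(); gate.endMutation()
// Render thread, every frame:
//   if gate.isMutating { keep submitting frames, but skip scene traversal }
```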

After this fix was in place, loading large scenes with the Untold Engine on Vision Pro no longer caused run-time crashes.


Final Thoughts

In the end, the issue was about concurrency.

  • Rendering reads from the world.
  • Loading mutates the world.

Without proper synchronization, those two operations cannot safely overlap.

Vision Pro didn’t introduce a new bug into the Untold Engine. It exposed a hidden assumption in my threading model.

And that’s a good thing. Thanks for reading.

Lessons Learned While Adding Geometry Streaming

This week I worked on adding Geometry Streaming to the engine and fixed a flickering issue that had been quietly annoying me for a while.

Both tasks ended up being more related than I initially expected.

BTW, here is version 0.10.0 of the engine with Geometry Streaming support.


Geometry Streaming Wasn’t the Hard Part — Integration Was

Getting Geometry Streaming working on its own wasn’t too bad. The goal was simple enough: render large scenes without having to load the entire scene into VRAM during initialization. Instead, meshes should be loaded and unloaded on demand, without stalling rendering.

The part that caused friction was not streaming itself, but getting it to behave correctly alongside two existing systems:

  • the LOD system
  • the static batching system

Each of these systems already worked well in isolation. The instability showed up once they had to coexist.

I initially overcomplicated the problem, mostly because I was treating these systems as if they were peers operating at the same level. They’re not.


The Assumption That Broke Everything

The thing that finally made it click was realizing that these systems don’t negotiate with each other — they react to upstream state.

Once I stopped thinking of them as equals and instead thought of them as layers in a pipeline, the engine immediately became more predictable.

A stable frame ended up looking like this:

  • Geometry streaming updates asset residency
  • LOD selection picks the best available representation
  • Static batching groups the selected meshes
  • The renderer submits batches to the GPU

Once I enforced this flow in the update loop, a surprising number of bugs simply disappeared.

The key insight here was that ordering matters more than clever logic. These systems don’t need to know about each other — they just need to run in the right sequence and respond to state changes upstream.
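
A sketch of what enforcing that flow looks like, with hypothetical system names. What matters is the direction of data flow, not the specific shapes:

```swift
// Each system only reacts to state produced by the stage above it.
// Order is the contract; none of them know about each other.
struct Camera { /* position, frustum, ... */ }

protocol FrameSystem { func update(camera: Camera) }

struct StreamingSystem: FrameSystem { func update(camera: Camera) { /* update residency */ } }
struct LODSystem: FrameSystem       { func update(camera: Camera) { /* pick available LOD */ } }
struct BatchingSystem: FrameSystem  { func update(camera: Camera) { /* regroup meshes */ } }
struct Renderer: FrameSystem        { func update(camera: Camera) { /* submit batches */ } }

let pipeline: [FrameSystem] = [StreamingSystem(), LODSystem(), BatchingSystem(), Renderer()]

func updateFrame(camera: Camera) {
    for system in pipeline { system.update(camera: camera) }
}
```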


The Kind of Bugs That Only Show Up Once Things Are “Mostly Working”

Getting the ordering right was half the battle. The other half was dealing with the kind of bugs that only appear once the architecture is almost correct.

For example:

  • I wasn’t clearing the octree properly, which caused the engine to look for entities that no longer existed.
  • One particularly frustrating bug refused to render a specific LOD whenever two or more entities were visible at the same time.

That second one took an entire day to track down.

It turned out the space uniform was getting overwritten during the unload/load phase of the streaming system. Nothing fancy — just a subtle overwrite happening at exactly the wrong time.

That kind of bug is annoying, but it’s also a signal that the system boundaries are finally being exercised in realistic ways.


The Flickering Issue That Didn’t Behave Like a Flicker

The flickering issue was a different kind of problem.

It only showed up in Edit mode, not reliably in Game mode. And it wasn’t the usual continuous flicker you expect when something is wrong. Instead, it would flicker once, stabilize, then flicker again a few seconds later — or sometimes not at all during a debug session.

That made it especially hard to reason about.

At first, I assumed it was a synchronization issue between render passes. I tried adding fences, forcing stricter ordering — none of that helped.

The clue ended up being that the flicker correlated with moments when nothing should have been changing visually.


The Real Cause: State Falling Out of Sync

Eventually, I traced the issue back to the culling system.

In some frames, the culling pass was returning zero visible entities — not because nothing was visible, but because the visibleEntityIds buffer was getting overwritten.

The fix wasn’t to add more synchronization, but to acknowledge reality: the culling system was already using triple buffering, and visibleEntityIds needed to follow the same pattern.

Once I made visibleEntityIds triple-buffered as well, the flickering disappeared completely.

The takeaway here wasn’t “use triple buffering,” but:

Any system that consumes frame-dependent data must respect the buffering strategy of the system producing it.
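
A sketch of the pattern with illustrative types: each in-flight frame gets its own slot, so the consumer never reads a list the producer is overwriting.

```swift
// Triple-buffered visibility results: the culling pass writes into the
// current frame's slot; the renderer reads the same slot for that frame.
final class VisibleEntityBuffers {
    private var buffers: [[Int]] = [[], [], []]   // one slot per in-flight frame
    private var frameIndex = 0

    func beginFrame() {
        frameIndex = (frameIndex + 1) % 3
        buffers[frameIndex].removeAll(keepingCapacity: true)
    }

    func write(visibleEntityIds: [Int]) {
        buffers[frameIndex] = visibleEntityIds    // culling output for this frame
    }

    var current: [Int] { buffers[frameIndex] }    // what the renderer consumes
}
```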


Final Thoughts

None of the issues this week were caused by exotic bugs or broken math. They all came from small assumptions about ordering, ownership, and state lifetime.

Once those assumptions were corrected, the engine became noticeably more stable — not just faster, but easier to reason about.

That’s usually a good sign that the architecture is moving in the right direction.

Thanks for reading.

Untold Engine Updates: LOD, Static Batching, and More!!!

Hey guys,

It’s me again with a new update on the Untold Engine — this time focused on user experience and performance.

You might find this a bit odd coming from an engineer, but user experience matters a lot to me. Sometimes, I even see it as more important than performance itself. I know, that sounds backwards. But honestly, I don’t care how fast a tool is if the user experience is bad. If it’s frustrating to use, then to me, it’s not a good product.

So let’s start with the user-experience improvements I’ve been working on.

BTW, you can read more about version 0.9.0 of the engine here.

Quick USDZ Preview

I was never happy with the fact that every time I wanted to render a model with the Untold Engine, I had to create a full project first. That felt unnecessary and slow.

So I added a Quick Preview feature.

You can now preview a .usdz file directly without creating or importing it into a project. Just click the Quick Preview button, select your .usdz file, and you’re good to go.

Improved Importing Workflow

Next up: importing.

The old importing workflow was confusing at times and a bit error-prone. It was too easy to accidentally import a model into the wrong category, which is never a good experience.

Now, when you click Import, you’re explicitly asked what you want to import. This makes the process clearer and significantly reduces the chances of loading assets into the wrong place.

Scenegraph Parenting Support

At some point, I realized I really wanted to create parent–child relationships between entities directly from the editor — but the Scenegraph didn’t support that at all.

So I added it.

You can now parent entities directly in the Scenegraph by dragging one entity onto another.
To unparent an entity, just right-click it in the Scenegraph and select Unparent.

That said, I think I can make the hierarchy more visually obvious. That might be the next thing I tackle so parent–child relationships are easier to spot at a glance.

Viewport Child Selection

This one was a complete oversight on my end.

If an entity had multiple child meshes and you tried to select one of those meshes in the viewport using a right-click, the parent entity would get selected instead. That’s… not great.

This was a terrible user experience, so I made it a priority to fix.

You can now select child entities directly in the viewport using Shift + Right Mouse Click, which makes working with hierarchical scenes much more intuitive.


Now, let’s talk about performance improvements.

LOD System

The Untold Engine now supports an LOD system.

You can assign an LOD Component to an entity, provide multiple versions of a mesh, and the engine will automatically select the appropriate LOD based on distance. This is especially useful when you want to maintain a steady 60 FPS without rendering fully detailed meshes when they’re not needed.

Static Batching System

The engine now also supports Static Batching.

This is extremely useful for scenes with a large number of static objects. By batching these meshes together, the engine can significantly reduce the number of draw calls it needs to execute.

In one test scene, draw calls dropped from over 2,000 to just 34. That’s a massive improvement and makes a huge difference in frame stability.


That’s all for now.

If you want to follow the development of the engine and stay up to date with future updates, make sure to check out the project on GitHub.

Thanks for reading.