BLOCK COMPRESSION | connburanicz

Girl Who Thinks She is Piggie

128x128, notice especially around the straw and the children's mouths in the background

Self Portrait 2023

64x64, the BC7 is very close, but notice the lower face is completely 4-Color "purplelized"

Megaman Legends

256x256, the BC1 Blocks are especially noticable on Rock's leg and the Van

Girl Who Thinks She is Piggie

128x128, notice especially around the straw and the children's mouths in the background

1/3

PREAMBLE

BLOCK COMPRESSION HO!

Man, I've used texture compression my entire Tech Art career, and of course unknowingly most of my life between PNGs and JPGs etc. I remember quite vividly back in 2014 when Shane Calimlim first explained to me the difference between DXT1 and DXT5. And yet, for years, my Tech Artist "expertise" on the topic boiled down to: "BC4 is for grayscale, BC5 is for Normal Maps etc." For a Tech Artist who loves getting pretty low-level, I was no longer happy with that. Sure, I read the DirectX documentation before, but for someone like me, simply reading through white papers or watching GDC talks, it doesn't stick. In order to thoroughly grasp these algorithms, I need to implement them myself: and that's what I did.

This article is here to talk about how I wrote my own BC1 and BC7 file formats! To be clear, my Block Compression is purely software-based, writing and reading the images back out. There is no GPU hardware decoding, or anything like that involved.

My articles always kind of waffle between tutorial and anecdote. I'll try to give an overview of the basics, and focus on some parts that I really struggled with. If you'd like to know more, please feel to reach out, I'd love to hear from ya!

SOURCES (GOOD AND MEH)

When looking for Block Compression documentation, I'm sure alot of people first stumble onto the official Direct3D one:

Texture Block Compression in Direct3D 11

Since BC1 is so straightforward, that documentation is all I needed to implement it. However, BC7 was another story. Frankly, I found the Direct3D documentation lacking in regards to BC7. For example, it doesn't even have a full list of all the partition sets! I jotted down copious amounts of notes and scribbles, trying to keep up with all the terminologies: subsets, p-bits etc. I got to a point where I realized...this documentation wasn't enough for me. That's when I stumbled upon even better documentation:

Khronos Group BPTC Overview for OpenGL

BPTC is the OpenGL equivilent of BC7, and MAN this documentation illuminated so many of the holes that Direct3D left me with: particularly with p-bits, and a FULL list of Partition Tables for Two-Subset and Three Subsets!! I highly recommend using Khronos Group for BC7 breakdown instead of the Direct3D one. Sometimes, its just a matter of finding good sources and test data!

Lastly, I have to mention Nathan Reed's excellent breakdown of the seven Block Compression types. I'm sure plenty of folks are already familiar with it; I know I certainly read it a few years back. He breaks each down better than I ever could. Great quick-read resource, pictures and all. I love his line about: "BC2 is a bit of an odd duck.." haha

Understand BCN Texture Compression

My Block Compression Scribbles

Front and back, these papers helped me organize my thoughts, keep a To-Do List, and mad ravings!

My Block Compression Scribbles

Front and back, these papers helped me organize my thoughts, keep a To-Do List, and mad ravings!

1/1

NOT A FAN OF PYTHON

Yes, I used Python 3 and PyCharm IDE to develop this project. Let me be the first guy to say I dislike programming in Python very much, and this project only solidifies my preference for C++ over Python. Why did I do it then? Python is certainly an odd choice for bitwise-heavy algorithims. Well, I wanted to challenge myself to become stronger in Python. I was inspired by one of John Carmack's keynote speech where he described how when he learning Haskell, imerssing yourself in a heavy-duty project is the only real way to get to know the strengths and weaknesses of a language TO BE CONTINUED

The biggest advantage of Python, in my opinion is it's ubitiqity across Software Tools such as SideFX Houdini and Autodesk Maya. Using Python as a glue to interface between Houdini to a proprietary engine blah blah blah is excellent!

BC1

THE BASICS

Alright, let's quickly go over the basics for anyone unfamiliar. Why don't videogames simply read-in PNGs or TGAs while the game is running? I mean, we certainly use those formats when working in our editors. However, those conventional formats are not hardware supported. The BCn family of texture compressions are directly supported by modern GPUs, and can be decompressed very quickly!

To be clear: hardware texture compression does not only reduce memory footprint...it helps performance too! BLAH BLAH BLAH

BC1 Vs Downsampling

A 256x256 image of Robocop. Notice how much more detail is preserved using BC1 compared to downsampling to get equivilent memory footprint!

BC1 Vs Downsampling

A 256x256 image of Robocop. Notice how much more detail is preserved using BC1 compared to downsampling to get equivilent memory footprint!

1/1

TODO:

Add BC1 Alpha vs Uncompressed Alpha Sphere etc

Sphere Alpha Comparison

Comparing BC1's One-Bit Stencil Alpha to full 8-Bit Alpha

Message Alpha Comparison

Showcasing how much the Alpha Threshold changes the image

Sphere Alpha Comparison

Comparing BC1's One-Bit Stencil Alpha to full 8-Bit Alpha

1/2

DOWNFALL OF BC1

TODO: Explain how rapid change of color (Noise) does not play well. Perhaps here is where Resolution affects results alot. But at that point, with resolutions the memory footprint delta between is only a few dozen KB, so perhaps for those BC7 is totally worth it.

Compression vs Resolution

The Top Row is the original, uncompressed images decreasing in resolution. The Bottom Row are images using BC1. Notice how BC1 has a harder time maintaining the image the lower the resolution becomes

Compression vs Resolution

The Top Row is the original, uncompressed images decreasing in resolution. The Bottom Row are images using BC1. Notice how BC1 has a harder time maintaining the image the lower the resolution becomes

1/1

CONCLUSION?

TODO: End BC1 Section somehow!

BC7

MORE DETAIL, MORE COMPLICATED

I went straight from the simplest to the most difficult and newest format. Introduced with DirectX 11, BC7 was exponentially more difficult to author. BLAH BLAH

Anecdotally, BC7 really saved my bacon! As a Tech Artist working on, complex, interactive visuals: I encode alot of Data into Vertex Color, UV Channels and Textures. We used all four channel of our Texture, each with special data. BC1-BC5 simply BUTCHERED our data, and so we had to resort to uncompressed RGBA8 textures. However, several months in, one of Graphics Engineers implemented BC7 into our proprietary engine, and viola, my data was mostly intact! BLAH BLAH TODO: talk about Blue Noise!!!!!!!!!!!

Nina Closeup Compared

64x64 Closeup of Nina's Face. BC7 is able to match original pretty closely while BC1 struggles of course with the usual issues

Nina Closeup Compared

64x64 Closeup of Nina's Face. BC7 is able to match original pretty closely while BC1 struggles of course with the usual issues

1/1

WHAT DO ALL THESE TERMS MEAN

As I mentioned in the Sources section above, I felt that Microsoft's Documentation on BC7 didn't layout the terms clearly for me. This probably speaks of my shortcomings, but I needed to jot down notes, and reread the paragraphs several times to finally grasp what the characteristics of that make up the Modes meant. I hope this next section helps somebody understand how to implement a BC7 Tool faster than I did!

SUBSETS

TODO: Explain Subsets

PARTITIONS

TODO: Explain Subsets

INDICES

TODO: Explain This

P-BITS

TODO: Explain This

ENCODING THE MODE

TODO: Explain This

Modes Visualized: Ryu and Nina

256x256, it's funny how Mode3 was selected for the entire white border around the characters, creating a green border

Modes Visualized: Girl is Piggie

A pretty even distribution between the four opaque BC7 Modes, with Mode1 dominating as usual

Modes Visualized: Ryu and Nina

256x256, it's funny how Mode3 was selected for the entire white border around the characters, creating a green border

1/2

THE MODES

Opaque has four Modes, and I will go over them one-by-one and what I feel are their greatest strengths

TODO: include chart breakdown for each Mode.

Also include examples of same image using different mode one at time

Mode 0

Best for Blocks with high color delta; it might be the best Mode for "Blue Noise" because it can (potentially) capture 24 unique colors! However, it has the weakest Color Depth, and is very bad with capturing gradients.

Mode 1
Because of 3-Bit Indices, Color Depth, and Partitions, this Mode is the best at smooth, accurate gradients. Two Subsets means its no good with "junction points" where colors shift in the image though. So as long as the "hue" is simular across the block, it's golden!

Mode 2
Great with a limited amount of high-contrast colors. Simular to Mode 0, but swapping more partition choices for measly 2-Bit Indices. If there are no more than six colors, it'll preserve the data very well.

Mode 3
Has the highest Color Depth, and can capture a limited number of colors very accurately. The best Mode for if the Block can be split into two distinct gradient patterns. Of the four opaque Modes, it can (at best) capture 8 unique colors...

BC7 MODE 0

The Only Mode with 16 Partition Choices

BC7 MODE 0

The Only Mode with 16 Partition Choices

1/1

BC7 Mode 1

Only Mode with Shared P-Bit, so it's extra difficult for that P-Bit to truly be useful

BC7 Mode 1

Only Mode with Shared P-Bit, so it's extra difficult for that P-Bit to truly be useful

1/1

BC7 Mode 2

Three Subsets, with high Partition Selection means it has the best chance of finding the right "jigsaw piece"

BC7 Mode 2

Three Subsets, with high Partition Selection means it has the best chance of finding the right "jigsaw piece"

1/1

BC7 Mode 3

Great Color Depth means it could be very good with a gradient block, but 2-Bit Indices means it could also be pretty bad!

BC7 Mode 3

Great Color Depth means it could be very good with a gradient block, but 2-Bit Indices means it could also be pretty bad!

1/1

THE MODES

Opaque has four Modes, and I will go over them one-by-one and what I feel are their greatest strengths

TODO: include chart breakdown for each Mode.

Also include examples of same image using different mode one at time

BC7 Modes Compared

Each Image shows compression using one Mode Only. Helps showcase the strengths/weaknesses between the Modes

Mode Strengths per Corner

A 16x16 Image where each 8x8 Corner was tailor-made to show the best of Modes 0, 1, 2, and 3

BC7 Modes Compared

Each Image shows compression using one Mode Only. Helps showcase the strengths/weaknesses between the Modes

1/2

Modes Compared: Delta

The Pixels are compared with original. The whiter the pixel the more inaccurate the Pixel has become compared to the original

BC7 Modes Compared with Robocop

Here is Robocop completely compressed using one Mode each. By far the biggest difference you'll notice is the background:: where Mode 0 and 2 have alot more banding because of the weakness with gradients. The reflections in Robocop's suit come out slightly stronger though

Modes Compared: Delta

The Pixels are compared with original. The whiter the pixel the more inaccurate the Pixel has become compared to the original

1/2

TIMING

No doubt about it, BC7 takes dreadfully than longer than BC1 to write out an image. This is because, for each 4x4 Block, BC7 needs to compare "the score" of every Mode to determine which is the most suited Mode to use. On top of that, every Mode needs to choose the appropriate Partition

MAYBE PUT THIS EARLIER IN THE BC7 SECTION!!

TODO: Add Chart Showing How Long BC7 vs BC1

Modes Visualized: Partition Tables Checker Layout

Mode2 is selected for the 3-Subset Blocks while Mode3 is selected for the 2-Subset Blocks as they should be

Modes Visualized: Partition Tables HalfHalf Layout

More examples of Modes being correctly chosen based on which Mode has the correct number of Partitions and best Color Depth

Modes Visualized: Color Gradient

For gradients, Mode1 dominates because of its 6-Bit (plus P-Bit) Color Depth and its 3-Bit Indices. Although Mode 2 and 3 win out as well

Modes Visualized: Partition Tables Checker Layout

Mode2 is selected for the 3-Subset Blocks while Mode3 is selected for the 2-Subset Blocks as they should be

1/3

REMINDS ME OF MEMORY MAPPERS

Man, I LOVE bitwise operations and radical optimization. By the time I started on BC7, I realized that these Block Compression formats reminded alot of a few years back when I was implementing Memory Mappers for Conntendo. I'll give the anology that BC1 is like the UxROM Mapper while BC7 is like MMC5 BLAH BLAH

TODO: Add Noise Comparison Image

TODO: Add RyuNina Image with Debug Color the Modes

CONCLUSION?

TODO: End BC7 Section somehow!