
PREAMBLE

BLOCK COMPRESSION HO!

Man, I've used texture compression my entire Tech Art career, and of course unknowingly most of my life between PNGs and JPGs etc. I remember quite vividly back in 2014 when Shane Calimlim first explained to me the difference between DXT1 and DXT5. And yet, for years, my Tech Artist "expertise" on the topic boiled down to: "BC4 is for grayscale, BC5 is for Normal Maps etc." For a Tech Artist who loves getting pretty low-level, I was no longer happy with that. Sure, I had read the DirectX documentation before, but for someone like me, simply reading through white papers or watching GDC talks doesn't stick. In order to thoroughly grasp these algorithms, I need to implement them myself, and that's what I did.

 

This article is here to talk about how I wrote my own BC1 and BC7 file formats! To be clear, my Block Compression is purely software-based, writing and reading the images back out. There is no GPU hardware decoding, or anything like that involved.

 

My articles always kind of waffle between tutorial and anecdote. I'll try to give an overview of the basics, and focus on some parts that I really struggled with. If you'd like to know more, please feel free to reach out, I'd love to hear from ya!

SOURCES (GOOD AND MEH)

When looking for Block Compression documentation, I'm sure a lot of people first stumble onto the official Direct3D one:

Since BC1 is so straightforward, that documentation is all I needed to implement it. However, BC7 was another story. Frankly, I found the Direct3D documentation lacking with regard to BC7. For example, it doesn't even have a full list of all the partition sets! I jotted down copious amounts of notes and scribbles, trying to keep up with all the terminology: subsets, p-bits etc. I got to a point where I realized...this documentation wasn't enough for me. That's when I stumbled upon even better documentation:

BPTC is the OpenGL equivalent of BC7, and MAN this documentation illuminated so many of the holes that Direct3D left me with: particularly with p-bits, and a FULL list of Partition Tables for Two Subsets and Three Subsets!! I highly recommend using the Khronos Group documentation for the BC7 breakdown instead of the Direct3D one. Sometimes, it's just a matter of finding good sources and test data!

Lastly, I have to mention Nathan Reed's excellent breakdown of the seven Block Compression types. I'm sure plenty of folks are already familiar with it; I know I certainly read it a few years back. He breaks each down better than I ever could.  Great quick-read resource, pictures and all. I love his line about: "BC2 is a bit of an odd duck.." haha

NOT A FAN OF PYTHON

Yes, I used Python 3 and the PyCharm IDE to develop this project. Let me be the first guy to say I dislike programming in Python very much, and this project only solidifies my preference for C++ over Python. Why did I do it then? Python is certainly an odd choice for bitwise-heavy algorithms. Well, I wanted to challenge myself to become stronger in Python. I was inspired by one of John Carmack's keynote speeches, where he described how, when he was learning Haskell, he found that immersing yourself in a heavy-duty project is the only real way to get to know the strengths and weaknesses of a language. TO BE CONTINUED

 

The biggest advantage of Python, in my opinion, is its ubiquity across Software Tools such as SideFX Houdini and Autodesk Maya. Using Python as a glue to interface between Houdini and a proprietary engine blah blah blah is excellent!
 

BC1

THE BASICS

Alright, let's quickly go over the basics for anyone unfamiliar. Why don't videogames simply read in PNGs or TGAs while the game is running? I mean, we certainly use those formats when working in our editors. However, those conventional formats are not hardware supported. The BCn family of texture compression formats is directly supported by modern GPUs, and can be decompressed very quickly!
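To give a feel for why decoding is so cheap, here is a minimal Python sketch of unpacking a single BC1 block, following the layout described in the Direct3D documentation: two RGB565 endpoint colors followed by sixteen 2-bit indices. The function names (`rgb565_to_rgb888`, `decode_bc1_block`) are made up for illustration, and this is not the actual code from my tool. The integer rounding of the in-between colors is also an assumption, since exact rounding can differ between decoders.

```python
import struct


def rgb565_to_rgb888(c):
    """Expand a 16-bit 5:6:5 color to 8 bits per channel by bit replication."""
    r = (c >> 11) & 0x1F
    g = (c >> 5) & 0x3F
    b = c & 0x1F
    return ((r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2))


def decode_bc1_block(block):
    """Decode one 8-byte BC1 block into 16 (r, g, b, a) tuples, row by row."""
    # Little-endian: color0 (16 bits), color1 (16 bits), indices (32 bits).
    c0, c1, indices = struct.unpack("<HHI", block)
    p0 = rgb565_to_rgb888(c0)
    p1 = rgb565_to_rgb888(c1)
    if c0 > c1:
        # Four-color mode: two extra colors at 1/3 and 2/3 between the endpoints.
        p2 = tuple((2 * a + b) // 3 for a, b in zip(p0, p1))
        p3 = tuple((a + 2 * b) // 3 for a, b in zip(p0, p1))
        palette = [p0 + (255,), p1 + (255,), p2 + (255,), p3 + (255,)]
    else:
        # Three-color mode: one midpoint color plus transparent black.
        p2 = tuple((a + b) // 2 for a, b in zip(p0, p1))
        palette = [p0 + (255,), p1 + (255,), p2 + (255,), (0, 0, 0, 0)]
    # Two bits per texel, with texel 0 stored in the least significant bits.
    return [palette[(indices >> (2 * i)) & 0x3] for i in range(16)]


if __name__ == "__main__":
    # Red and blue endpoints; the first four texels use palette entries 0..3.
    demo = struct.pack("<HHI", 0xF800, 0x001F, 0b11100100)
    print(decode_bc1_block(demo)[:4])
```

The whole decode is a handful of shifts, masks and a four-entry table lookup per texel, which is exactly the kind of work that is easy to bake into hardware.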


To be clear: hardware texture compression does not only reduce memory footprint...it helps performance too!  BLAH BLAH BLAH
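For a rough sense of the memory side, the arithmetic can be sketched in a few lines of Python. It relies on the standard block sizes: a 4x4 block is 8 bytes in BC1 and 16 bytes in BC7, versus 4 bytes per pixel for uncompressed RGBA8. `texture_bytes` is a hypothetical helper written for this example; it covers a single mip level and ignores any file header.

```python
def texture_bytes(width, height, fmt):
    """Size in bytes of one mip level, ignoring mipmaps and file headers."""
    if fmt == "RGBA8":
        return width * height * 4
    bytes_per_block = {"BC1": 8, "BC7": 16}[fmt]
    # Block formats always store whole 4x4 blocks, so round dimensions up.
    blocks_x = (width + 3) // 4
    blocks_y = (height + 3) // 4
    return blocks_x * blocks_y * bytes_per_block


if __name__ == "__main__":
    for size in (256, 2048):
        for fmt in ("RGBA8", "BC7", "BC1"):
            kib = texture_bytes(size, size, fmt) / 1024
            print(f"{size}x{size} {fmt}: {kib:.0f} KiB")
```

At 256x256 that works out to 256 KiB for RGBA8, 64 KiB for BC7 and 32 KiB for BC1, so the gap between the two compressed formats is only 32 KiB. At 2048x2048 the same gap grows to 2 MiB.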

TODO:

Add BC1 Alpha vs Uncompressed Alpha Sphere etc

DOWNFALL OF BC1

TODO: Explain how rapid change of color (Noise) does not play well. Perhaps here is where Resolution affects results a lot. But at that point, at those resolutions, the memory footprint delta is only a few dozen KB, so perhaps for those BC7 is totally worth it.

CONCLUSION?

TODO: End BC1 Section somehow!

BC7

MORE DETAIL, MORE COMPLICATED

I went straight from the simplest format to the newest and most difficult one. Introduced with DirectX 11, BC7 was far more difficult to author. BLAH BLAH

Anecdotally, BC7 really saved my bacon! As a Tech Artist working on complex, interactive visuals, I encode a lot of Data into Vertex Color, UV Channels and Textures. We used all four channels of our Texture, each with special data. BC1-BC5 simply BUTCHERED our data, and so we had to resort to uncompressed RGBA8 textures. However, several months in, one of our Graphics Engineers implemented BC7 into our proprietary engine, and voilà, my data was mostly intact! BLAH BLAH TODO: talk about Blue Noise!

WHAT DO ALL THESE TERMS MEAN

As I mentioned in the Sources section above, I felt that Microsoft's Documentation on BC7 didn't lay out the terms clearly for me. This probably speaks to my shortcomings, but I needed to jot down notes and reread the paragraphs several times to finally grasp what the characteristics that make up the Modes meant. I hope this next section helps somebody understand how to implement a BC7 Tool faster than I did!

SUBSETS

TODO: Explain Subsets

PARTITIONS

TODO: Explain Partitions

INDICES

TODO: Explain This

P-BITS

TODO: Explain This

ENCODING THE MODE

TODO: Explain This

THE MODES

There are four opaque Modes, and I will go over them one by one, along with what I feel are their greatest strengths.

TODO: include chart breakdown for each Mode.

 

Also include examples of same image using different mode one at time

Mode 0

Best for Blocks with a high color delta; it might be the best Mode for "Blue Noise" because it can (potentially) capture 24 unique colors! However, it has the weakest Color Depth, and is very bad at capturing gradients.

Mode 1

Because of its 3-Bit Indices, Color Depth, and Partitions, this Mode is the best at smooth, accurate gradients. Having only two Subsets means it's no good with "junction points" where colors shift in the image, though. So as long as the "hue" is similar across the Block, it's golden!

Mode 2

Great with a limited number of high-contrast colors. Similar to Mode 0, but it gains more Partition choices in exchange for measly 2-Bit Indices. If there are no more than six colors, it'll preserve the data very well.

Mode 3

Has the highest Color Depth, and can capture a limited number of colors very accurately. The best Mode if the Block can be split into two distinct gradient patterns. Of the four opaque Modes, it captures the fewest unique colors (8 at best)...
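The color counts quoted for each Mode fall straight out of the Mode parameters: each Subset gets its own palette of 2^(index bits) colors. Below is a small Python sketch of the four opaque Modes using the field sizes from the Khronos BPTC and Direct3D BC7 tables. The names `Bc7Mode`, `palette_size` and `block_bits` are invented for this example, and the bit-budget check is just a sanity test to show that every Mode fills exactly 128 bits.

```python
from collections import namedtuple

# color_bits is per channel per endpoint; p_bits is the total for the block.
Bc7Mode = namedtuple("Bc7Mode", "subsets partition_bits color_bits p_bits index_bits")

OPAQUE_MODES = {
    0: Bc7Mode(subsets=3, partition_bits=4, color_bits=4, p_bits=6, index_bits=3),
    1: Bc7Mode(subsets=2, partition_bits=6, color_bits=6, p_bits=2, index_bits=3),
    2: Bc7Mode(subsets=3, partition_bits=6, color_bits=5, p_bits=0, index_bits=2),
    3: Bc7Mode(subsets=2, partition_bits=6, color_bits=7, p_bits=4, index_bits=2),
}


def palette_size(mode):
    """Most distinct colors one block can hold: one palette per Subset."""
    return mode.subsets * (1 << mode.index_bits)


def block_bits(number, mode):
    """Total bits used by a block of this Mode (should always be 128)."""
    mode_bits = number + 1  # Mode N is stored as N zero bits, then a one bit
    endpoint_bits = mode.subsets * 2 * 3 * mode.color_bits  # 2 endpoints, RGB
    # Each Subset's first (anchor) index drops its top bit, saving one bit.
    index_bits = 16 * mode.index_bits - mode.subsets
    return mode_bits + mode.partition_bits + endpoint_bits + mode.p_bits + index_bits


if __name__ == "__main__":
    for number, mode in OPAQUE_MODES.items():
        print(f"Mode {number}: up to {palette_size(mode)} colors, "
              f"{1 << mode.partition_bits} partitions, "
              f"{block_bits(number, mode)} bits")
```

That gives 24, 16, 12 and 8 colors for Modes 0 through 3, which lines up with the strengths and weaknesses described above. Note that the effective Color Depth is one bit higher than `color_bits` in the Modes that carry p-bits.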

TIMING

No doubt about it, BC7 takes dreadfully longer than BC1 to write out an image. This is because, for each 4x4 Block, BC7 needs to compare "the score" of every Mode to determine which Mode is best suited. On top of that, every Mode needs to choose the appropriate Partition.
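To put a number on that search, here is a hedged Python sketch of a brute-force pick over the four opaque Modes only. The partition counts come from the BC7 Mode table (16 for Mode 0, 64 for Modes 1 to 3). Everything else is an assumption for illustration: `candidate_settings` and `pick_best` are invented names, `score` is a placeholder for whatever error metric an encoder uses, and "lower is better" is my assumed convention, not necessarily how my tool scores things.

```python
PARTITIONS_PER_MODE = {0: 16, 1: 64, 2: 64, 3: 64}


def candidate_settings():
    """Yield every (mode, partition) pair an exhaustive encoder would try."""
    for mode, count in PARTITIONS_PER_MODE.items():
        for partition in range(count):
            yield mode, partition


def pick_best(block, score):
    """Return the (mode, partition) pair with the lowest score for this block.

    `score(block, mode, partition)` is a caller-supplied error function.
    """
    return min(candidate_settings(), key=lambda mp: score(block, mp[0], mp[1]))


if __name__ == "__main__":
    # Candidates per block for the opaque Modes alone.
    print(sum(PARTITIONS_PER_MODE.values()))

    # Dummy score that pretends Mode 2, Partition 5 fits this block best.
    def fake_score(block, mode, partition):
        return abs(mode - 2) + abs(partition - 5)

    print(pick_best(None, fake_score))
```

That is 208 Mode and Partition combinations to evaluate per 4x4 Block before even thinking about the alpha Modes, whereas BC1 has a single layout and no Partitions at all.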

MAYBE PUT THIS EARLIER IN THE BC7 SECTION!!

TODO: Add Chart Showing How Long BC7 vs BC1

REMINDS ME OF MEMORY MAPPERS

Man, I LOVE bitwise operations and radical optimization. By the time I started on BC7, I realized that these Block Compression formats reminded me a lot of a few years back, when I was implementing Memory Mappers for Conntendo. I'll give the analogy that BC1 is like the UxROM Mapper while BC7 is like MMC5 BLAH BLAH

TODO: Add Noise Comparison Image

TODO: Add RyuNina Image with Debug Color the Modes
 

CONCLUSION?

TODO: End BC7 Section somehow!
