Dev Diary #9: PNG Decoding
The content below is rendered by parsing PNG files and drawing them to a Canvas. I did this as a way to test a library I’m developing, structo.ts.
| Width | ?px |
| Height | ?px |
| Bit Depth | ?bits |
| Format | ? |
| Interlacing | None |
You can select the options at the bottom to show which filter was used for each line.
I’m not going to go into precise detail about how the format is laid out; for that I’d recommend this spec. But I am gonna give a brief overview to help you out if you’re trying to implement it yourself.
Format
A PNG image is an 8-byte signature followed by a series of chunks; each chunk holds some data and a checksum. Here’s the pseudocode for how each chunk is laid out:
{
    length: u32("big-endian"),
    type: bytes(4),
    data: bytes(length),
    crc: crc32(type + data),
}
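To make the layout above concrete, here is a minimal sketch that walks the chunks with a plain DataView rather than structo. The names `RawChunk`, `crc32` and `readChunks` are made up for this example, and it assumes the whole file is already in memory. Note that the checksum covers the type and data fields, but not the length.

```typescript
// Hypothetical names throughout; this is plain DataView code, not structo.
type RawChunk = { type: string; data: Uint8Array; crcOk: boolean };

// Standard CRC-32 (reflected polynomial 0xEDB88320), bit by bit for brevity.
function crc32(bytes: Uint8Array): number {
    let crc = 0xffffffff;
    for (let i = 0; i < bytes.length; i++) {
        crc ^= bytes[i];
        for (let bit = 0; bit < 8; bit++) {
            crc = crc & 1 ? (crc >>> 1) ^ 0xedb88320 : crc >>> 1;
        }
    }
    return (crc ^ 0xffffffff) >>> 0;
}

function readChunks(file: Uint8Array): RawChunk[] {
    const view = new DataView(file.buffer, file.byteOffset, file.byteLength);
    const chunks: RawChunk[] = [];
    let offset = 8; // skip the 8-byte PNG signature
    // Every chunk needs at least 12 bytes: length + type + crc.
    while (offset + 12 <= file.byteLength) {
        const length = view.getUint32(offset); // DataView is big-endian by default
        if (offset + 12 + length > file.byteLength) break; // truncated chunk
        const typeAndData = file.subarray(offset + 4, offset + 8 + length);
        let type = "";
        for (let i = 0; i < 4; i++) type += String.fromCharCode(typeAndData[i]);
        const storedCrc = view.getUint32(offset + 8 + length);
        chunks.push({
            type,
            data: typeAndData.subarray(4),
            crcOk: crc32(typeAndData) === storedCrc, // CRC covers type + data
        });
        offset += 12 + length;
    }
    return chunks;
}
```

The bitwise CRC is slow but short; a real decoder would probably use a lookup table instead.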
The main chunk types are the Image Header (IHDR) and Data Chunks (IDAT).
Image Header (IHDR)
This is always the first chunk in a PNG file. It holds the core metadata every file must have:
- The image dimensions
- The bit depth
- The colour format
- Whether the image is interlaced
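These fields are enough to work out how many bytes each pixel and each line of pixels takes up, which a decoder needs later on. The sketch below assumes a non-interlaced image; `scanlineLayout` is a hypothetical helper, and the channel counts come from the colour types in the PNG spec (0 greyscale, 2 RGB, 3 palette index, 4 greyscale with alpha, 6 RGBA).

```typescript
// Channels per pixel for each PNG colour type.
const CHANNELS: Record<number, number> = { 0: 1, 2: 3, 3: 1, 4: 2, 6: 4 };

// Hypothetical helper: derives the byte sizes a decoder needs from IHDR fields.
function scanlineLayout(width: number, bitDepth: number, colorType: number) {
    const channels = CHANNELS[colorType];
    if (channels === undefined) {
        throw new Error(`Unknown colour type ${colorType}`);
    }
    const bitsPerPixel = channels * bitDepth;
    return {
        // Rounded up, so pixels smaller than a byte still count as 1.
        bytesPerPixel: Math.ceil(bitsPerPixel / 8),
        // Pixel bytes per line, excluding the leading filter-type byte.
        bytesPerLine: Math.ceil((width * bitsPerPixel) / 8),
    };
}
```

For example, a 10 pixel wide 8-bit RGBA image has 4 bytes per pixel and 40 bytes per line.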
Data Chunks (IDAT)
This is a series of consecutive chunks that contain the actual image data, compressed using DEFLATE.
The data is often split across several chunks that are concatenated together to form the full stream. Because each chunk carries its own checksum, a decoder can verify and start processing the data as it arrives instead of downloading the full file before rendering.
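As a rough sketch of that step, the IDAT payloads can be joined in file order and inflated in one go. This version assumes Node.js and its built-in `node:zlib` module; `inflateIdat` is a hypothetical name. The joined bytes form a zlib-wrapped DEFLATE stream, so in a browser `DecompressionStream("deflate")` should be the equivalent.

```typescript
import { inflateSync } from "node:zlib";

// Hypothetical helper: joins IDAT payloads in file order and inflates them.
function inflateIdat(idatPayloads: Uint8Array[]): Uint8Array {
    const total = idatPayloads.reduce((n, part) => n + part.byteLength, 0);
    const joined = new Uint8Array(total);
    let offset = 0;
    for (const part of idatPayloads) {
        joined.set(part, offset);
        offset += part.byteLength;
    }
    // The joined bytes are one zlib stream; chunk boundaries carry no meaning.
    return inflateSync(joined);
}
```

This buffers everything before decompressing, which is simpler but gives up the streaming benefit described above.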
Filters
To take better advantage of the compression algorithm, each line of pixels can have a filter applied to it, selected by a prefix byte at the start of each scanline.
Filters operate at the byte level, not the pixel level. A well-chosen filter turns similar neighbouring bytes into runs of small values, which compress better and reduce file size.
Each filter works by subtracting a value when encoding and adding it back when decoding, making it a lossless modification.
- None (0)
  - No filtering is applied.
- Sub (1)
  - Uses the left pixel in a line to affect the current one.
  - Encode: bytes[x] - bytes[x - bpp]
  - Decode: bytes[x] + bytes[x - bpp]
- Up (2)
  - Uses the above pixel to affect the current one.
  - Encode: bytes[x] - previousLine[x]
  - Decode: bytes[x] + previousLine[x]
- Average (3)
  - Uses both the above and left pixels to affect the current one.
  - Encode: bytes[x] - floor((bytes[x - bpp] + previousLine[x]) / 2)
  - Decode: bytes[x] + floor((bytes[x - bpp] + previousLine[x]) / 2)
- Paeth (4)
  - Uses the three neighbouring pixels (left, above and top-left) to calculate the current one.
  - Encode: bytes[x] - PaethPredictor(bytes[x - bpp], previousLine[x], previousLine[x - bpp])
  - Decode: bytes[x] + PaethPredictor(bytes[x - bpp], previousLine[x], previousLine[x - bpp])
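Notes:
- bpp is bytes per pixel; subtracting it means going to the same byte in the previous pixel
- previousLine is the bytes for the line above this one
- When decoding, bytes[x - bpp] and previousLine refer to bytes that have already been decoded
- For out of bounds access, assume all zeros
- The final addition or subtraction is performed mod 256 to enable wraparound. The sum inside Average and the arithmetic inside PaethPredictor are done without wraparound.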
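Putting the decode rules and notes together, reversing the filter on one scanline might look like the sketch below. `unfilterLine` and its parameters are names chosen for this example, and it decodes in place so that bytes to the left are already decoded by the time they are read. The predictor follows the definition in the PNG spec.

```typescript
// Picks whichever of left (a), above (b) or top-left (c) is closest to a + b - c.
function paethPredictor(a: number, b: number, c: number): number {
    const p = a + b - c;
    const pa = Math.abs(p - a);
    const pb = Math.abs(p - b);
    const pc = Math.abs(p - c);
    if (pa <= pb && pa <= pc) return a;
    return pb <= pc ? b : c;
}

// Hypothetical helper: reverses one scanline's filter in place.
// `line` holds the filtered bytes without the filter-type prefix;
// `previousLine` is the already-decoded line above, or null for the first line.
function unfilterLine(
    filterType: number,
    line: Uint8Array,
    previousLine: Uint8Array | null,
    bpp: number,
): void {
    for (let x = 0; x < line.length; x++) {
        // Out-of-bounds neighbours count as zero.
        const left = x >= bpp ? line[x - bpp] : 0;
        const up = previousLine ? previousLine[x] : 0;
        const upLeft = previousLine && x >= bpp ? previousLine[x - bpp] : 0;
        let predicted: number;
        switch (filterType) {
            case 0: predicted = 0; break;
            case 1: predicted = left; break;
            case 2: predicted = up; break;
            case 3: predicted = (left + up) >> 1; break; // sum is not wrapped
            case 4: predicted = paethPredictor(left, up, upLeft); break;
            default: throw new Error(`Unknown filter type ${filterType}`);
        }
        line[x] = (line[x] + predicted) & 0xff; // wrap around mod 256
    }
}
```

A full decoder would call this once per scanline, passing the line it just decoded as `previousLine` for the next one.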
Interestingly, most files I tested did not actually use any filter other than None,
which meant they were essentially bitmaps with DEFLATE compression.
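One way to check this for a given file is to tally the filter byte at the start of every scanline. This is only a sketch: `countFilterTypes` is a hypothetical helper, and it assumes a non-interlaced image whose IDAT data has already been decompressed.

```typescript
// Hypothetical helper: counts how often each filter type (0-4) prefixes a scanline.
// `inflated` is the decompressed IDAT data and `bytesPerLine` is the scanline
// length without its filter-type byte.
function countFilterTypes(inflated: Uint8Array, bytesPerLine: number): number[] {
    const counts = [0, 0, 0, 0, 0];
    const stride = bytesPerLine + 1; // one filter byte, then the pixel bytes
    for (let offset = 0; offset < inflated.length; offset += stride) {
        const filterType = inflated[offset];
        // Ignore values outside the five defined filter types.
        if (filterType < counts.length) counts[filterType]++;
    }
    return counts;
}
```

The result is indexed by filter type, so a file that only uses None puts every scanline in the first slot.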
Structo
While learning the PNG format was a main goal of this project, I also wanted to try implementing something using [structo](https://github.com/Ben-Brady/structo-ts).
This is the full code for parsing the PNG file:
import * as st from "@nnilky/structo";

export type PngImage = {
    data: ArrayBuffer;
    width: number;
    height: number;
    bitDepth: number;
    colorType: number;
    isInterlaced: boolean;
};

export function parsePngFile(data: ArrayBuffer): PngImage {
    const file = st.read(PngFile, data);

    // IHDR carries the dimensions and pixel format, so it must be present.
    const chunk = file.chunks.find(v => v.type === "IHDR");
    if (!chunk) throw new Error("IHDR not found");
    const ihdr = st.read(IHDR, chunk.data);

    // Concatenated IDAT payloads, still compressed.
    const imageData = getImageData(file.chunks);
    return {
        data: imageData,
        width: ihdr.width,
        height: ihdr.height,
        bitDepth: ihdr.bitDepth,
        colorType: ihdr.colorType,
        isInterlaced: ihdr.interlace === 1,
    };
}

const MagicNumber = st.pipe(
    st.bytes(8),
    st.toHex(),
    st.literal("89504E470D0A1A0A"),
);

const lengthValue = st.createReference<number>();
const dataValue = st.createRememberedValue<ArrayBuffer>();

type Chunk = st.InferOutput<typeof Chunk>;
const Chunk = st.object({
    length: lengthValue.pointer(st.u32("big")),
    type: st.pipe(st.bytes(4), st.toAscii()),
    data: dataValue.save(st.sizedBytes(lengthValue.deref())),
    crc: st.u32(),
});

type PngFile = st.InferOutput<typeof PngFile>;
const PngFile = st.object({
    header: MagicNumber,
    chunks: st.exhuastiveArray(Chunk),
});

type IHDR = st.InferOutput<typeof IHDR>;
const IHDR = st.object({
    width: st.u32("big"),
    height: st.u32("big"),
    bitDepth: st.u8(),
    colorType: st.u8(),
    compression: st.u8(),
    filter: st.u8(),
    interlace: st.u8(),
});

function getImageData(chunks: Chunk[]): ArrayBuffer {
    const dataChunks = chunks.filter(v => v.type === "IDAT");
    const size = sum(dataChunks.map(v => v.data.byteLength));
    const data = new Uint8Array(size);

    let offset = 0;
    for (const chunk of dataChunks) {
        data.set(new Uint8Array(chunk.data), offset);
        offset += chunk.data.byteLength;
    }
    return data.buffer;
}

const sum = (numbers: number[]) => numbers.reduce((a, b) => a + b, 0);