When developing an application that outputs PDFs such as forms, have you ever wanted to automatically test the final PDF output, layout included? In this article, I will show you how to run regression tests that cover the layout by rendering two PDF files to images and comparing them.
Rendering PDFs to images is easier than you might think with Apache PDFBox (https://pdfbox.apache.org/). Because the goal is an automated regression test, the test fails immediately, without comparing any pixels, if the number of pages or the page size differs.
The higher the rendering DPI, the finer the comparison, because the images have a higher resolution, but more machine resources (CPU, memory) are required accordingly.
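import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;

// Assumes PDFBox 2.x (PDDocument.load) and JUnit 5 assertions (message as the last argument).
static void assertPdfEquals(InputStream expected, InputStream actual) throws IOException {
    try (PDDocument doc1 = PDDocument.load(expected);
         PDDocument doc2 = PDDocument.load(actual)) {
        // Fail if the number of pages differs
        assertEquals(doc1.getNumberOfPages(), doc2.getNumberOfPages());
        PDFRenderer renderer1 = new PDFRenderer(doc1);
        PDFRenderer renderer2 = new PDFRenderer(doc2);
        for (int i = 0; i < doc1.getNumberOfPages(); i++) {
            BufferedImage image1 = renderer1.renderImageWithDPI(i, 144, ImageType.RGB);
            BufferedImage image2 = renderer2.renderImageWithDPI(i, 144, ImageType.RGB);
            // Also fail if the rendered page sizes differ
            assertEquals(image1.getWidth(), image2.getWidth());
            assertEquals(image1.getHeight(), image2.getHeight());
            // Compare the images and write a diff image to a temporary file;
            // its path is reported in the failure message if they do not match
            Path path = Files.createTempFile("diff-" + i + "-", ".png");
            try (OutputStream os = Files.newOutputStream(path)) {
                assertTrue(compareImage(image1, image2, os),
                        "Page " + (i + 1) + " differs, diff image: " + path);
            }
        }
    }
}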
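To get a feel for the DPI trade-off mentioned above, the following is a rough sketch that estimates the size of a rendered page. It assumes that one PDF unit is 1/72 inch, that pixel dimensions are rounded down, and that an RGB image takes about 4 bytes per pixel in memory; the class name `RenderSizeEstimate` and the A4 dimensions are illustrative, not part of the original code.

```java
final class RenderSizeEstimate {
    /** PDF user-space units per inch (1 pt = 1/72 inch). */
    static final float POINTS_PER_INCH = 72f;

    /** Approximate pixel length of a page edge rendered at the given DPI (rounded down, at least 1). */
    static int pixels(float lengthPt, float dpi) {
        return (int) Math.max(Math.floor(lengthPt * dpi / POINTS_PER_INCH), 1);
    }

    /** Rough in-memory size of one rendered page, assuming 4 bytes per pixel. */
    static long estimatedBytes(float widthPt, float heightPt, float dpi) {
        return (long) pixels(widthPt, dpi) * pixels(heightPt, dpi) * 4L;
    }

    public static void main(String[] args) {
        // A4 portrait in points (assumed example page size)
        float widthPt = 595.276f;
        float heightPt = 841.89f;
        for (float dpi : new float[] {72f, 144f, 300f}) {
            double mib = estimatedBytes(widthPt, heightPt, dpi) / (1024.0 * 1024.0);
            System.out.printf("%.0f DPI: %d x %d px, about %.1f MiB per image%n",
                    dpi, pixels(widthPt, dpi), pixels(heightPt, dpi), mib);
        }
    }
}
```

Doubling the DPI roughly quadruples the pixel count, and the test holds two images per page at once, so 144 DPI is a moderate choice for A4-sized forms under these assumptions.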
Comparing images is not particularly difficult as long as you only check for an exact pixel-by-pixel match of RGB values. The point is not merely to compare, but to repaint the mismatched pixels in a highlight color to produce a diff image.
import java.awt.Color;
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.OutputStream;
import javax.imageio.ImageIO;

// Returns true if every pixel matches. Note that image1 is modified in place to become the diff image.
static boolean compareImage(BufferedImage image1, BufferedImage image2, OutputStream os) throws IOException {
    boolean matched = true;
    for (int x = 0; x < image1.getWidth(); x++) {
        for (int y = 0; y < image1.getHeight(); y++) {
            int p1 = image1.getRGB(x, y);
            int p2 = image2.getRGB(x, y);
            // Matching pixels are left as they are; mismatched pixels are repainted magenta
            if (p1 != p2) {
                matched = false;
                image1.setRGB(x, y, Color.MAGENTA.getRGB());
            }
        }
    }
    // Write the diff image
    if (os != null) {
        ImageIO.write(image1, "png", os);
    }
    return matched;
}
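The behavior of the comparison can be checked without any PDF at all. The following is a small sketch that runs the same pixel-by-pixel logic on two synthetic 4x4 images that differ in one pixel; the class name `CompareImageDemo` and the helper `whiteImage` are hypothetical names introduced only for this illustration.

```java
import java.awt.Color;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import javax.imageio.ImageIO;

final class CompareImageDemo {
    // Same logic as compareImage in the article: exact RGB match, mismatches repainted magenta
    static boolean compareImage(BufferedImage image1, BufferedImage image2, OutputStream os)
            throws IOException {
        boolean matched = true;
        for (int x = 0; x < image1.getWidth(); x++) {
            for (int y = 0; y < image1.getHeight(); y++) {
                if (image1.getRGB(x, y) != image2.getRGB(x, y)) {
                    matched = false;
                    image1.setRGB(x, y, Color.MAGENTA.getRGB());
                }
            }
        }
        if (os != null) {
            ImageIO.write(image1, "png", os);
        }
        return matched;
    }

    // Hypothetical helper: an all-white RGB image standing in for a rendered page
    static BufferedImage whiteImage(int width, int height) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        for (int x = 0; x < width; x++) {
            for (int y = 0; y < height; y++) {
                image.setRGB(x, y, Color.WHITE.getRGB());
            }
        }
        return image;
    }

    public static void main(String[] args) throws IOException {
        BufferedImage expected = whiteImage(4, 4);
        BufferedImage actual = whiteImage(4, 4);
        actual.setRGB(2, 1, Color.BLACK.getRGB()); // simulate a single changed pixel

        ByteArrayOutputStream png = new ByteArrayOutputStream();
        boolean matched = compareImage(expected, actual, png);

        System.out.println("matched = " + matched); // matched = false
        System.out.println("highlighted = "
                + (expected.getRGB(2, 1) == Color.MAGENTA.getRGB())); // highlighted = true
        System.out.println("diff PNG bytes = " + png.size());
    }
}
```

Because the first image is overwritten, render the expected PDF again if you need its original pixels after the comparison.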
Compared with the expected PDF, the actual PDF has three changes: a date has been added to the heading, item 3 has been deleted, and the subtotal and total have changed because item 3 was deleted. All of these can be read from the difference image.
Expected PDF
Actual PDF
Difference image
// Variation on the loop body of compareImage: matched pixels become black, unmatched pixels become white
if (p1 == p2) {
    image1.setRGB(x, y, Color.BLACK.getRGB());
} else {
    matched = false;
    image1.setRGB(x, y, Color.WHITE.getRGB());
}
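The black-and-white fragment above can also be written as a complete method. The following is a minimal sketch under one deliberate change: it paints the mask into a new image instead of overwriting the first one. That is a design choice of this sketch, not the article's code, and the names `DiffMask`, `diffMask`, and `countMismatches` are hypothetical.

```java
import java.awt.Color;
import java.awt.image.BufferedImage;

final class DiffMask {
    /** Builds a mask image: black where the pixels match, white where they differ. */
    static BufferedImage diffMask(BufferedImage image1, BufferedImage image2) {
        int width = image1.getWidth();
        int height = image1.getHeight();
        if (width != image2.getWidth() || height != image2.getHeight()) {
            throw new IllegalArgumentException("image sizes differ");
        }
        BufferedImage mask = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        for (int x = 0; x < width; x++) {
            for (int y = 0; y < height; y++) {
                boolean same = image1.getRGB(x, y) == image2.getRGB(x, y);
                mask.setRGB(x, y, same ? Color.BLACK.getRGB() : Color.WHITE.getRGB());
            }
        }
        return mask;
    }

    /** Counts the white (mismatched) pixels in a mask produced by diffMask. */
    static int countMismatches(BufferedImage mask) {
        int count = 0;
        for (int x = 0; x < mask.getWidth(); x++) {
            for (int y = 0; y < mask.getHeight(); y++) {
                if (mask.getRGB(x, y) == Color.WHITE.getRGB()) {
                    count++;
                }
            }
        }
        return count;
    }
}
```

A test could then assert that `countMismatches(diffMask(image1, image2))` is zero and save the mask only when it is not.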
I introduced how to perform snapshot tests by converting PDF output to images. When you need to confirm that the data is not merely output but is also displayed in the correct layout, as with PDFs, there is a limit to how far humans can go with visual regression checks.
With the method introduced here, you can detect not only regressions in the application you implement but also unintended layout changes when you upgrade the PDF output library, so you can develop with more peace of mind.