This page collects a number of code examples that show how to create PDF files, alter them, extract images and text, and perform other tasks commonly needed when working with PDFs. Most of them use the Apache PDFBox library, but there is also an OpenPDF example, and some other libraries are discussed below.

  • PdfBox shows how to create a PDF file with PDFBox, and how to use a few of its other capabilities

  • PdfboxAnnotation shows how to alter the appearance of text in a PDF (highlighting, underlining and striking out) using PDFBox

  • PdfboxTable shows how to add tables to a PDF using the PDFBox and Boxable libraries

  • PdfboxReplace shows how to search and replace text in a PDF using PDFBox (this works only in some circumstances)

  • PdfToImage shows how to create an image for each page of a PDF using PDFBox

  • ExtractBarcodeFromPdf shows how to extract barcode values from a PDF using the PDFBox and ZXing libraries

If any of my code snippets and examples has been helpful to you or your company, and you feel like expressing your gratitude beyond saying Thank you, please note that I have an Amazon Wish List containing several inexpensive items, or you can contribute directly via PayPal.

Open Source libraries

  • Apache PDFBox - library that can create, merge, split and print PDFs, extract text, create images from PDFs, encrypt/decrypt PDFs, fill in PDF forms and more.

  • OpenPDF - library to create PDFs, built on top of iText 2 and still available under a business-friendly license.
    • The OpenPDF API is in some respects easier to use than PDFBox, but it is developed at a slower pace, so for the long term I think PDFBox is the better bet. A good introduction to OpenPDF is the first edition of the book iText in Action (not the second edition, which describes an API different from OpenPDF).
    • more examples
    • javadocs

  • Apache FOP - library to create PDFs (and other formats) from XML by using XSL-FO transformations
    • This is a good option if you want to create the PDF from mostly regular data that can be described in XML, and you don't mind maintaining XSL-FO stylesheets.

  • FlyingSaucer - library to convert CSS-styled XHTML to PDF
    • This is a very good option if you want to create PDFs from CSS-styled XHTML pages. Such pages might be easier to write than a solution built on the native API of a PDF library.
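
To give an idea of what maintaining XSL-FO stylesheets for Apache FOP involves, here is a minimal sketch of the first half of that pipeline: turning XML data into an XSL-FO document. It uses only the XSLT support built into the JDK (javax.xml.transform), so it runs without FOP on the classpath. The class name XmlToFo, the sample data and the stylesheet are all made up for illustration. Rendering the resulting FO document to a PDF is the part FOP would do, and it is not shown here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XmlToFo {

    // Hypothetical sample data; real input would come from a file or an export.
    static final String DATA_XML =
        "<people><person>Alice</person><person>Bob</person></people>";

    // Hypothetical stylesheet: one A4 page with one fo:block per <person>.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
      + "<xsl:template match='/people'>"
      + "<fo:root>"
      + "<fo:layout-master-set>"
      + "<fo:simple-page-master master-name='A4'"
      + " page-height='29.7cm' page-width='21cm' margin='2cm'>"
      + "<fo:region-body/>"
      + "</fo:simple-page-master>"
      + "</fo:layout-master-set>"
      + "<fo:page-sequence master-reference='A4'>"
      + "<fo:flow flow-name='xsl-region-body'>"
      + "<xsl:for-each select='person'>"
      + "<fo:block><xsl:value-of select='.'/></fo:block>"
      + "</xsl:for-each>"
      + "</fo:flow>"
      + "</fo:page-sequence>"
      + "</fo:root>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the XSLT stylesheet to the XML data and returns the XSL-FO text.
    static String toFo(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Prints the generated XSL-FO document; FOP would render this to PDF.
        System.out.println(toFo(DATA_XML, STYLESHEET));
    }
}
```

The layout lives entirely in the stylesheet, which is why this approach suits regular, repetitive data: changing the look of the PDF means editing the XSL-FO templates, not the Java code.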