This page collects a number of code examples that show how to create PDF files, alter them, extract images and text, and perform other tasks commonly needed when working with PDFs. Most of them use the Apache PDFBox library, but there's also an OpenPDF example, and some other libraries are discussed below.
- PdfBox is an example of how to create a PDF file with PDFBox, and how to use a few of its other capabilities
- PdfboxAnnotation shows how to alter the appearance of text in a PDF (highlighting, underlining and striking out) using PDFBox
- PdfboxTable shows how to add tables to a PDF using the PDFBox and Boxable libraries
- PdfboxLayout shows how to wrap text, align text and use markup using the PDFBox and pdfbox-layout libraries
- PdfboxReplace shows how to search and replace text in a PDF in some circumstances using PDFBox
- PdfToImage shows how to create an image for each page of a PDF using PDFBox
- ExtractBarcodeFromPdf shows how to extract barcode values from a PDF using the PDFBox and ZXing libraries
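To give a flavor of what the examples listed above deal with, here is a minimal sketch of creating a one-page PDF and reading its text back. It is not the code of the PdfBox example itself. It assumes PDFBox 2.x on the classpath (in 3.x the font constants and the loading call are different), and the class name `HelloPdf`, the method names and the text position are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.text.PDFTextStripper;

// Hypothetical helper class, assuming the PDFBox 2.x API.
class HelloPdf {

    /** Builds a one-page A4 PDF containing a single line of text. */
    static byte[] createPdf(String text) throws IOException {
        try (PDDocument doc = new PDDocument()) {
            PDPage page = new PDPage(PDRectangle.A4);
            doc.addPage(page);

            // The content stream must be closed before the document is saved.
            try (PDPageContentStream cs = new PDPageContentStream(doc, page)) {
                cs.beginText();
                cs.setFont(PDType1Font.HELVETICA, 12);
                // Coordinates are in points, measured from the lower left corner.
                cs.newLineAtOffset(72, 750);
                cs.showText(text);
                cs.endText();
            }

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            doc.save(out);
            return out.toByteArray();
        }
    }

    /** Extracts the text of all pages of an in-memory PDF. */
    static String extractText(byte[] pdf) throws IOException {
        try (PDDocument doc = PDDocument.load(pdf)) {
            return new PDFTextStripper().getText(doc);
        }
    }
}
```

Working on a byte array keeps the sketch free of file paths; `doc.save(...)` also accepts a file name if the PDF should go to disk.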
Open Source libraries
- Apache PDFBox - library that can create, merge, split and print PDFs, extract text, create images from PDFs, encrypt/decrypt PDFs, fill in PDF forms and more.
  - more examples
  - PDFBox has a number of very handy tools for manipulating existing PDF files, for example Decrypt, Encrypt, ExtractImages, ExtractText, Overlay, Merge, Split
- Boxable is a library that can be used with PDFBox to create tables in PDFs more easily
- pdfbox-layout is a library that adds several useful features to PDFBox, like text runs composed of multiple chunks, support for markup like bold and italic, alignment and word wrapping
- OpenPDF is a library for creating PDFs. It is a fork of an older iText release and, unlike current iText versions, is still available under a business-friendly license (LGPL/MPL).
  - The OpenPDF API is in some respects easier to use than PDFBox, but its development velocity is slower, so for the long term I think PDFBox is the better bet. A good introduction to OpenPDF is the first edition of the book iText in Action (not the second edition, which describes an API different from OpenPDF).
  - more examples
- Apache FOP - library to create PDFs (and other formats) from XML by using XSL-FO transformations
  - This is a good option if you want to create the PDF from mostly regular data that can be described in XML, and you don't mind maintaining XSL-FO stylesheets.
- FlyingSaucer - library to convert CSS-styled XHTML to PDF
  - This is a good option if you want to create PDFs from CSS-styled XHTML pages. Such pages may be easier to write than a solution built on the OpenPDF or PDFBox APIs.
- On Android, there's the Android PDF Writer library. It's quite basic, but it is small and gets the job done. A ready-to-use jar file is available for download.
  - Alternatively, it's also possible to use OpenPDF together with the android-awt library. This adds several MB to the app's size, though.
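Word wrapping is listed above as one of the features pdfbox-layout adds, because plain PDFBox draws a string exactly as given and never breaks lines on its own. The following is a rough sketch of the greedy approach to line breaking, not pdfbox-layout's actual implementation. It assumes a fixed budget of characters per line; with real fonts one would compare measured widths instead (PDFBox offers `PDFont.getStringWidth` for that). The class name `WordWrap` is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: greedy word wrapping by character count.
class WordWrap {

    /**
     * Splits text into lines of at most maxChars characters, breaking only
     * at whitespace. A single word longer than maxChars is not split and
     * ends up on a line of its own.
     */
    static List<String> wrap(String text, int maxChars) {
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();

        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only for empty or all-whitespace input
            }
            // Start a new line if this word (plus a space) would not fit.
            if (line.length() > 0 && line.length() + 1 + word.length() > maxChars) {
                lines.add(line.toString());
                line.setLength(0);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(word);
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }
}
```

Each resulting line could then be drawn with its own `showText` call, moving the text position down by the line height in between.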