Apache PDFBox · JSON Structure
Apache PDFBox Text Extraction Result Structure
TextExtractionResult schema from Apache PDFBox
Document Processing · Java · PDF · Text Extraction · Apache · Open Source
TextExtractionResult is a JSON Structure definition published by Apache PDFBox, describing 4 properties. It conforms to the https://json-structure.org/meta/core/v0/# meta-schema.
Properties
documentId
text
pageCount
wordCount
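As a rough illustration of how these four properties might be carried in Java code, the sketch below mirrors them in a small value class. The listing above gives property names only, so the field types (strings for documentId and text, ints for the two counts), the `of` factory, and the whitespace-based word count are all assumptions rather than part of the published schema.

```java
import java.util.Objects;

// Hypothetical Java mirror of the four listed properties.
// Field types are assumed; the schema listing shows names only.
final class TextExtractionResult {
    private final String documentId;
    private final String text;
    private final int pageCount;
    private final int wordCount;

    TextExtractionResult(String documentId, String text, int pageCount, int wordCount) {
        this.documentId = Objects.requireNonNull(documentId, "documentId");
        this.text = Objects.requireNonNull(text, "text");
        this.pageCount = pageCount;
        this.wordCount = wordCount;
    }

    // Builds a result from already-extracted text and derives wordCount by
    // splitting on runs of whitespace (an assumed definition of "word").
    // With PDFBox, the text would typically come from
    // PDFTextStripper.getText(document) and the page count from
    // PDDocument.getNumberOfPages().
    static TextExtractionResult of(String documentId, String text, int pageCount) {
        Objects.requireNonNull(text, "text");
        String trimmed = text.trim();
        int words = trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
        return new TextExtractionResult(documentId, text, pageCount, words);
    }

    String documentId() { return documentId; }
    String text() { return text; }
    int pageCount() { return pageCount; }
    int wordCount() { return wordCount; }
}
```

Calling `TextExtractionResult.of("doc-1", extractedText, pages)` keeps wordCount consistent with text by construction. A caller that already has its own count can use the constructor directly.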
Meta-schema: https://json-structure.org/meta/core/v0/#