Extract Text to HTML Format
The PdfExtractor class allows you to extract text from a PDF file in HTML format using its extractTextAsHTML method. You can extract text from all pages or from a specified range of pages. The extractTextAsHTML method saves the extracted text as an HTML file.
The following example shows how to extract text from a PDF file and save it in HTML format.
[Java]
//create PdfExtractor object
PdfExtractor extractor = new PdfExtractor();
//bind input PDF file
extractor.bindPdf("input.pdf");
//set start and end pages
extractor.setStartPage(1);
extractor.setEndPage(2);
//extract text
extractor.extractText();
//save extracted text as HTML
extractor.extractTextAsHTML("output.html");
//close PdfExtractor object
extractor.close();
