
ExtractText() returns blank

Last post 08-31-2008, 10:52 AM by Felix.Liu. 3 replies.
  •  08-27-2008, 11:34 AM 141482

    ExtractText() returns blank

    Attachment: Present (inaccessible)

    I tried to use ExtractText() to read the text of the attached PDF document, but it returns an empty string. Can you please explain why?

    My code is below:

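    // Extract all text from the PDF by way of a temporary file.
    PdfExtractor extractor = new PdfExtractor();
    extractor.BindPdf(fileName);
    extractor.ExtractText();
    extractor.GetText("~text.tmp");                 // write the extracted text to a temp file
    string strRet = File.ReadAllText("~text.tmp");  // read it back as a string
    File.Delete("~text.tmp");
    return strRet;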

     

     
  •  08-27-2008, 11:54 AM 141483 in reply to 141482

    Re: ExtractText() returns blank

    Hi,

    We apologize for the inconvenience. I have tested the issue and was able to reproduce the problem. I have logged it in our issue tracking system as PDFKITNET-5841. We will investigate it in detail and keep you updated on the status of a fix.


    Nayyer Shahbaz
    Support Developer
    Aspose Changsha Team
     
  •  08-30-2008, 5:06 PM 141845 in reply to 141483

    Re: ExtractText() returns blank

    This is an important issue for me. My evaluation cannot proceed without this fix. Can you provide a rough estimate of how long I can expect to wait for a bug fix?

     

     
  •  08-31-2008, 10:52 AM 141861 in reply to 141845

    Re: ExtractText() returns blank

    Hi,

    We have been working on this issue and found that the attached PDF file is secured. You need to decrypt the file before extracting text from it. Please try the following code:

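    // Decrypt the secured PDF into a temporary copy (input path, output path).
    PdfFileSecurity security = new PdfFileSecurity(@"d:/specialpdfs/PLNASmithPDFFlatMaybe.pdf", @"d:/specialpdfs/temp.pdf");
    security.DecryptFile(""); // decrypt the PDF file with the owner password (blank)

    // Extract text from the decrypted copy rather than from the original.
    PdfExtractor extractor = new PdfExtractor();
    extractor.BindPdf(@"d:/specialpdfs/temp.pdf"); // bind the decrypted file
    extractor.ExtractText();
    extractor.GetText(OutPath + "extractNoWords.txt"); // OutPath: output directory prefix defined by the caller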
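
If it is not known in advance whether a file is secured, a rough pre-check may help decide when the decryption step is needed. The sketch below is only a heuristic and is not part of the Aspose API: it assumes that an encrypted PDF carries an /Encrypt entry in its trailer (or cross-reference stream) dictionary and simply searches the raw bytes for that name. The class and method names (PdfEncryptionCheck, LooksEncrypted) are hypothetical, and a file that happens to contain the literal text "/Encrypt" elsewhere would produce a false positive.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of any PDF library.
static class PdfEncryptionCheck
{
    // Heuristic: returns true if the raw PDF bytes contain the name "/Encrypt",
    // which an encrypted PDF is assumed to list in its trailer dictionary.
    public static bool LooksEncrypted(byte[] pdfBytes)
    {
        // ISO-8859-1 maps every byte to exactly one char, so binary data
        // survives the conversion and the ASCII name can be searched for.
        string raw = Encoding.GetEncoding("ISO-8859-1").GetString(pdfBytes);
        return raw.Contains("/Encrypt");
    }

    // Convenience overload that reads the file from disk first.
    public static bool LooksEncrypted(string path)
    {
        return LooksEncrypted(File.ReadAllBytes(path));
    }
}
```

A caller could run LooksEncrypted(fileName) first and go through the decryption step above only when it returns true; a false result does not guarantee that text extraction will succeed.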

    For more details about PDF security, please refer to the product documentation.

    Thanks,


    Felix Liu
    Developer
    Aspose Changsha Team
     