Search / highlight text from scanned document

Hello,


I’m using the free trial version of Aspose.Pdf.
I’m trying to find and highlight words in a PDF document.
This works fine, but when I use it on a scanned document, the words are highlighted but the text is no longer visible; it is hidden under the yellow highlighted area…

I have attached the PDF file.

Another question: when I search for the word “win”, it also highlights parts of longer words such as “growing”, and it does not match “Win” with an uppercase letter. Is this normal?

Thank you.

nicolas.allerhand:
I’m trying to find and highlight words in a PDF document.
This works fine, but when I use it on a scanned document, the words are highlighted but the text is no longer visible; it is hidden under the yellow highlighted area…
Hi Nicolas,

Thanks for contacting support.

I have tested the scenario and am able to reproduce the problem. I have logged it in our issue tracking system as PDFNEWNET-39278. We will investigate this issue in detail and keep you updated on the status of a correction.

We apologize for the inconvenience.

nicolas.allerhand:
Another question: when I search for the word "win", it also highlights parts of longer words such as "growing", and it does not match "Win" with an uppercase letter. Is this normal?
To match text in both upper and lower case, please try a regular expression. The (?i) prefix makes the pattern case-insensitive, and passing true to TextSearchOptions enables regular expression search:

TextFragmentAbsorber absorber = new TextFragmentAbsorber("(?i)Win", new TextSearchOptions(true));
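
The case-insensitive pattern alone still matches "win" inside longer words such as "growing". A whole-word match can probably be obtained by adding \b word boundaries to the pattern. The sketch below uses only the standard .NET Regex class to show how the three patterns differ; WholeWordDemo and FindMatches are hypothetical names used for illustration, and it is an assumption that the regular expression mode of TextFragmentAbsorber treats \b the same way as .NET's Regex.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WholeWordDemo
{
    // Hypothetical helper: returns every substring of text matched by pattern.
    public static List<string> FindMatches(string text, string pattern)
    {
        var results = new List<string>();
        foreach (Match match in Regex.Matches(text, pattern))
        {
            results.Add(match.Value);
        }
        return results;
    }

    public static void Main()
    {
        string text = "Win or lose, growing teams win by winning.";

        // Case-sensitive substring search: misses "Win", hits "growing" and "winning".
        Console.WriteLine(FindMatches(text, "win").Count);            // 3

        // Case-insensitive substring search: also finds "Win".
        Console.WriteLine(FindMatches(text, "(?i)win").Count);        // 4

        // Case-insensitive whole-word search: only "Win" and "win".
        Console.WriteLine(FindMatches(text, @"(?i)\bwin\b").Count);   // 2
    }
}
```

If the assumption holds, passing @"(?i)\bwin\b" to TextFragmentAbsorber together with new TextSearchOptions(true) should restrict the highlight to the standalone word.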

Hi Nayyer,


Thanks for the reply.

I’ll test the regular expression this morning.

By the way, instead of highlighting the text, is it possible to draw a frame around it?

Thank you.

Hi Nicolas,


Thanks for your inquiry. Yes, you can use annotations to mark the desired text. Please use a Highlight annotation for this purpose. The following code snippet should help you accomplish the task.



// Requires: using Aspose.Pdf; using Aspose.Pdf.Text;
// myDir is the path of the folder that contains the PDF file.
Document document = new Document(myDir + "20072015045240._1.pdf");

// (?i) makes the search case-insensitive
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber("(?i)Win");

// Set text search option to specify regular expression usage
TextSearchOptions textSearchOptions = new TextSearchOptions(true);
textFragmentAbsorber.TextSearchOptions = textSearchOptions;

document.Pages.Accept(textFragmentAbsorber);
TextFragmentCollection textFragmentCollection1 = textFragmentAbsorber.TextFragments;

foreach (TextFragment textFragment in textFragmentCollection1)
{
    // Build the annotation rectangle from the fragment's position and size
    Aspose.Pdf.InteractiveFeatures.Annotations.HighlightAnnotation highlight =
        new Aspose.Pdf.InteractiveFeatures.Annotations.HighlightAnnotation(
            textFragment.Page,
            new Aspose.Pdf.Rectangle(
                (float)textFragment.Position.XIndent,
                (float)textFragment.Position.YIndent,
                (float)textFragment.Position.XIndent + (float)textFragment.Rectangle.Width,
                (float)textFragment.Position.YIndent + (float)textFragment.Rectangle.Height));

    // Semi-transparent yellow so the underlying text stays readable
    highlight.Opacity = 0.5;
    highlight.Color = Aspose.Pdf.Color.FromRgb(System.Drawing.Color.Yellow);

    textFragment.Page.Annotations.Add(highlight);
}

document.Save(myDir + "texthighlight_output.pdf");


Please feel free to contact us for any further assistance.

Best Regards,

Thanks a lot, this resolved all my problems.


Have a good day

Hi Nicolas,


Thanks for your feedback. It is good to know that you have managed to accomplish your requirements.

Please keep using Aspose and feel free to contact us for any further assistance. We will be more than happy to extend our support.

Best Regards,