Please check out the new release of Aspose.Recognition. It now supports importing images and has gained several other important features that improve the quality of text layout recognition.
The image feature covers a large number of pages of the PDF specification, so it took us a bit more than the usual month to deliver this release.
While stress testing Aspose.Recognition on e-books with 700+ pages, we noticed that some of them contain nearly 10,000 images after import, which results in "special effects" when they are opened in MS Word.
Analysis showed that some PDF producers cut images into parts, possibly to speed up rendering. Taken to the extreme, this idea results in documents where each pixel row of an image is stored as a separate image. For an image 600 pixels high, that means 600 images, each 1 pixel high, all placed on one page. To solve this issue, we developed a special processing algorithm that searches for adjacent images and merges them into a single image.
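To illustrate the idea of merging adjacent images, here is a minimal sketch in Python. It is not the algorithm used inside Aspose.Recognition; the `ImagePart` type, its fields, and the `merge_adjacent` function are hypothetical names made up for this example. It assumes each part is an axis-aligned strip described by its position, width, and pixel rows, and it only merges strips that are stacked vertically with the same left edge and width.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ImagePart:
    """Hypothetical model of one image fragment placed on a page."""
    x: int              # left edge on the page, in pixels
    y: int              # top edge on the page, in pixels
    width: int          # width in pixels
    rows: List[bytes]   # pixel rows, top to bottom

    @property
    def height(self) -> int:
        return len(self.rows)


def merge_adjacent(parts: List[ImagePart]) -> List[ImagePart]:
    """Merge vertically stacked parts that share the same x and width."""
    merged: List[ImagePart] = []
    # Sorting puts parts of the same column next to each other, top to bottom.
    for part in sorted(parts, key=lambda p: (p.x, p.width, p.y)):
        last = merged[-1] if merged else None
        if (last is not None
                and last.x == part.x
                and last.width == part.width
                and last.y + last.height == part.y):
            # The part starts exactly where the previous one ends: append its rows.
            last.rows.extend(part.rows)
        else:
            # Start a new image; copy the rows so the input is left untouched.
            merged.append(ImagePart(part.x, part.y, part.width, list(part.rows)))
    return merged


# The case described above: 600 strips, each 1 pixel high, on one page.
strips = [ImagePart(x=0, y=row, width=4, rows=[bytes(4)]) for row in range(600)]
whole = merge_adjacent(strips)
print(len(whole), whole[0].height)  # prints "1 600"
```

A real implementation would also have to handle horizontally adjacent parts, differing color formats, and small positioning errors between parts, none of which this sketch attempts.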
The download page is below:
Please don't forget to share your opinion or report any issues on the product support forum: