Extract Images from PDF and Recognize Barcodes

PDF documents usually comprise text, images, tables, attachments, graphs, annotations and other related objects. Barcodes are sometimes embedded inside a PDF file, and some customers need to identify them. This article explains how to extract images from PDF pages and recognize the barcodes inside them.

According to the Document Object Model of Aspose.Pdf for .NET, a PDF file contains one or more pages, and each page holds collections of Images, Forms and Fonts in its Resources object. To extract images from a PDF file, we therefore traverse its individual pages, get the collection of Images from each page, and save each image to a MemoryStream object for further processing with the BarCodeReader class of Aspose.BarCodeRecognition. The reader in these examples is configured for the Code39Extended symbology only, so barcodes of other types are not reported.

C#
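// namespaces required by this example
using System;
using System.IO;
using Aspose.Pdf;

// open the source document
Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document("source.pdf");
// traverse the individual pages of the PDF file (the Pages collection is 1-based)
for (int pageCount = 1; pageCount <= pdfDocument.Pages.Count; pageCount++)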
{
    // traverse through each image extracted from PDF pages
    foreach (XImage xImage in pdfDocument.Pages[pageCount].Resources.Images)
    {
        using (MemoryStream imageStream = new MemoryStream())
        {
            // save the image to the stream in JPEG format
            xImage.Save(imageStream, System.Drawing.Imaging.ImageFormat.Jpeg);
            // reset the stream position to the beginning of the stream
            imageStream.Position = 0;

            // Instantiate BarCodeReader object
            Aspose.BarCodeRecognition.BarCodeReader barcodeReader = new Aspose.BarCodeRecognition.BarCodeReader(imageStream, Aspose.BarCodeRecognition.BarCodeReadType.Code39Extended);

            while (barcodeReader.Read())
            {
                // get BarCode text from BarCode image
                string code = barcodeReader.GetCodeText();
                // write the BarCode text to Console output
                Console.WriteLine("BARCODE : " + code);
            }
            // close the BarCodeReader object to release the image data it holds
            barcodeReader.Close();
        }
    }
}
VB.NET
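' namespaces required by this example
Imports System
Imports System.IO
Imports Aspose.Pdf

' open the source document
Dim pdfDocument As New Document("source.pdf")
' traverse the individual pages of the PDF file (the Pages collection is 1-based)
For pageCount As Integer = 1 To pdfDocument.Pages.Count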

    ' traverse through each image extracted from PDF pages
    ' (the loop variable is named pdfImage so that it does not clash with the XImage type)
    For Each pdfImage As XImage In pdfDocument.Pages(pageCount).Resources.Images

        Using imageStream As New MemoryStream()
            ' save the image to the stream in JPEG format
            pdfImage.Save(imageStream, System.Drawing.Imaging.ImageFormat.Jpeg)

            ' reset the stream position to the beginning of the stream
            imageStream.Position = 0

            ' Instantiate BarCodeReader object
            Dim barcodeReader As Aspose.BarCodeRecognition.BarCodeReader = New Aspose.BarCodeRecognition.BarCodeReader(imageStream, Aspose.BarCodeRecognition.BarCodeReadType.Code39Extended)

            While (barcodeReader.Read())

                ' get BarCode text from BarCode image
                Dim code As String = barcodeReader.GetCodeText()
                ' write the BarCode text to Console output
                Console.WriteLine("BARCODE : " & code)

            End While
            ' close the BarCodeReader object to release the image data it holds
            barcodeReader.Close()
        End Using
    Next
Next
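The same traversal can also be wrapped in a reusable method that returns the recognized text instead of writing it to the console. The following is only a sketch under stated assumptions: the `PdfBarcodeScanner` class and its `ReadBarcodes` method are hypothetical names introduced for illustration and are not part of Aspose.Pdf or Aspose.BarCodeRecognition. It also saves each image as PNG rather than JPEG, on the assumption that a lossless format avoids compression artifacts that might affect recognition; this has not been verified here, so keep JPEG if it already works for your documents.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Aspose.Pdf;
using Aspose.BarCodeRecognition;

// Hypothetical helper class (not part of any Aspose library).
public static class PdfBarcodeScanner
{
    // Returns one "page N: text" entry for every barcode of the given type
    // that is recognized in the images of the PDF file.
    public static List<string> ReadBarcodes(string pdfPath, BarCodeReadType readType)
    {
        List<string> results = new List<string>();
        Document pdfDocument = new Document(pdfPath);

        // the Pages collection is 1-based
        for (int pageNumber = 1; pageNumber <= pdfDocument.Pages.Count; pageNumber++)
        {
            foreach (XImage xImage in pdfDocument.Pages[pageNumber].Resources.Images)
            {
                using (MemoryStream imageStream = new MemoryStream())
                {
                    // assumption: PNG is used here instead of JPEG to avoid lossy compression
                    xImage.Save(imageStream, System.Drawing.Imaging.ImageFormat.Png);
                    imageStream.Position = 0;

                    BarCodeReader barcodeReader = new BarCodeReader(imageStream, readType);
                    try
                    {
                        while (barcodeReader.Read())
                        {
                            results.Add("page " + pageNumber + ": " + barcodeReader.GetCodeText());
                        }
                    }
                    finally
                    {
                        // close the reader even if recognition throws
                        barcodeReader.Close();
                    }
                }
            }
        }
        return results;
    }
}
```

A call such as `PdfBarcodeScanner.ReadBarcodes("source.pdf", BarCodeReadType.Code39Extended)` would then return the same codes that the examples above print, each prefixed with the page it was found on.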