Performance and File Size issue with Aspose PDF 6.2

Last post 02-26-2012, 12:50 PM by codewarior. 27 replies.
Page 2 of 2 (28 items)   < Previous 1 2
  •  02-10-2012, 2:33 PM 360825 in reply to 360656

    Re: Performance and File Size issue with Aspose PDF 6.2

    Hi

    I just tested the latest Aspose.Pdf 6.7. File concatenation performance has improved, but the file size hasn't changed much: the 6000-page PDF is around 70 MB.

    The issue is still not fixed.

    Thanks

     
  •  02-16-2012, 5:03 AM 362025 in reply to 360825

    Re: Performance and File Size issue with Aspose PDF 6.2

    Hello Ujjwal,

    Thanks for your acknowledgment and sorry for the delay in response.

    Please note that the concatenation process has been improved in the recent release (PageTreeNode was reworked to avoid recreating the pages array). In my earlier attempt, I used the following code snippet to reproduce the issue; the source files Sample+Document1.pdf and Sample+Document1 - Copy.pdf are each 80 KB. After running the code, the resulting Merged-output.pdf was 161 MB.

    [C#]

    //open first document
    Aspose.Pdf.Document pdfDocument1 = new Aspose.Pdf.Document("d:/pdftest/Sample+Document1.pdf");
    for (int i = 0; i <= 2000; i++)
    {
        //open second document
        Aspose.Pdf.Document pdfDocument2 = new Aspose.Pdf.Document("d:/pdftest/Sample+Document1 - Copy.pdf");
        //add pages of second document to the first
        pdfDocument1.Pages.Add(pdfDocument2.Pages);
    }
    //save concatenated output file
    pdfDocument1.Save("d:/pdftest/Merged-output.pdf");

    However, to get better performance and a smaller file, we made some changes to the code. After running the changed code, the resulting PDF is 8 MB, so the document size drops from 161 MB to 8 MB.

    [C#]

    Aspose.Pdf.Document pdfDocument1 = new Aspose.Pdf.Document("d:/pdftest/Sample+Document1.pdf");
    //open added document here
    Aspose.Pdf.Document pdfDocument2 = new Aspose.Pdf.Document("d:/pdftest/Sample+Document1 - Copy.pdf");
    for (int i = 0; i <= 2000; i++)
    {
        //add pages of second document to the first
        pdfDocument1.Pages.Add(pdfDocument2.Pages);
    }
    //save concatenated output file
    pdfDocument1.Save("d:/pdftest/Merged-output.pdf");

    Nevertheless, if you are still getting a large resulting PDF file, please share the source files and the code snippet you are using so that we can test the scenario at our end. We are sorry for the inconvenience.


    Nayyer Shahbaz
    Support Developer, Aspose Sialkot Team
     
  •  02-16-2012, 6:34 PM 362217 in reply to 362025

    Re: Performance and File Size issue with Aspose PDF 6.2

    Hi Nayyer

    This is what I tried, and the file size is still big. I have 1000 PDF files in the Temp folder, each around 80 KB. You should be able to recreate this issue. I am using Aspose.Words 10.7 and Aspose.Pdf 6.7.

    [VB.NET]

    ' Convert each .docx in the current directory to Temp\<i>.pdf for i = 0 to 1000
    For i = 0 To 1000 Step 1
        If Not New FileInfo(Environment.CurrentDirectory & "\Temp\" & i & ".pdf").Exists Then
            If New DirectoryInfo(Environment.CurrentDirectory).GetFiles("*.docx").Length > 0 Then
                For Each FileInfo As FileInfo In New DirectoryInfo(Environment.CurrentDirectory).GetFiles("*.docx")
                    Dim doc As New Aspose.Words.Document(FileInfo.FullName)
                    If Not New DirectoryInfo(Environment.CurrentDirectory & "\Temp").Exists Then
                        Directory.CreateDirectory(Environment.CurrentDirectory & "\Temp")
                    End If
                    doc.Save(Environment.CurrentDirectory & "\Temp\" & i & ".pdf")
                Next
            End If
        End If
    Next

    '#New
    ' Merge every PDF in the Temp folder into the first one
    i = 0
    Dim FirstDocument As Aspose.Pdf.Document = Nothing
    Dim NextDocument As Aspose.Pdf.Document = Nothing
    For Each File As FileInfo In New DirectoryInfo(Environment.CurrentDirectory & "\Temp").GetFiles
        If FirstDocument Is Nothing And i = 0 Then
            FirstDocument = New Aspose.Pdf.Document(File.FullName)
        ElseIf i > 0 Then
            ' A new Document instance is created for every file
            NextDocument = New Aspose.Pdf.Document(File.FullName)
            FirstDocument.Pages.Add(NextDocument.Pages)
            NextDocument = Nothing
            GC.Collect()
        End If
        i += 1
    Next
    FirstDocument.Save(Environment.CurrentDirectory & "\Output.pdf")

     
  •  02-16-2012, 6:36 PM 362219 in reply to 362217

    Re: Performance and File Size issue with Aspose PDF 6.2

    You can find the sample Word document attached to this forum thread.
     
  •  02-18-2012, 12:13 PM 362531 in reply to 362217

    Re: Performance and File Size issue with Aspose PDF 6.2

    Hello Ujjwal,

    Thanks for sharing the code snippet.

    I tested the scenario by first converting the .docx file to PDF using Aspose.Words for .NET 10.8.0, which produced an 80 KB PDF. I then created a copy of that PDF and used the same code snippet that I shared in my earlier post. The loop iterated 1000 times and produced an 8 MB PDF in 36 seconds. I ran the test on an Intel dual-core 1.8 GHz machine with 2.5 GB of RAM running 32-bit Windows XP.

    Besides this, I also tried the code snippet that you shared, but I am afraid I am getting an InvalidPdfFileFormatException ('Startxref not found') when the pages of NextDocument are added to FirstDocument. However, my understanding is that the resulting PDF grows because you create a new instance of NextDocument for each document in the Temp folder. I think you do need a new instance each time, because the PDF documents in the Temp folder may differ from one another rather than being copies of a single PDF file. We will look further into this problem and keep you posted on the status of the correction. We are sorry for the inconvenience.

    PS: In my attempt, I created a copy of a single file and loaded it once before entering the loop, which is why the resulting file is smaller.

    Nayyer Shahbaz
    Support Developer, Aspose Sialkot Team
     
  •  02-18-2012, 3:25 PM 362535 in reply to 362531

    Re: Performance and File Size issue with Aspose PDF 6.2

    Hi Nayyer
    The goal is to be able to merge 1000+ PDF documents into one PDF. Each document is different and can have any format and number of pages. That is the reason my For loop creates a new instance of the PDF document for every document in the Temp folder.

    Also, while you are at it, can you look into the Concatenate and Append methods in the PdfFileEditor class? They too generate a huge resulting PDF.

    Please share an update or ETA on this fix. I have been waiting for this fix for a long time.

    Appreciate it.

     
  •  02-18-2012, 5:09 PM 362540 in reply to 362535

    Re: Performance and File Size issue with Aspose PDF 6.2

    ubshrest:
    The goal is to be able to merge 1000+ pdf document into one pdf, each pdf document are different and can be of any format and pages. That is reason why my for loop has a new instance of pdf document for every document in temp folder.
    Hello Ujjwal,

    Please note that I have already asked the development team to investigate this issue again from this perspective.

    ubshrest:
    Also while you are at it, can you also look into the concatenate and append method in pdffileeditor class they too generate huge resultant pdf..

    I tested concatenating an 85.7 KB file with 1000 copies of itself using the PdfFileEditor.Concatenate method. In my observation, Aspose.Pdf for .NET 6.7.0 generates a resulting PDF of 82.6 MB. I used the following code snippet to test the scenario. Can you please share the code snippet that you are using so that we can test it in our environment?

    [C#]

    //create PdfFileEditor object
    PdfFileEditor pdfEditor = new PdfFileEditor();
    //array of files
    string[] filesArray = new string[1001];
    filesArray[0] = "D:\\pdftest\\tempfolder\\input.pdf";
    for (int i = 1; i <= 1000; i++)
    {
      filesArray[i] = "D:\\pdftest\\tempfolder\\Copy of input.pdf";
    }
    //concatenate the files into a single output file
    pdfEditor.Concatenate(filesArray, "D:\\pdftest\\tempfolder\\Finaloutput.pdf");

    ubshrest:
    Please share the update or ETA on this fix. i have been waiting for this fix for a long time.
    The development team is investigating this issue again, and you will be updated on the status of the correction soon. Please be patient and spare us a little time. We are sorry for the inconvenience.


    Nayyer Shahbaz
    Support Developer, Aspose Sialkot Team
     
  •  02-18-2012, 5:25 PM 362541 in reply to 362540

    Re: Performance and File Size issue with Aspose PDF 6.2

    Thanks Nayyer

    For the concatenation, I used the same code that you used. It generates an 80 MB file, which is very large considering it is only a 6000-page PDF.

    Thanks

     
  •  02-18-2012, 8:42 PM 362543 in reply to 362541

    Re: Performance and File Size issue with Aspose PDF 6.2 .NET

    Attachment: Present (inaccessible)
    Hi Nayyer,
    I have attached a sample project in which you will find two ways of generating a merged PDF. The first appends each sample Word document to one large Word document and then converts that to a PDF file. The second uses both Aspose.Words and Aspose.Pdf: it converts each Word document to PDF and then merges all the PDFs into one PDF document. You will see that the two generated PDF files differ hugely in size, which is why I think there is a lot of room to reduce the size of the PDF created by Aspose.Pdf. Maybe I am wrong and it is not possible, but please research it and share your findings.
     
  •  02-19-2012, 3:55 AM 362563 in reply to 362543

    Re: Performance and File Size issue with Aspose PDF 6.2

    Hello Ujjwal,

    Thanks for sharing the sample application.

    I tested the scenario with Aspose.Words for .NET 10.8.0 and Aspose.Pdf for .NET 6.7.0. In my observation, the PDF generated with Aspose.Words is 263 KB and the PDF generated with Aspose.Pdf for .NET is 845 KB. Please note that the two products have their own mechanisms for generating PDF files, so the resulting file sizes differ. Also, Aspose.Words for .NET simply renders the merged Word file to PDF through its rendering engine, whereas Aspose.Pdf for .NET combines the pages of 10 individual PDF files (each 85.7 KB) into a single document. However, I will discuss this scenario further with the development team to see whether we can reduce the size of the generated PDF file.

    Your patience and understanding are greatly appreciated. We are sorry for the inconvenience.

    Nayyer Shahbaz
    Support Developer, Aspose Sialkot Team
     
  •  02-23-2012, 5:45 PM 363674 in reply to 362543

    Re: Performance and File Size issue with Aspose PDF 6.2

    Hello Ujjwal,

    Thanks for your patience.

    Our development team has spent some time investigating the reasons for this problem, and the following are our observations.

    The first sample file that you shared is ~80 KB. It has 12 pages, and the content of each page is about 2 KB. The file also contains 3 fonts in its resources, with sizes of 30 KB, 16 KB, and 3 KB (these are the sizes of the compressed objects).
    Please note that all these objects must be included in the concatenated file. If you concatenate an 80 KB file 2000 times, the resulting file will be about 2000 * 80 KB = 160,000 KB = 160 MB.
    Currently we are not certain that the size can be significantly reduced. (If we treat every copy as the same file, as in the sample, we can reduce the size by sharing some objects, for example the fonts, across all copies of the file contents in the resulting file. But this will not work for different files!)

    Concerning the second example with 10 files, please note that the file created with Aspose.Words contains 4 fonts, which are used by all document pages. However, when each Word file is converted to PDF individually and the PDFs are then concatenated, the fonts are included as separate objects for each of the 10 documents. This causes the difference in size.
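
    As a rough illustration of the arithmetic above, here is a minimal C# sketch. It uses only the figures quoted in this post (a file of about 80 KB containing fonts of 30 KB, 16 KB and 3 KB). The class and method names are hypothetical and are not part of Aspose.Pdf, and the font-sharing case is an assumption about what could be saved for identical copies, not a description of what the library writes.

```csharp
using System;

// Hypothetical helper for illustration only; not part of Aspose.Pdf.
public static class MergeSizeEstimate
{
    // Figures quoted in the post above, in KB.
    public const int FileKb = 80;
    public static readonly int[] FontKb = { 30, 16, 3 };

    // Total size of the embedded fonts in one source file.
    public static int TotalFontKb()
    {
        int sum = 0;
        foreach (int kb in FontKb) sum += kb;
        return sum;
    }

    // Every copy carries all of its own objects, fonts included.
    public static long NaiveKb(int copies)
    {
        return (long)copies * FileKb;
    }

    // Assumption: identical copies store the fonts once and repeat everything else.
    public static long SharedFontsKb(int copies)
    {
        int fonts = TotalFontKb();
        return fonts + (long)copies * (FileKb - fonts);
    }

    public static void Main()
    {
        Console.WriteLine("Naive estimate:        {0} KB", NaiveKb(2000));
        Console.WriteLine("Shared-fonts estimate: {0} KB", SharedFontsKb(2000));
    }
}
```

    With these figures, NaiveKb(2000) gives 160000 KB, matching the 160 MB above, and SharedFontsKb(2000) gives 62049 KB (about 62 MB). The second number only applies when every input is a copy of the same file; when the inputs are different documents, each file's fonts are separate objects.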

    Nayyer Shahbaz
    Support Developer, Aspose Sialkot Team
     
  •  02-24-2012, 3:16 PM 363951 in reply to 363674

    Re: Performance and File Size issue with Aspose PDF 6.2

    Thanks, Nayyer, for researching this in detail. If possible, can you check whether Aspose.Pdf.Kit does the same thing? I don't have that DLL anymore to test this scenario.

    Can you please leave this issue open so that your team can research it and see whether the file size can be reduced and compressed further in the future?

    Again, I appreciate the time you spent looking into this. Thanks.

     

     
  •  02-26-2012, 12:50 PM 364050 in reply to 363951

    Re: Performance and File Size issue with Aspose PDF 6.2

    Hello Ujjwal,

    I also tested concatenating 2000 copies of the 80 KB source file Sample+Document1.pdf. In my observation, with Aspose.Pdf.Kit for .NET 6.0.0 the process took 5 minutes and 38 seconds and generated a 67.7 MB file. In another attempt, I used Aspose.Pdf.Facades to perform a similar task; the process took around 1 minute and 23 seconds and generated a 153 MB file. The Aspose.Pdf for .NET version is 6.7.0.
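
    To make the two runs easier to compare, the following minimal C# sketch converts the figures above into output size per merged copy and copies merged per second. The MergeRun and MergeRunDemo types are hypothetical and exist only for this illustration; the sketch assumes 1 MB = 1000 KB, as in the earlier 160 MB estimate, and it does not call Aspose.Pdf at all.

```csharp
using System;

// Hypothetical record of one test run; the numbers come from the post above.
public sealed class MergeRun
{
    public string Name { get; }
    public int Copies { get; }
    public double Seconds { get; }
    public double OutputMb { get; }

    public MergeRun(string name, int copies, double seconds, double outputMb)
    {
        Name = name;
        Copies = copies;
        Seconds = seconds;
        OutputMb = outputMb;
    }

    // Output size per merged copy, assuming 1 MB = 1000 KB.
    public double KbPerCopy => OutputMb * 1000.0 / Copies;

    // Merge throughput in copies per second.
    public double CopiesPerSecond => Copies / Seconds;
}

public static class MergeRunDemo
{
    public static readonly MergeRun Kit =
        new MergeRun("Aspose.Pdf.Kit 6.0.0", 2000, 5 * 60 + 38, 67.7);

    public static readonly MergeRun Facades =
        new MergeRun("Aspose.Pdf.Facades (Aspose.Pdf 6.7.0)", 2000, 1 * 60 + 23, 153.0);

    public static void Main()
    {
        foreach (MergeRun run in new[] { Kit, Facades })
        {
            Console.WriteLine("{0}: {1:F1} KB per copy, {2:F1} copies/s",
                run.Name, run.KbPerCopy, run.CopiesPerSecond);
        }
    }
}
```

    On these figures the Aspose.Pdf.Kit run works out to roughly 34 KB per copy at about 6 copies per second, and the Aspose.Pdf.Facades run to about 76.5 KB per copy at about 24 copies per second: the newer run is around four times faster, but its output is more than twice as large.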

    We will definitely consider these findings while resolving this problem. Please note that I have already re-opened this issue, and as soon as we make significant progress toward resolving it, we will be more than happy to update you on the status of the correction. Please be patient and spare us a little time.

    Nayyer Shahbaz
    Support Developer, Aspose Sialkot Team
     