How to store paginated print preview of HTML/MHT

Last post 03-01-2010, 6:19 AM by aspose.notifier. 16 replies.
Page 1 of 2 (17 items)
  •  06-15-2009, 10:55 AM 183922

    How to store paginated print preview of HTML/MHT .NET

    Attachment: Present (inaccessible)

    Hello,

    I've got the trial version in order to evaluate the HTML/Word rendering capabilities of Aspose.Words.

    What I need is to store such files page by page, as in Print Preview, taking page dimensions and resolution into account. Can someone please show me a simple example of how to accomplish this?

    I've already tried the provided "Preview" example, but the rendering of HTML with tables does not look very good, and alignment attributes on images are ignored. Here is a simple example:

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
           "http://www.w3.org/TR/html4/strict.dtd">
    <html>
    <head>
    <title>Kopf, K&ouml;rper und Fu&szlig; einer Tabelle definieren</title>
    </head>
    <body>

    <h1>Betroffene Menschen</h1>

    <img src="c:\\HTMLImage.jpg" align="right"><br clear="all">

    <hr><br>

    <table border="1" rules="groups" border="4" width="100%">
      <thead>
        <tr>
          <th>Assoziation 1</th>
          <th>Assoziation 2</th>
          <th>Assoziation 3</th>
        </tr>
      </thead>
      <tfoot>
        <tr>
          <td><i>betroffen:<br>4 Mio. Menschen</i></td>
          <td><i>betroffen:<br>2 Mio. Menschen</i></td>
          <td><i>betroffen:<br>1 Mio. Menschen</i></td>
        </tr>
      </tfoot>
      <tbody>
        <tr>
          <td>Berlin</td>
          <td>Hamburg</td>
          <td>M&uuml;nchen</td>
        </tr><tr>
          <td>Milj&ouml;h</td>
          <td>Kiez</td>
          <td>Bierdampf</td>
        </tr><tr>
          <td>Buletten</td>
          <td>Frikadellen</td>
          <td>Fleischpflanzerl</td>
        </tr>
        <tr>
          <td>2. Buletten</td>
          <td>2. Frikadellen</td>
          <td>2. Fleischpflanzerl</td>
        </tr>
        <tr>
          <td>3. Buletten</td>
          <td>3. Frikadellen</td>
          <td>3. Fleischpflanzerl</td>
        </tr>
        <tr>
          <td>4. Buletten</td>
          <td>4. Frikadellen</td>
          <td>4. Fleischpflanzerl</td>
        </tr>
      </tbody>
    </table>

    <br>

    <table border="1" frame="box" rules="none" border="10" width="100%">
      <colgroup>
        <col width="4*">
        <col width="2*">
        <col width="1*">
      </colgroup>
      <thead>
        <tr>
          <th>Assoziation 1</th>
          <th>Assoziation 2</th>
          <th>Assoziation 3</th>
        </tr>
      </thead>
      <tfoot>
        <tr>
          <td><i>betroffen:<br>4 Mio. Menschen</i></td>
          <td><i>betroffen:<br>2 Mio. Menschen</i></td>
          <td><i>betroffen:<br>1 Mio. Menschen</i></td>
        </tr>
      </tfoot>
      <tbody>
        <tr>
          <td><font size="3">Berlin</font></td>
          <td>Hamburg</td>
          <td>M&uuml;nchen</td>
        </tr><tr>
          <td>Milj&ouml;h</td>
          <td>Kiez</td>
          <td>Bierdampf</td>
        </tr><tr>
          <td>Buletten</td>
          <td>Frikadellen</td>
          <td>Fleischpflanzerl</td>
        </tr>
        <tr>
          <td>2. Buletten</td>
          <td>2. Frikadellen</td>
          <td>2. Fleischpflanzerl</td>
        </tr>
        <tr>
          <td>3. Buletten</td>
          <td>3. Frikadellen</td>
          <td>3. Fleischpflanzerl</td>
        </tr>
        <tr>
          <td>4. Buletten</td>
          <td colspan="2">4. Frikadellen, Fleischpflanzerl</td>
        </tr>
      </tbody>
    </table>
    <br>
    <img src="c:\\HTMLImage.jpg">

    <hr><br>


    </body>
    </html>

    Many thanks in advance!

    Alexander

     
  •  06-16-2009, 7:06 AM 184047 in reply to 183922

    Re: How to store paginated print preview of HTML/MHT

    Hi

     

    Thanks for your request. The problem occurs because some cells in your table use colspan. When you convert HTML to DOC and a table contains rowspan or colspan, Aspose.Words represents those cells as merged cells, but each merged cell contains the same content, which causes problems during rendering or conversion to PDF. I have linked your request to the appropriate issue. You will be notified as soon as it is resolved.

    As a temporary workaround, you can try using the following code:

     

    // Open the HTML document.
    Document doc = new Document(@"Test199\HTMLTables1.html");

    // Get the collection of all cells in the document.
    NodeCollection cells = doc.GetChildNodes(NodeType.Cell, true);

    // Loop through all cells and search for merged cells.
    foreach (Cell cell in cells)
    {
        if (cell.CellFormat.HorizontalMerge == CellMerge.Previous ||
            cell.CellFormat.VerticalMerge == CellMerge.Previous)
        {
            // Remove the duplicated content from merged cells.
            cell.RemoveAllChildren();
        }
    }

    // Save all pages as an image.
    doc.SaveToImage(0, doc.PageCount, @"Test199\preview.tif", null);

     

    Hope this helps.

     

    Best regards.


    Alexey Noskov
    Developer/Technical Support
    Aspose Auckland Team
     
  •  06-16-2009, 8:12 AM 184061 in reply to 184047

    Re: How to store paginated print preview of HTML/MHT

    Many thanks for your answer!

    Now I have the following questions:

    1. Is it possible to get an image per page, in some way like "Image iImage=doc.RenderPage(<PageNumber>)"?

    2. Which HTML subset is fully supported? Some Table/Cell/Row attributes seem not to be supported. Align attributes like <img align="right"> are also not supported, as shown in the previous example.

    3. How closely will DOC documents be rendered compared to MS Word, and which versions of the Word formats are supported?

    4. What is the expected response time for HTML and Word issues for registered customers?

    5. I've read that MHTML is also supported, but for some .eml mail files (which are close to MHTML) I get the error "Unsupported Content-Type". Is it possible to disable such errors and render all text/plain and text/html parts, or is it necessary to convert all parts to standalone HTML files?

     

    Thanks in advance!

     
  •  06-16-2009, 1:11 PM 184125 in reply to 184061

    Re: How to store paginated print preview of HTML/MHT

    Hi

     

    Thanks for your inquiry.

     

    1. Yes, of course you can achieve this. You should just use another overload of the SaveToImage method. Please see the following link for more information.

    http://www.aspose.com/documentation/file-format-components/aspose.words-for-.net-and-java/aspose.words.document.savetoimage_overload_2.html

     

    2. Not all HTML features are supported yet. You can find additional information here:

    http://www.aspose.com/community/files/51/file-format-components/aspose.words-for-.net-and-java/entry108980.aspx

     

    3. It is difficult to say how close a Word document will be to the HTML, because the HTML and Word formats are different, and it is extremely difficult to produce a Word document that looks exactly like the HTML, and vice versa. It depends on your documents and their complexity.

    You can find the list of supported formats here:

    http://www.aspose.com/documentation/file-format-components/aspose.words-for-.net-and-java/file-formats-and-conversions.html

     

    4. Requests in the forum should be answered within 24 hours. Aspose.Words releases are published every 4-5 weeks and contain fixes for customers' issues. When you report a problem, it is put into the queue and we will fix it in one of the future releases. The time depends on the complexity and importance of the issue.

     

    5. There is no way to disable errors. Please attach your MHTML document here for testing. I will check it and provide you with more information.
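
    For point 1, a minimal sketch of exporting one image file per page. It assumes the same Aspose.Words version as the workaround earlier in this thread and reuses the SaveToImage(pageIndex, pageCount, fileName, options) overload shown there, with pageCount set to 1. The class name, the method names and the "preview_001.tif" naming scheme are hypothetical, and it is an assumption that the image format is chosen from the file extension, as in the TIFF example above.

```csharp
using System.IO;
using Aspose.Words;

public static class PageImageExporter
{
    // Builds a per-page file name such as "preview_001.tif".
    // pageIndex is zero-based; the file name shows a one-based page number.
    public static string PageFileName(string folder, string baseName, int pageIndex, string extension)
    {
        string name = string.Format("{0}_{1:D3}.{2}", baseName, pageIndex + 1, extension);
        return Path.Combine(folder, name);
    }

    // Renders every page of the document into its own image file.
    public static void SavePagesAsImages(string inputPath, string outputFolder)
    {
        Document doc = new Document(inputPath);

        for (int pageIndex = 0; pageIndex < doc.PageCount; pageIndex++)
        {
            string fileName = PageFileName(outputFolder, "preview", pageIndex, "tif");

            // Same overload as in the workaround above, but one page per call.
            // Passing null for the options keeps the default rendering settings.
            doc.SaveToImage(pageIndex, 1, fileName, null);
        }
    }
}
```

    It could be called as PageImageExporter.SavePagesAsImages(@"Test199\HTMLTables1.html", @"Test199"). Note that the evaluation version may limit which pages are rendered.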

     

    Best regards.


    Alexey Noskov
    Developer/Technical Support
    Aspose Auckland Team
     
  •  06-17-2009, 3:37 AM 184210 in reply to 184125

    Re: How to store paginated print preview of HTML/MHT

    Thanks for your quick response!

     

    I did not formulate one of my questions correctly :)

     

    ... 3. How closely will the Aspose engine render MS Word .DOC documents compared to MS Word, and which versions of the Word formats are supported?

    This question was not related to HTML.

    Another question: is it planned to support the full implementation of HTML 4 in the near future?

    Thanks!

     
  •  06-17-2009, 2:35 PM 184329 in reply to 184210

    Re: How to store paginated print preview of HTML/MHT

    Hi

     

    Thanks for your request.

     

    1. The Aspose.Words rendering engine renders Word documents very close to the original; in most cases the rendered image looks exactly like the original document. If you have any problems with rendering your documents, please report them in this forum.

     

    2. Aspose.Words supports DOC format (starting from MS Word 97), DOCX (Word 2007 format), WordML (Word 2003 XML format), RTF, ODT, HTML, EPUB and MHTML (which are based on HTML), and of course export to TXT and PDF.

     

    3. It is extremely difficult to promise that HTML 4 will be fully supported, because HTML is not a document format like DOC; it is a web page format, and it is difficult to map between HTML features and Word document features.

     

    Best regards.


    Alexey Noskov
    Developer/Technical Support
    Aspose Auckland Team
     
  •  07-03-2009, 3:49 AM 186730 in reply to 184329

    Re: How to store paginated print preview of HTML/MHT

    Thanks for your answer!

    alexey.noskov:
    3. It is extremely difficult to promise that HTML 4 will be fully supported, because HTML is not a document format like DOC; it is a web page format, and it is difficult to map between HTML features and Word document features.

     

    I will try to simplify my question: how closely can you support the HTML 4 set of tags, particularly for images, alignment, color, font styles, font sizes and table borders/lines/rules?

     

    Thanks in advance.

     
  •  07-03-2009, 6:34 AM 186764 in reply to 186730

    Re: How to store paginated print preview of HTML/MHT

    Hi

     

    Thanks for your request. You can find an approximate list of what is supported in HTML import/export here:

    http://www.aspose.com/community/files/51/file-format-components/aspose.words-for-.net-and-java/entry108980.aspx

     

    But you should note that this list could be out of date, since we are working on improving our HTML import/export modules.

     

    Best regards.


    Alexey Noskov
    Developer/Technical Support
    Aspose Auckland Team
     
  •  07-03-2009, 7:43 AM 186781 in reply to 186764

    Re: How to store paginated print preview of HTML/MHT

    Attachment: Present (inaccessible)

    Thanks!

     

    alexey.noskov:

    Hi

     

    Thanks for your request. You can find an approximate list of what is supported in HTML import/export here:

    http://www.aspose.com/community/files/51/file-format-components/aspose.words-for-.net-and-java/entry108980.aspx

     

    But you should note that this list could be out of date, since we are working on improving our HTML import/export modules.

     

    Best regards.

     

    It would be very nice if you could make some small improvements soon.

    I've documented the results of A4 page rendering with Aspose.Words (via screenshots, because the trial version prints only the first page), IE7 and Word 2003 in the attached files. IE7 looks better than Word 2003, e.g. for table footers.

    As for our requirements: at the moment we send HTML mails that we create ourselves (because of this, we can slightly change the structure in some places, but not overall), receive the HTML mails completed by the user, and archive these documents. Later we must be able to produce page images from any HTML mail. At the same time we must produce images from a number of other formats such as .DOC.

    It would also be very helpful if we could render HTML mails with embedded images (as MHT or EML) without conversion to HTML files. Is that possible?

     

    Thanks in advance!

     
  •  07-03-2009, 8:29 AM 186796 in reply to 186781

    Re: How to store paginated print preview of HTML/MHT

    Hi

     

    Thank you for additional information.

     

    1. You can improve the conversion a little by using the workaround I suggested earlier. As far as I can see, you did not use it when converting your HTML to TIF.

     

    Document doc = new Document(@"Test001\T.html");
    RemoveContentFromMergedCells(doc);
    doc.SaveToImage(0, doc.PageCount, @"Test001\out.tif", null);

    ===================================================================

    /// <summary>
    /// Remove content from merged cells.
    /// </summary>
    public void RemoveContentFromMergedCells(Document doc)
    {
        // Get the collection of all cells in the document.
        NodeCollection cells = doc.GetChildNodes(NodeType.Cell, true);
        foreach (Cell cell in cells)
        {
            // Check whether the cell is merged with the previous one.
            if (cell.CellFormat.HorizontalMerge == CellMerge.Previous ||
                cell.CellFormat.VerticalMerge == CellMerge.Previous)
            {
                // Remove content from the cell.
                cell.RemoveAllChildren();
            }
        }
    }

     

    2. Aspose.Words supports the MHTML format, so you can open MHTML files directly, without converting them to HTML.

     

    Best regards.


    Alexey Noskov
    Developer/Technical Support
    Aspose Auckland Team
     
  •  07-08-2009, 4:33 AM 187438 in reply to 186796

    Re: How to store paginated print preview of HTML/MHT

    Hi!

    Many thanks for your help! This solved the rendering problem with merged cells.

    Now I am interested in solving some issues with, e.g., borders, rules, alignment, line breaking inside headers, table headers and footers, cell padding, etc., as shown in the screenshots in the attached archive file.

    alexey.noskov:
    But you should note that this list could be out of date, since we are working on improving our HTML import/export modules.

    Could you please tell me what is on the HTML improvement to-do list and what the planned deadlines are?

    Thanks in advance!

     
  •  07-08-2009, 4:56 AM 187446 in reply to 187438

    Re: How to store paginated print preview of HTML/MHT

    Hi

     

    Thanks for your inquiry. As I told you earlier, it is extremely difficult and often impossible to produce Word documents (and document previews) that look exactly like the source HTML. This is because HTML documents are not Word documents: an HTML document is a single continuous page, and the format is designed for the Web, not for display on separate pages.

    Unfortunately, I currently cannot provide you with a list of planned improvements and estimates.

     

    Best regards.


    Alexey Noskov
    Developer/Technical Support
    Aspose Auckland Team
     
  •  07-08-2009, 5:54 AM 187467 in reply to 187446

    Re: How to store paginated print preview of HTML/MHT

    Thanks for the quick response!

    We do not plan to produce Word documents from HTML documents at all :)

    We only want to print them or store them as images, in other words, paginated rendering of HTML.

    For example, your product already supports paginated preview and printing of HTML very well, with only a few small restrictions or missing features. As I can see in the table, you support alignment for paragraphs but not for images. All rules in the table are always rendered with the same width, although it should be possible to render the table without some (or all) rules, as defined in the HTML table/row options. The same applies to table footers and headers, which must always be at the bottom or top of the table. The unwanted text breaking inside the header may be just a small bug. So I think only a small subset of missing features is needed to meet most of the requirements of this task; the other missing features are not of high importance for rendering.

    Thanks!

     
  •  07-08-2009, 8:44 AM 187515 in reply to 187467

    Re: How to store paginated print preview of HTML/MHT

    Hi

     

    Thank you for the additional information. Maybe you can try some post-processing. You can create a Document from the HTML and then format the Document's elements as needed. If you need help with this, I can try to help you. But you should note that workarounds can work for one document but not for others.
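
    As an illustration of such post-processing, here is a minimal sketch that right-aligns every paragraph containing an inline image, to approximate <img align="right"> from the sample HTML. The class and method names are hypothetical. It assumes that all images in the document should be right-aligned and that each image sits in its own paragraph, which may hold for the sample in this thread but not for arbitrary documents.

```csharp
using Aspose.Words;
using Aspose.Words.Drawing;

public static class HtmlPostProcessing
{
    // Right-aligns every paragraph that holds an inline image.
    // Returns the number of images whose paragraph was changed.
    // Assumption: every image in the document should be right-aligned.
    public static int RightAlignInlineImages(Document doc)
    {
        int changed = 0;

        // Collect all shapes in the document, including nested ones.
        NodeCollection shapes = doc.GetChildNodes(NodeType.Shape, true);
        foreach (Shape shape in shapes)
        {
            // Only touch inline pictures that live inside a paragraph.
            if (shape.HasImage && shape.IsInline && shape.ParentParagraph != null)
            {
                shape.ParentParagraph.ParagraphFormat.Alignment = ParagraphAlignment.Right;
                changed++;
            }
        }

        return changed;
    }
}
```

    Call it after loading the HTML and before saving to an image, for example HtmlPostProcessing.RightAlignInlineImages(doc). Any text in the same paragraph as the image is right-aligned too, so this is only a rough approximation of the HTML align attribute.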

     

    Best regards.


    Alexey Noskov
    Developer/Technical Support
    Aspose Auckland Team
     
  •  07-23-2009, 7:51 AM 189925 in reply to 187515

    Re: How to store paginated print preview of HTML/MHT

    Thanks!

    You are right, of course we can do some post-processing for our own documents, or for replies if not many modifications have been made.

    But now we have a requirement to process completely arbitrary, user-created incoming mail; for such a scenario we cannot apply any post-processing.

    By the way, you mentioned that it's possible to process a single-page archive, but I get an exception when I try to open such a file: http://www.aspose.com/community/forums/189924/unexpected-error-occured-if-opened-mht-single-webpage-document/showthread.aspx#189924

    Thanks in advance!

     