HTML to RTF - Saving images

Last post 11-08-2011, 2:47 PM by alexey.noskov. 14 replies.
  •  01-19-2011, 2:31 PM 279796

    HTML to RTF - Saving images

    I have HTML with images that I want to convert to RTF.  When I save the stream in RTF format, I get everything but the images.

    Here is my code:
            Dim HTML As Byte() = System.Text.Encoding.GetEncoding("iso-8859-1").GetBytes(oEditor.Html)
            Dim memoryStream As System.IO.MemoryStream = New System.IO.MemoryStream(HTML)

            Dim doc As New Document(memoryStream)

            Dim options As New RtfSaveOptions()
            Dim dstStream As New MemoryStream()
            doc.Save(dstStream, options)
            doc.Save("c:\temp\rtf\saved-from-ae.rtf", options)


    Tim H
     
  •  01-19-2011, 7:08 PM 279824 in reply to 279796

    Re: HTML to RTF - Saving images

    Hi Tim,

    Thanks for your inquiry.

    Could you please attach your HTML document here? We will gladly provide you with some further feedback.

    Thanks,


    Adam Skelton
    Programming Writer
    Aspose Auckland Team
     
  •  01-21-2011, 8:58 AM 280245 in reply to 279824

    Re: HTML to RTF - Saving images

    Hi,

     

    Here is the HTML code (kind of ugly), which contains an <img> referencing a file on disk:

    <html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /><meta http-equiv="Content-Style-Type" content="text/css" /><meta name="generator" content="Aspose.Words for .NET 9.6.0.0" /><title></title></head><body><div><p style="margin:0pt"><span style="color:#ff0000; font-family:Cambria; font-size:12pt; font-weight:bold">Evaluation Only. Created with Aspose.Words. Copyright 2003-2010 Aspose Pty Ltd.</span></p><h2 style="font-weight:normal; margin:10pt 0pt 0pt; page-break-after:avoid; page-break-inside:avoid"><span style="color:#4f81bd; font-family:Cambria; font-size:13pt; font-weight:bold">Testing that the Applications are Prepared for HTTPS Communication</span></h2><p style="margin:0pt 0pt 10pt 36pt"><span style="color:#000000; font-family:'Times New Roman'; font-size:12pt; font-weight:normal">Replace “yourdomain.com” with the domain name that the SSL certificate was issued for.</span></p><p style="margin:0pt 0pt 10pt 36pt"><img src="/Adept8.3/tmp/ADM/Aspose.Words.3173c47b-e356-42c3-9ee9-240f37134dc6.001.png" width="290" height="174" alt="" /></p><p style="margin:0pt 0pt 0pt 72pt; text-indent:-18pt"><span style="color:#7030a0; font-family:'Courier New'; font-size:12pt; font-weight:normal">o</span><span style="color:#7030a0; font-family:'Courier New'; font-size:12pt; font-weight:normal">              </span><span style="color:#7030a0; font-family:'Times New Roman'; font-size:12pt; font-weight:normal">TEST: GlassFish</span><br /><span style="color:#0000ff; font-family:'Times New Roman'; font-size:12pt; font-weight:normal; text-decoration:underline">https://yourdomain.com:8181/wsclient/servlet/DMS &lt;https://localhost:8181/wsclient/servlet/DMS&gt;</span><span style="color:#000000; font-family:'Times New Roman'; font-size:12pt; font-weight:normal; text-decoration:none"> - The page should load without error and the certificate should not cause a warning.</span></p><p style="margin:0pt 0pt 0pt 72pt; text-indent:-18pt"><span style="color:#7030a0; 
font-family:'Courier New'; font-size:12pt; font-weight:normal; text-decoration:none">o</span><span style="color:#7030a0; font-family:'Courier New'; font-size:12pt; font-weight:normal; text-decoration:none">              </span><span style="color:#7030a0; font-family:'Times New Roman'; font-size:12pt; font-weight:normal; text-decoration:none">TEST: GlassFish</span><br /><span style="color:#0000ff; font-family:'Times New Roman'; font-size:12pt; font-weight:normal; text-decoration:underline">https://yourdomain.com:8181/wsclient/servlet/VueServlet &lt;https://localhost:8181/wsclient/servlet/VueServlet&gt;</span><span style="color:#000000; font-family:'Times New Roman'; font-size:12pt; font-weight:normal; text-decoration:none"> - The page should load without error and the certificate should not cause a warning.</span></p><p style="margin:0pt 0pt 0pt 72pt; text-indent:-18pt"><span style="color:#7030a0; font-family:'Courier New'; font-size:12pt; font-weight:normal; text-decoration:none">o</span><span style="color:#7030a0; font-family:'Courier New'; font-size:12pt; font-weight:normal; text-decoration:none">              </span><span style="color:#7030a0; font-family:'Times New Roman'; font-size:12pt; font-weight:normal; text-decoration:none">TEST: Adept Web Services</span><br /><span style="color:#0000ff; font-family:'Times New Roman'; font-size:12pt; font-weight:normal; text-decoration:underline">https://yourdomain.com/Adept/BluePrintWebService/BPWS.asmx?op=getDmsConfig &lt;https://localhost/Adept/BluePrintWebService/BPWS.asmx?op=getDmsConfig&gt;</span><span style="color:#000000; font-family:'Times New Roman'; font-size:12pt; font-weight:normal; text-decoration:none"> - The page should load without error and the certificate should not cause a warning. 
</span></p><p style="margin:0pt 0pt 0pt 72pt; text-indent:-18pt"><span style="color:#7030a0; font-family:'Courier New'; font-size:12pt; font-weight:normal; text-decoration:none">o</span><span style="color:#7030a0; font-family:'Courier New'; font-size:12pt; font-weight:normal; text-decoration:none">              </span><span style="color:#7030a0; font-family:'Times New Roman'; font-size:12pt; font-weight:normal; text-decoration:none">TEST: Adept Web Services</span><br /><span style="color:#0000ff; font-family:'Times New Roman'; font-size:12pt; font-weight:normal; text-decoration:underline">https://yourdomain.com/Adept/Service_Adept.asmx?op=HelloWorld &lt;https://localhost/Adept/Service_Adept.asmx?op=HelloWorld&gt;</span><span style="color:#000000; font-family:'Times New Roman'; font-size:12pt; font-weight:normal; text-decoration:none"> </span></p><p style="margin:0pt 0pt 10pt 108pt; text-indent:-18pt"><span style="color:#000000; font-family:Wingdings; font-size:12pt; font-weight:normal; text-decoration:none">§</span><span style="color:#000000; font-family:Wingdings; font-size:12pt; font-weight:normal; text-decoration:none">              </span><span style="color:#000000; font-family:'Times New Roman'; font-size:12pt; font-weight:normal; text-decoration:none">Login with AE and View a file.</span></p><p style="margin:0pt"><span style="color:#7030a0; font-family:'Times New Roman'; font-size:12pt; font-weight:normal; text-decoration:none">TEST: AutoVue.  This test will not load a file, but will test whether the viewer has been configured properly.</span><span style="color:#000000; font-family:'Times New Roman'; font-size:12pt; font-weight:normal; text-decoration:none"> </span></p></div></body></html>


    Tim H
     
  •  01-21-2011, 11:07 AM 280291 in reply to 280245

    Re: HTML to RTF - Saving images

    Hi

     

    Thanks for your request. I suppose the problem occurs because Aspose.Words cannot find the image at the specified location. Moreover, the path to the image is relative. Have you tried specifying the full path to the image? This should fix the problem.
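
To illustrate the point about relative paths: a src value such as /Adept8.3/tmp/ADM/image.png only identifies a resource once it is combined with a base address. The sketch below uses only System.Uri to show how such a value resolves. The class name, method name, and the http://localhost/ base are illustrative assumptions and are not part of Aspose.Words.

```csharp
using System;

public static class ImagePathDemo
{
    // Combines a base address with the value of an <img src="..."> attribute.
    // An absolute src comes back unchanged; a relative one is resolved against the base.
    public static string ToAbsolute(string baseAddress, string src)
    {
        Uri baseUri = new Uri(baseAddress, UriKind.Absolute);
        Uri resolved = new Uri(baseUri, src);
        return resolved.AbsoluteUri;
    }
}
```

For example, ImagePathDemo.ToAbsolute("http://localhost/", "/Adept8.3/tmp/ADM/image.png") yields http://localhost/Adept8.3/tmp/ADM/image.png.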

     

    Best regards,


    Alexey Noskov
    Developer/Technical Support
    Aspose Auckland Team
     
  •  01-21-2011, 1:26 PM 280326 in reply to 280291

    Re: HTML to RTF - Saving images

    Let me give more background on what I'm testing.

    I have a small RTF document stored in a database (its raw text is stored in a varchar column in a table).

    Using your component, I'm converting the RTF to HTML, so I can display it in an HTML editor control:

    Dim sMemo As String = docrec.GetTextMemoValue()

    Dim RTF As Byte() = System.Text.Encoding.GetEncoding("iso-8859-1").GetBytes(docrec.GetTextMemoValue())

    Dim memoryStream As System.IO.MemoryStream = New System.IO.MemoryStream(RTF)

    Dim doc As New Document(memoryStream)

    ' Create and pass the object which implements the handler methods.

    Dim options As New HtmlSaveOptions(SaveFormat.Html)

    options.ExportTextInputFormFieldAsText = True

    options.ImagesFolder = "C:\inetpub\wwwroot\Adept8.3\tmp\ADM"

    options.ImagesFolderAlias = "/Adept8.3/tmp/ADM/"

    Dim dstStream As New MemoryStream()

    doc.Save(dstStream, options)

    Dim pos = dstStream.Position

    dstStream.Position = 0

    Dim reader As New StreamReader(dstStream)

    Dim str = reader.ReadToEnd()

    oEditor.Html = str

    oEditor.ID = "MemoField"

    The user can edit the text in the HTML editor and when they click "Save" I need to convert the HTML back to RTF:

    Dim HTML As Byte() = System.Text.Encoding.GetEncoding("iso-8859-1").GetBytes(oEditor.Html)

    Dim memoryStream As System.IO.MemoryStream = New System.IO.MemoryStream(HTML)

    Dim doc As New Document(memoryStream)

    Dim options As New RtfSaveOptions()

    Dim dstStream As New MemoryStream()

    doc.Save(dstStream, options)

    doc.Save("c:\temp\rtf\saved-from-ae.rtf", options)

     

    The problem is that the images are saved to disk by your component:

     

    options.ImagesFolder = "C:\inetpub\wwwroot\Adept8.3\tmp\ADM"

    options.ImagesFolderAlias = "/Adept8.3/tmp/ADM/"

    If I change the ImagesFolderAlias to:

    options.ImagesFolderAlias = "http://localhost/Adept8.3/tmp/ADM/"

    I could get burned down the road, because I cannot guarantee that the image path is http://localhost/.  It could be a fully-qualified domain name or it could be https. 

     

    Is there a way to specify the directory where images are located?


    Tim H
     
  •  01-21-2011, 4:47 PM 280351 in reply to 280326

    Re: HTML to RTF - Saving images

    Hi Tim,

    Thanks for your inquiry.

    I think you need to set the BaseUri in the LoadOptions when loading the HTML document. Please see the API page here for details.
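
Since the site may be served over https or under a fully qualified domain name, the base address used as BaseUri does not have to be hard-coded. A minimal sketch, assuming the code runs inside a web request and the request URL is available (for example HttpContext.Current.Request.Url in ASP.NET). The class and method names are hypothetical.

```csharp
using System;

public static class BaseUriHelper
{
    // Returns the scheme, host, and any non-default port of the request URL,
    // followed by a trailing slash, so that site-relative image paths can be
    // resolved against it.
    public static string SiteRoot(Uri requestUrl)
    {
        return requestUrl.GetLeftPart(UriPartial.Authority) + "/";
    }
}
```

The resulting string could then be supplied as the base URI when constructing LoadOptions for the HTML document, so the same code works whether the page was requested via http://localhost/ or an https host name.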

    Thanks,


    Adam Skelton
    Programming Writer
    Aspose Auckland Team
     
  •  11-07-2011, 8:37 AM 340580 in reply to 280351

    Re: HTML to RTF - Saving images

    Working with the Aspose.Words component to convert HTML to RTF.  Local images are not converting properly.  The image is located physically in c:\inetpub\wwwroot.  The resulting RTF shows a broken image.  Please advise.

     

            string sHtml = "<img src=\"/help.jpg\" alt=\"\" />";

     

            string sText = string.Empty;

            string sBaseUrl = null;

            System.Web.HttpRequest oRequester = HttpContext.Current.Request;

     

            #region Convert HTML to RTF with Aspose.Words

            byte[] HTML = System.Text.Encoding.GetEncoding("iso-8859-1").GetBytes(sHtml);

            System.IO.MemoryStream memoryStream = new System.IO.MemoryStream(HTML);

            LoadOptions loadOptions = new LoadOptions(Aspose.Words.LoadFormat.Html, "", "http://localhost/");

            Document doc = new Document(memoryStream, loadOptions);

            RtfSaveOptions options = new RtfSaveOptions();

            MemoryStream dstStream = new MemoryStream();

            doc.Save(dstStream, options);

            dstStream.Position = 0;

     

            StreamReader reader = new StreamReader(dstStream);

            sText = reader.ReadToEnd();

    /*

     * {\rtf1\ansi\ansicpg1252\uc0\stshfdbch0\stshfloch0\stshfhich0\stshfbi0\deff0\adeff0{\fonttbl{\f0\fnil\fcharset0 Times New Roman;}}{\colortbl;}{\stylesheet{\s0\snext0\styrsid8412110\sqformat\spriority0\ltrpar\li0\lin0\ri0\rin0\ql\faauto\rtlch\afs24\ltrch\fs24 Normal;}{\*\cs10\additive\ssemihidden\spriority0 Default Paragraph Font;}}{\*\generator Aspose.Words for .NET 9.6.0.0;}{\info\version0\edmins0\nofpages0\nofwords0\nofchars0\nofcharsws0}\deflang1033\deflangfe2052\adeflang1025\jexpand\showxmlerrors1\validatexml1\viewscale100\fet0\widowctrl\nospaceforul\nolnhtadjtbl\alntblind\lyttblrtgr\nogrowautofit\dntblnsbdb\noxlattoyen\wrppunct\nobrkwrptbl\expshrtn\snaptogridincell\asianbrkrule\htmautsp\noultrlspc\useltbaln\splytwnine\ftnlytwnine\lytcalctblwd\allowfieldendsel\newtblstyruls\lnbrkrule\formshade\nojkernpunct\dghspace180\dgvspace180\dghorigin1800\dgvorigin1440\dghshow1\dgvshow1\dgmargin\pgbrdrhead\pgbrdrfoot\sectd\ltrsect\sectdefaultcl\pard\plain\itap0\s0\ltrpar\li0\lin0\ri0\rin0\ql\faauto\rtlch\afs24\ltrch\fs24{\rtlch\afs24\ltrch\fs24{\*\shppict{\pict{\*\picprop\shplid1025{\sp{\sn fLayoutInCell}{\sv 1}}{\sp{\sn posrelh}{\sv 2}}{\sp{\sn posrelv}{\sv 2}}{\sp{\sn shapeType}{\sv 75}}}\pngblip\picw847\pich847\picwgoal480\pichgoal480\picscalex100\picscaley100\piccropl0\piccropr0\piccropt0\piccropb0\bliptag1766401221{\*\blipuid 
694924c50c2e81711f287ad60b8ef0e5}89504e470d0a1a0a0000000d494844520000002000000020080300000044a48ac600000300504c5445000000ffffff808080c0c0c0ff0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000d84498c0000000097048597300000ec300000ec301c76fa8640000004249444154789c63602400188699022614804d01923666868153c0c202c1b84d606101c9e3b3022c4f810984dd40c817b456c08c04b0296040015814e000a30a20000034d404691ed77d330000000049454e44ae426082}}{\nonshppict{\pict\pngblip\picw847\pich847\picwgoal480\pichgoal480\picscalex100\picscaley100\piccropl0\piccropr0\picc
ropt0\piccropb0\bliptag1766401221{\*\blipuid 694924c50c2e81711f287ad60b8ef0e5}89504e470d0a1a0a0000000d494844520000002000000020080300000044a48ac600000300504c5445000000ffffff808080c0c0c0ff0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000d84498c0000000097048597300000ec300000ec301c76fa8640000004249444154789c63602400188699022614804d01923666868153c0c202c1b84d606101c9e3b3022c4f810984dd40c817b456c08c04b0296040015814e000a30a20000034d404691ed77d330000000049454e44ae426082}}}{\rtlch\afs24\ltrch\fs24\par}{\*\latentstyles\lsdstimax267\lsdlockeddef
0\lsdsemihiddendef1\lsdunhideuseddef1\lsdqformatdef0\lsdprioritydef99{\lsdlockedexcept}}}

     */

            #endregion

     

     


    Tim H
     
  •  11-07-2011, 9:31 AM 340589 in reply to 340580

    Re: HTML to RTF - Saving images

    Hello

     

    Thanks for your inquiry. Could you please attach your input HTML and output RTF here for testing? I’ll check the problem on my side and provide you with more information.

     

    Best regards,


    Andrey Noskov
    Developer/Technical Support
    Aspose Auckland Team
     
  •  11-07-2011, 9:45 AM 340592 in reply to 340589

    Re: HTML to RTF - Saving images

    Both are already posted.  The html is in a string (very simple) and the resulting RTF is in a comment at the end.
    Tim H
     
  •  11-07-2011, 9:51 AM 340594 in reply to 340592

    Re: HTML to RTF - Saving images

    Hello

     

    Thank you for additional information. I cannot reproduce the problem on my side using the latest version of Aspose.Words (10.6.0). Here are my steps:

    1. I created a virtual directory on my machine.

    2. Then I added images to this directory.

    3. Then I converted the HTML to RTF.

     

    Moreover I have tried using the following code, and it works fine too:

     

    Document doc = new Document();

     

    DocumentBuilder builder = new DocumentBuilder(doc);

     

    string baseUri  = "<base href='http://localhost'/>";

    builder.InsertHtml(baseUri + "<img alt='' src='../images/test.jpg/>");

     

    doc.Save("C:\\Temp\\out.rtf");

     

    Best regards, 


    Andrey Noskov
    Developer/Technical Support
    Aspose Auckland Team
     
  •  11-07-2011, 1:41 PM 340643 in reply to 340594

    Re: HTML to RTF - Saving images

    I found that your sample code DOES work (though it has a typo), but I am using a Web Application that has explicit permissions, which is quite different than a virtual directory.

    I'm confident that this is a permissions issue, as images from other domains and virtual directories do work, but images in my web application do not. 

    Can you please help resolve this?

    Quite simply, the virtual directory is converted to a web app, with the ASP.Net v4.0 Classic App Pool, and a set user using Windows authentication.
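
If the image request is rejected because the conversion code is not authenticated against the web application, one possible workaround is for the application to read the image bytes itself (for instance straight from disk) and embed them in the HTML as a data: URI before the document is loaded. This is only a sketch: whether the installed Aspose.Words version imports data: URIs is an assumption to verify, and the class and method names are hypothetical.

```csharp
using System;
using System.IO;

public static class ImageEmbedder
{
    // Maps a file extension to a MIME type; unknown extensions fall back to a generic type.
    public static string MimeTypeFor(string path)
    {
        switch (Path.GetExtension(path).ToLowerInvariant())
        {
            case ".png": return "image/png";
            case ".jpg":
            case ".jpeg": return "image/jpeg";
            case ".gif": return "image/gif";
            case ".bmp": return "image/bmp";
            default: return "application/octet-stream";
        }
    }

    // Builds a data: URI that can replace the original src attribute value.
    public static string ToDataUri(byte[] imageBytes, string mimeType)
    {
        return "data:" + mimeType + ";base64," + Convert.ToBase64String(imageBytes);
    }
}
```

A call such as ImageEmbedder.ToDataUri(File.ReadAllBytes(@"c:\inetpub\wwwroot\help.jpg"), ImageEmbedder.MimeTypeFor("help.jpg")) would produce the replacement value for src="/help.jpg". Reading from disk sidesteps the HTTP authentication step entirely, provided the application pool identity has read access to the file.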

     


    Tim H
     
  •  11-08-2011, 1:18 AM 340706 in reply to 340643

    Re: HTML to RTF - Saving images

    Hi

     

    Thank you for the additional information. I think you can use the approach suggested here to work around the problem:

    http://www.aspose.com/community/forums/post/326743/using-aspose.words-for-java-with-https-v3.aspx

     

    Hope this helps.

     

    Best regards,


    Alexey Noskov
    Developer/Technical Support
    Aspose Auckland Team
     
  •  11-08-2011, 9:53 AM 340821 in reply to 340706

    Re: HTML to RTF - Saving images

    Thank you for the suggestion; however, it doesn't seem to work: the regular expression doesn't find any of the image tags in my HTML.

    I'm not great at regex and I tried a couple of variations:

    <img[^>]+src[\\s='\"]+([^\"'>\\s]+)/is

    "<img\\s+src\\s*=\\s*[\"']([^\"']+)[\"']\\s*/*>"

     

    So, with the issue at hand, are you saying that there is a known problem with websites that use authentication?

    If I get your sample working, won't I have to manage each of the image types, e.g. gif, png, jpg, bmp, etc.?  Why doesn't the component handle this automatically?
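
Regarding the two patterns quoted above: the trailing /is in the first one is Perl/PHP-style flag syntax, which .NET treats as literal characters (flags are passed as RegexOptions instead). The second pattern only matches when src is the sole attribute, so a tag like <img src="/help.jpg" alt="" /> is skipped. The following is a hedged sketch of a pattern that accepts src at any position in the tag. The class and method names are illustrative, and a regular expression is not a full HTML parser, so unusual markup (for example unquoted attribute values) will be missed.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ImgSrcFinder
{
    // Matches <img ... src="value"> or <img ... src='value'>, case-insensitively,
    // with src allowed anywhere among the tag's attributes.
    private static readonly Regex ImgSrc = new Regex(
        @"<img\b[^>]*?\ssrc\s*=\s*[""']([^""']+)[""']",
        RegexOptions.IgnoreCase);

    // Returns the src value of every <img> tag found in the given HTML.
    public static List<string> FindSources(string html)
    {
        List<string> sources = new List<string>();
        foreach (Match m in ImgSrc.Matches(html))
        {
            sources.Add(m.Groups[1].Value);
        }
        return sources;
    }
}
```

The pattern captures the path regardless of file extension, so gif, png, jpg, and bmp images do not need separate expressions.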


    Tim H
     
  •  11-08-2011, 1:51 PM 340871 in reply to 340821

    Re: HTML to RTF - Saving images

    Ok, forget my last post - I figured out how to make it work using some of your suggestions.

    Let's consider this resolved.


    Tim H
     
  •  11-08-2011, 2:47 PM 340887 in reply to 340871

    Re: HTML to RTF - Saving images

    Hi 

     

    It is great that you managed to resolve the problem. Please let us know if you need more assistance; we will be glad to help you.

     

    Best regards,


    Alexey Noskov
    Developer/Technical Support
    Aspose Auckland Team
     