Preserving mergefield in html format

Hi Aspose Team!

Is there a way to preserve mergefield information when the document is saved as html?
So that, after saving it as html, I can do a mailmerge?

Here is my scenario.
On my web site, users can submit Word templates that I mail merge to produce HTML reports.
Some of my users are proficient with HTML and are asking whether they can edit the output HTML, so that they can control the final result.
But after they edit the html, I still need to be able to do a mailmerge.

Is it possible?

Hi

You can try using the following code as a workaround. This code replaces «FieldName» text with merge fields.

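using System.Collections;
using System.Text.RegularExpressions;
using Aspose.Words;

// Maps each matched Run to the field name found in it.
Hashtable mergeFields = new Hashtable();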

public void TestMailMergehtml_106338()
{
    //Convert doc to html
    Document doc = new Document(@"419_106338_nicson\in.doc");
    doc.SaveOptions.ExportPrettyFormat = true;
    doc.Save(@"419_106338_nicson\out.html", SaveFormat.Html);
    //Edit html
    //.................................
    //open changed html
    Document doc1 = new Document(@"419_106338_nicson\out.html");
    //Code that will replace «FieldName» text with mergefield
    Regex clauseRegex = new Regex(@"«(?<FieldName>.*?)»");
    doc1.Range.Replace(clauseRegex, new ReplaceEvaluator(ReplaceEvaluator_106338), true);
    if (mergeFields.Count > 0)
    {
        DocumentBuilder builder = new DocumentBuilder(doc1);
        foreach (Run run in mergeFields.Keys)
        {
            run.Text = string.Empty;
            builder.MoveTo(run);
            builder.InsertField(String.Format(@"MERGEFIELD {0} \* MERGEFORMAT", mergeFields[run]), String.Format("«{0}»", mergeFields[run]));
        }
    }
    //Simple mailmerge
    string[] test = doc1.MailMerge.GetFieldNames();
    doc1.MailMerge.Execute(test, test);
    doc1.Save(@"419_106338_nicson\out.doc");
}

private ReplaceAction ReplaceEvaluator_106338(object sender, ReplaceEvaluatorArgs e)
{
    Match match = e.Match;
    Group fieldNameGroup = match.Groups["FieldName"];
    string fieldName = fieldNameGroup.Value;
    mergeFields.Add((e.MatchNode as Run), fieldName);
    return ReplaceAction.Skip;
}
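
The evaluator above reads the field name from a regex group named FieldName, so the pattern has to define that group. As a minimal sketch of just that part (plain .NET, no Aspose.Words; the class name, method name and sample text are made up for illustration), this shows what the pattern captures:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PlaceholderDemo
{
    // The named group "FieldName" captures the text between « and ».
    private static readonly Regex Placeholder = new Regex(@"«(?<FieldName>.*?)»");

    // Returns the field names of all «...» placeholders found in the text.
    public static List<string> ExtractFieldNames(string text)
    {
        List<string> names = new List<string>();
        foreach (Match match in Placeholder.Matches(text))
            names.Add(match.Groups["FieldName"].Value);
        return names;
    }

    public static void Main()
    {
        // Hypothetical sample text standing in for the edited HTML content.
        foreach (string name in ExtractFieldNames("Dear «FirstName» «LastName»,"))
            Console.WriteLine("MERGEFIELD " + name);
    }
}
```

The lazy quantifier (.*?) keeps each match inside a single pair of « » even when several placeholders sit on the same line.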

I hope that this will help you to solve your task.

Best regards.

Thanks, it works great!
As always, your support is wonderful!

Hi,
Along the same lines, is there a way to make mail merge with regions work?

For simple merge fields the method is super, but will it work if I have a document with merge regions?

Hi

Yes, this method will work fine for regions too.

Best regards.

Hi,

I tried to use the solution you suggested in order to preserve the merge fields. I have edited my html file and it looks like the attached image (beforemergehtml.jpg).

After I execute the merge, the html I receive looks like the attached image (aftermergehtml.jpg).

Everything I have written to the right of the merge fields is lost.

Can you help me?

Regards,
Eleni

Hi

Thanks for your request. Could you please attach your input and output documents here and provide simple code or an application that will allow me to reproduce the problem? I will check the issue and provide you with more information.

Best regards.

Hi,

I have created a demo project that demonstrates my issue.

There are two buttons. The first button converts a doc file (lena.doc) with merge fields to an html file (lena.html). Then I open the html file, edit it, and save it as html under a different name (lena1.html).

Then I hit the second button and try to merge the lena1.html file with some fields (you will see the code in detail). Then I save the merged file as lena2.html.

The problem is that the merged fields are not displayed correctly.

Can you help me?

Regards,

Eleni

Hi

Thank you for the additional information. The problem might occur because your code assumes that the merge field text occupies only one run, but it does not. You can inspect the structure of your documents using DocumentExplorer (an Aspose.Words demo application). Also, I think the code example provided here could be useful for you:
https://docs.aspose.com/words/net/find-and-replace/
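
If you prefer a quick check in code, a rough sketch like the following prints the text of every run, so you can see whether a «FieldName» placeholder is split over several runs. It assumes the edited file is lena1.html from the demo project; the class name is only for illustration.

```csharp
using System;
using Aspose.Words;

public static class RunDump
{
    public static void Main()
    {
        // Assumption: lena1.html is the edited HTML file from the demo project.
        Document doc = new Document("lena1.html");

        // Walk all Run nodes in the document (true = include nested nodes)
        // and print each run's text in brackets to make the boundaries visible.
        foreach (Run run in doc.GetChildNodes(NodeType.Run, true))
            Console.WriteLine("[" + run.Text + "]");
    }
}
```

If the output shows something like [«], [doc_name] and [»] on separate lines, the placeholder spans three runs, and code that handles only the first run will not replace it correctly.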

Best regards.

Hi,

I am a little confused. At which point in the code do I have to make modifications?

Inside the ReplaceEvaluator_106338 function?

Could you help me? I am not familiar with Aspose.Words programming.

Could you provide a solution for my issue?

Regards,

Eleni

Hi

Thanks for your request. Please try using the following code:

// Open the source document.
Document doc = new Document(@"Test001\lena.doc");
// Convert document to HTML.
doc.Save(@"Test001\out.html");
// Open HTML document, and replace placeholders with mergefields.
Document doc1 = new Document(@"Test001\out.html");
doc1.Range.Replace(new Regex("«(?<FieldName>.*?)»"), new ReplaceEvaluator(InsertOptionMergeFieldEvent), false);
// Execute mail merge
doc1.MailMerge.Execute(doc1.MailMerge.GetFieldNames(), doc1.MailMerge.GetFieldNames());
// Save output document
doc1.Save(@"Test001\out.doc");

private static ReplaceAction InsertOptionMergeFieldEvent(object sender, ReplaceEvaluatorArgs e)
{
    // This is a Run node that contains either the beginning or the complete match.
    Node currentNode = e.MatchNode;
    // The first (and may be the only) run can contain text before the match, 
    // in this case it is necessary to split the run.
    if (e.MatchOffset > 0)
        currentNode = SplitRun((Run)currentNode, e.MatchOffset);
    // This array is used to store all nodes of the match for further removing.
    ArrayList runs = new ArrayList();
    // Find all runs that contain parts of the match string.
    int remainingLength = e.Match.Value.Length;
    while (
        (remainingLength > 0) &&
        (currentNode != null) &&
        (currentNode.GetText().Length <= remainingLength))
    {
        runs.Add(currentNode);
        remainingLength = remainingLength - currentNode.GetText().Length;
        // Select the next Run node. 
        // Have to loop because there could be other nodes such as BookmarkStart etc.
        do
        {
            currentNode = currentNode.NextSibling;
        }
        while ((currentNode != null) && (currentNode.NodeType != NodeType.Run));
    }
    // Split the last run that contains the match if there is any text left.
    if ((currentNode != null) && (remainingLength > 0))
    {
        SplitRun((Run)currentNode, remainingLength);
        runs.Add(currentNode);
    }
    // Create a DocumentBuilder and insert the merge field.
    DocumentBuilder builder = new DocumentBuilder(e.MatchNode.Document as Document);
    builder.MoveTo((Run)runs[runs.Count - 1]);
    string fieldName = e.Match.Groups["FieldName"].Value;
    builder.InsertField(string.Format("MERGEFIELD {0}", fieldName), string.Format("«{0}»", fieldName));
    // Now remove all runs in the sequence.
    foreach (Run run in runs)
        run.Remove();
    // Signal to the replace engine to do nothing, because we have already done everything we wanted.
    return ReplaceAction.Skip;
}

/// <summary>
/// Splits text of the specified run into two runs.
/// Inserts the new run just after the specified run.
/// </summary>
private static Run SplitRun(Run run, int position)
{
    Run afterRun = (Run)run.Clone(true);
    afterRun.Text = run.Text.Substring(position);
    run.Text = run.Text.Substring(0, position);
    run.ParentNode.InsertAfter(afterRun, run);
    return afterRun;
}

Hope this helps.

Best regards.

hi,

Thanks for the help. Another thing now.

Inside my first file (lena.doc) I have a merge field (doc_name_bar) with a specific barcode font named idautomation128. When I open the html file before I perform the merge, the barcode font is preserved. After I edit and close the html and perform the merge, the font on that merge field is lost.

Is it because we insert the fields programmatically?

How can I preserve the fonts after the merge?

I notice that when I change the fonts on the html merge fields and then do the mail merge, the fonts that I have set are not preserved.

Regards,

Eleni

Hi

Thanks for your request. Please try using

// Search for run, which contains name of mergefield.
Run nodeToMove = (Run)runs[runs.Count - 1];
foreach (Run run in runs)
{
    // Skip runs that contain only "«" or "»", because they are Arial.
    if (run.Text == "«" || run.Text == "»")
        continue;
    nodeToMove = run;
}
builder.MoveTo(nodeToMove);

instead of using

builder.MoveTo((Run)runs[runs.Count - 1]);

in InsertOptionMergeFieldEvent.

Hope this helps.

Best regards.

Hi,

I want to put an html line break inside a merge field, e.g. <br>.

I have an html file that holds the merge fields. I merge the values into the html, but instead of a new line I receive <br>. Look at the attachment.

Can you help me?

Regards,

Eleni

Hi

Thanks for your request. Unfortunately, it is not quite clear to me what the problem is. Could you please attach your sample documents and code that will allow me to reproduce the problem? I will check the issue and provide you with more information.

Best regards,

Thanks, I figured it out.

Regards,

Eleni

Just a quick note, in the most recent versions of Aspose.Words this functionality can be achieved directly by using the new mustache template syntax which allows mail merging from plain text field markers.

Please see the following blog post for further information
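
As a minimal sketch of that approach (assuming a version of Aspose.Words that has the MailMerge.UseNonMergeFields property; the class name, placeholder names, values and output file name below are made up for illustration):

```csharp
using Aspose.Words;

public static class MustacheMergeSketch
{
    public static void Main()
    {
        // Build a small document whose placeholders are plain text, not MERGEFIELDs.
        Document doc = new Document();
        DocumentBuilder builder = new DocumentBuilder(doc);
        builder.Write("Dear {{FirstName}} {{LastName}},");

        // Ask the mail merge engine to treat {{...}} tags as merge fields.
        doc.MailMerge.UseNonMergeFields = true;
        doc.MailMerge.Execute(
            new string[] { "FirstName", "LastName" },
            new object[] { "John", "Doe" });

        // Hypothetical output path; the format is taken from the extension.
        doc.Save("mustache_out.html");
    }
}
```

With this syntax the placeholders survive a round trip through HTML editing as ordinary text, so no regex replacement step should be needed before the merge.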

Thanks,