While converting a simple Word 2003 .doc document containing a hyperlink to a Word 2007 .docx document, I noticed that an extra set of tags is produced. Although these tags have no impact on how the text renders in Word 2007, they get in the way when reading the OOXML. They are not produced when the same document is converted in Word 2007.
The text in the box below is the content of the .doc document.
<TABLE class=MsoNormalTable style="BORDER-RIGHT: medium none; BORDER-TOP: medium none; BORDER-LEFT: medium none; BORDER-BOTTOM: medium none; BORDER-COLLAPSE: collapse; mso-border-alt: solid black .5pt; mso-yfti-tbllook: 1184; mso-padding-alt: 0cm 5.4pt 0cm 5.4pt; mso-border-insideh: .5pt solid black; mso-border-insidev: .5pt solid black" cellSpacing=0 cellPadding=0 border=1>
<TBODY>
<TR style="mso-yfti-irow: 0; mso-yfti-firstrow: yes; mso-yfti-lastrow: yes">
<TD style="BORDER-RIGHT: black 1pt solid; PADDING-RIGHT: 5.4pt; BORDER-TOP: black 1pt solid; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0cm; BORDER-LEFT: black 1pt solid; WIDTH: 478.8pt; PADDING-TOP: 0cm; BORDER-BOTTOM: black 1pt solid; BACKGROUND-COLOR: transparent; mso-border-alt: solid black .5pt" vAlign=top width=638>
This is a hyperlink
<A href="http://www.google.com/">http://www.google.com</A>
The above text is a hyperlink.
</TD>
</TR>
</TBODY>
</TABLE>
OOXML produced for the hyperlink when converting with Word 2007:
<w:p w:rsidR="001D69CA" w:rsidRDefault="001D69CA">
<w:hyperlink r:id="rId4" w:history="1">
<w:r w:rsidRPr="006E47C0">
<w:rPr>
<w:rStyle w:val="Hyperlink" />
</w:rPr>
<w:t>http://www.google.com</w:t>
</w:r>
</w:hyperlink>
</w:p>
OOXML produced for the hyperlink when converting with Aspose.Words (version 5.2.0.0):
<w:p w:rsidR="001D69CA">
<w:r>
<w:fldChar w:fldCharType="begin" />
</w:r>
<w:r>
<w:instrText xml:space="preserve">HYPERLINK "http://www.google.com"</w:instrText>
</w:r>
<w:r>
<w:fldChar w:fldCharType="separate" />
</w:r>
<w:r w:rsidRPr="006E47C0">
<w:rPr>
<w:rStyle w:val="Hyperlink" />
</w:rPr>
<w:t>http://www.google.com</w:t>
</w:r>
<w:r>
<w:fldChar w:fldCharType="end" />
</w:r>
</w:p>
Question: Why is the extra tag “w:instrText” produced along with the “w:rStyle” tag? This is hindering our progress, because our code processes these tags as well and gives unexpected results.
Attached is the document that I used before conversion.
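To make the processing problem concrete, here is a rough sketch of a reader that accepts both forms shown above: the `w:hyperlink` element and the `fldChar`/`instrText` field sequence. It is only an illustration in Python using the standard library, not our actual code. The function name `hyperlinks_in_paragraph` is made up for this post, and the sketch assumes fields are not nested and that a whole field sits inside one paragraph.

```python
import re
import xml.etree.ElementTree as ET

# Standard WordprocessingML and relationship namespaces.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"


def w(tag):
    """Return the namespace-qualified name of a w: element or attribute."""
    return "{%s}%s" % (W, tag)


def hyperlinks_in_paragraph(p):
    """Return (target, display_text) pairs for one w:p element.

    For a w:hyperlink element the target is the relationship id (for
    example "rId4"); for a HYPERLINK field it is the URL taken from the
    field instruction. Hypothetical helper; nested fields are not handled.
    """
    results = []
    state = None          # None, "instr" (after begin) or "result" (after separate)
    instr, text = [], []
    for child in p:
        if child.tag == w("hyperlink"):
            rid = child.get("{%s}id" % R)
            display = "".join(t.text or "" for t in child.iter(w("t")))
            results.append((rid, display))
        elif child.tag == w("r"):
            fld = child.find(w("fldChar"))
            if fld is not None:
                kind = fld.get(w("fldCharType"))
                if kind == "begin":
                    state, instr, text = "instr", [], []
                elif kind == "separate":
                    state = "result"
                elif kind == "end":
                    m = re.match(r'\s*HYPERLINK\s+"([^"]*)"', "".join(instr))
                    if m:
                        results.append((m.group(1), "".join(text)))
                    state = None
                continue
            if state == "instr":
                # Field instruction runs: collect the instruction text only.
                instr.extend(t.text or "" for t in child.iter(w("instrText")))
            elif state == "result":
                # Field result runs: this is the visible hyperlink text.
                text.extend(t.text or "" for t in child.iter(w("t")))
    return results
```

With something along these lines, the `w:instrText` run is consumed as part of the field rather than being treated as ordinary text, and the run carrying `w:rStyle` is read as the display text in both cases.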