I’m receiving the following error when trying to open a Word document - this does not happen all of the time:
java.nio.charset.UnsupportedCharsetException: UTF-7
When I open the document in Word itself it works fine. What I’ve noticed is that if I perform a Save As from Word, it offers Single File Web Page as the default type. If I switch the export type to Word document and save it, I can process the document just fine using Aspose.
I’m using Aspose in an integration with Documentum, so when I receive the document I get it as a byte stream. Here is the code:
ByteArrayInputStream oBAInput = dfObject.getContent();
Document doc = new Document(oBAInput);
The above code fails while constructing the doc object. What can I do to handle these Word documents? Before upgrading to the latest version (2.6.1) of Aspose I would receive an unknown format error.
Thanks
This document is a Single File Web Page, also known as a Web Archive file. If you are seeing this message, your browser or editor doesn’t support Web Archive files. Please download a browser that supports Web Archive, such as Microsoft Internet Explorer.
Hi,
Your doc has a wrong extension. It is really in MHTML format (and should be .mhtml instead of .doc). Aspose.Words for Java doesn’t support MHTML yet – so it throws.
But I don’t understand the ‘UnsupportedCharsetException: UTF-7’. Which version of Aspose.Words gives you this error?
One more thing: it would be better to download the fixed v2.6.1. The original build was broken for some unclear reason (the approaching Friday the 13th?). You can get it from the old address: https://releases.aspose.com/words/net .
Regards,
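Since the root cause here is a file whose extension does not match its real format, a quick check of the first bytes can catch such files before they reach the Document constructor. The following is only a minimal sketch using the standard library: the class name FormatSniffer and its methods are hypothetical, it is not how Aspose.Words detects formats, and it assumes that MHTML saved by Word begins with a "MIME-Version:" header and that binary .doc files use the OLE2 compound file signature.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper, not part of Aspose.Words.
final class FormatSniffer {
    // Signature of an OLE2 compound file, the container used by binary .doc files.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Classifies the leading bytes as "DOC", "ZIP" (e.g. DOCX), "MHTML", "HTML" or "UNKNOWN".
    static String sniff(byte[] head) {
        if (startsWith(head, OLE2_MAGIC)) return "DOC";
        if (head.length >= 2 && head[0] == 'P' && head[1] == 'K') return "ZIP";
        // Remaining checks are textual; a byte order mark is not handled in this sketch.
        String text = new String(head, StandardCharsets.ISO_8859_1)
                .trim().toLowerCase(Locale.ROOT);
        if (text.startsWith("mime-version:")) return "MHTML";
        if (text.startsWith("<html") || text.startsWith("<!doctype html")) return "HTML";
        return "UNKNOWN";
    }

    // Reads up to 'limit' bytes and rewinds, so the stream can still be handed on afterwards.
    static byte[] peek(InputStream in, int limit) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(limit);
        byte[] buf = new byte[limit];
        int total = 0;
        int n;
        while (total < limit && (n = in.read(buf, total, limit - total)) > 0) {
            total += n;
        }
        in.reset();
        return Arrays.copyOf(buf, total);
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = "MIME-Version: 1.0\r\n".getBytes(StandardCharsets.US_ASCII);
        InputStream in = new ByteArrayInputStream(sample);
        System.out.println(sniff(peek(in, 64)));
    }
}
```

With a ByteArrayInputStream such as the one returned by dfObject.getContent() in the original post, peek leaves the stream positioned at the start, so documents reported as "MHTML" could be routed elsewhere (for example, re-saved from Word as a real .doc) while the rest are passed on unchanged.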
I receive the same error “java.nio.charset.UnsupportedCharsetException: UTF-7” when attempting to open an .htm or .html file. This only occurs when using Aspose.Words for Java 2.7.0. When I use Aspose.Words for Java 2.6.1 this error does not occur. The file is actually an HTML file and not MHTML.
Document doc = new Document("C:\\Dev\\testing.html");
With the previous Aspose release my export (HTML to DOC) was working fine.
Now all the documents are reporting java.nio.charset.UnsupportedCharsetException: UTF-7
The same happens with DOCX.
Here is the stack trace.
java.lang.Exception: java.nio.charset.UnsupportedCharsetException: UTF-7
at asposewobfuscated.lz.tk(Unknown Source)
at asposewobfuscated.lz.tg(Unknown Source)
at asposewobfuscated.lz.R(Unknown Source)
at com.aspose.words.FileFormatUtil.i(Unknown Source)
at com.aspose.words.FileFormatUtil.a(Unknown Source)
at com.aspose.words.Document.a(Unknown Source)
at com.aspose.words.Document.<init>(Unknown Source)
at com.aspose.words.Document.<init>(Unknown Source)
at com.hylighter.document.DocxExportHandler.export(DocxExportHandl
Thank you for the additional information. I cannot reproduce the problem on my side. Did you modify Aspose.Words.jar in any way, for example by obfuscating it?
Please tell me how you use Aspose.Words in your application.
Best regards.
I just moved the jars to my application lib folder. Nothing else.
We have a web application in which we import documents (DOC -> HTML).
We do some processing and customization,
and export back to DOC or DOCX.
We run a Tidy clean-up (before or after) wherever HTML is involved in the conversion.
Finally I got it working with the other Document constructor.
Document(file, loadformat, "");
I still don’t know why the same file that worked in my standalone application failed in my web application.
Could it be something with the automatic format detection?
Hi,
You are right, your deobfuscated stack trace points to the automatic format detection code. But:
This code branch is used by every application that relies on automatic format detection, i.e. the new Document(filename) overload. Probably more than half of our users use this overload, but you are the only one who has reported such an issue.
Your code reports that it can’t find the UTF-7 charset. Since core Java does not support the UTF-7 charset, we include it in the Aspose.Words jar. The code itself is obfuscated within the jar and is referenced by the META-INF\services\java.nio.charset.spi.CharsetProvider file. Probably you somehow changed the jar: obfuscated it or modified the META-INF\services folder, so Java’s CharsetProvider service can’t find our UTF-7. Another option is a conflict of CharsetProviders in your environment (something like JAR hell).
Please let me know: 1) did you modify the Aspose.Words jar; 2) are there other applications in your web environment that could conflict with our UTF-7 CharsetProvider?
Best Regards,
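To help answer the two questions above, the charset lookup can be inspected from inside the affected application. This is only a diagnostic sketch using the standard library: the class name CharsetDiagnostics is hypothetical, and comparing the system class loader with the thread context class loader is an assumption about where a web container and a standalone run might differ, not something confirmed in this thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.spi.CharsetProvider;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical diagnostic helper, not part of Aspose.Words.
final class CharsetDiagnostics {

    // Lists the CharsetProvider implementations registered under
    // META-INF/services that are visible to the given class loader.
    static List<String> providerNames(ClassLoader loader) {
        List<String> names = new ArrayList<String>();
        for (CharsetProvider provider : ServiceLoader.load(CharsetProvider.class, loader)) {
            names.add(provider.getClass().getName());
        }
        return names;
    }

    // Builds a short report: whether UTF-7 resolves at all, and which
    // providers each of the two class loaders can see.
    static String report() {
        StringBuilder sb = new StringBuilder();
        sb.append("UTF-7 supported: ")
          .append(Charset.isSupported("UTF-7")).append('\n');
        sb.append("Providers via system class loader: ")
          .append(providerNames(ClassLoader.getSystemClassLoader())).append('\n');
        sb.append("Providers via context class loader: ")
          .append(providerNames(Thread.currentThread().getContextClassLoader())).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

Running report() once in the standalone application and once inside the web application (for instance from a servlet, writing the result to the log) shows whether a UTF-7 provider is registered in one environment but not the other, and whether an unexpected provider from another jar appears in the list.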