Hi,
I'm evaluating your product for a potential customer, and the key factor in the purchase decision is whether we can extract clean text from an EMF file.
I'm using the following code snippet:
FileInputStream file = new FileInputStream(fileName);
EmfMetafile efile = new EmfMetafile(file);
MetafileComment[] metaFileComment = efile.getComments();
for (int i = 0; i < metaFileComment.length; i++) {
    // Decode the raw comment bytes, then drop control characters and everything from U+007A up.
    String temp = Charset.defaultCharset().decode(ByteBuffer.wrap(metaFileComment[i].getCommentData())).toString().replaceAll("[\u0000-\u001F]", "").replaceAll("[\u007A-\uFFFF]", "");
    // Skip comments that contain either of these marker sequences.
    if (temp.indexOf("F+@") == -1 && temp.indexOf("*>*>?@,") == -1) {
        // Cut everything up to and including the "*>*>?@" marker
        // (when the marker is missing, this drops the first 5 characters instead).
        temp = temp.substring(temp.indexOf("*>*>?@") + 6);
        System.out.println(" -- metaFileComment " + metaFileComment[i].getRecordIndex() + ": " + temp);
    }
}
With the text stripping above, I was able to remove some of the junk text. However, there is still a lot more that I have no idea how to filter out.
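In case it helps to reproduce the filtering without the library, here is a rough standalone sketch of the same stripping step. The class name CommentTextFilter, the strip helper, and the sample bytes are made up for illustration. It also assumes ISO-8859-1 decoding instead of the platform default charset, and it only cuts at the "*>*>?@" marker when the marker is actually present, so it is not byte-for-byte identical to my loop.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Aspose.Metafiles.
class CommentTextFilter {

    // Drops control characters and everything from U+007A up, then removes
    // everything up to and including the "*>*>?@" marker if it is present.
    static String strip(byte[] commentData) {
        // Assumption: one byte per character (ISO-8859-1), for repeatable results.
        String temp = new String(commentData, StandardCharsets.ISO_8859_1)
                .replaceAll("[\\u0000-\\u001F]", "")
                .replaceAll("[\\u007A-\\uFFFF]", "");
        int marker = temp.indexOf("*>*>?@");
        return marker >= 0 ? temp.substring(marker + 6) : temp;
    }

    public static void main(String[] args) {
        // Made-up sample: a control byte, the marker, a NUL, some text, one high byte.
        byte[] sample = {0x01, '*', '>', '*', '>', '?', '@', 0x00, 'R', 'E', 'V', (byte) 0xC8};
        System.out.println(strip(sample));
    }
}
```

This only removes bytes outside the kept range; the leftover printable characters in front of each string are binary data that happens to fall inside that range, which is exactly the part I don't know how to filter.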
Here's the sample output from running the loop above on two EMF files (please note, I'd like to cleanly extract only the text shown in red below).
[java] Reading file C:\project\Aspose.Total.Java\Aspose.Metafiles\.\{3258E1A7-0A80-40DA-967B-D7232536DC7F}.emf
[java] File content: com.aspose.metafiles.EmfMetafile@341960
[java] -- metaFileComment 3767: 4(!E!s%DDWG NO
[java] -- metaFileComment 3789: 0$7E!=DREV
[java] -- metaFileComment 3798: 4(E'D402711
[java] -- metaFileComment 3949: 4(EyIDTITLE
[java] -- metaFileComment 3958: 0$EWIDDATE
[java] -- metaFileComment 3965: 4(D7NDSTRESS
[java] -- metaFileComment 3972: 8,DxLDCHECKED
[java] -- metaFileComment 3979: 4(DJDDRAWN
[java] -- metaFileComment 3986: <0%D4IDAPPROVALS
[java] -- metaFileComment 3993: @4DPEDCONTRACT NO.
[java] -- metaFileComment 4000: 0$EPDSIZE
[java] -- metaFileComment 4007: <0EvPDCAGE CODE
[java] -- metaFileComment 4014: 8,ESDSCALE
[java] -- metaFileComment 4021: 0$nESD1/4
[java] -- metaFileComment 4028: 8,EwPDDWG. NO.
[java] -- metaFileComment 4035: 0$3EoPDREV
[java] -- metaFileComment 4042: D8.E SDSHEET
[java] -- metaFileComment 4049: 8,oE SD1
[java] -- metaFileComment 4056: 4(4E SDOF
[java] -- metaFileComment 4070: THEEDSARGENT FLETCHER INC.
[java] -- metaFileComment 4079: TH@EGD9400 E. FLAIR DR. -
[java] -- metaFileComment 4088: D8p=EGDEL MONTE, CA
[java] -- metaFileComment 4095: 4(EGD91731
[java] -- metaFileComment 4102: 4(E"KQD72429
[java] -- metaFileComment 4118: 8,ESDCALC.WT.
[java] -- metaFileComment 4127: \PD2EDUNLESS OTHERWISE SPECIFIED
[java] -- metaFileComment 4134: XLDtWFDDIMENSIONS ARE IN INCHES
[java] -- metaFileComment 4141: H<D*GDTOLERANCES ARE:
[java] -- metaFileComment 4148: D8DGDANGLES
[java] -- metaFileComment 4155: 8,pYDGDDECIMALS
[java] -- metaFileComment 4162: PD9DLDDO NOT SCALE DRAWING
[java] -- metaFileComment 4169: @4DSDAPPLICATION
[java] -- metaFileComment 4176: <0DQDNEXT ASSY
[java] -- metaFileComment 4183: 8,jDRrQDUSED ON
[java] -- metaFileComment 4190: 0$DQDPART
[java] -- metaFileComment 4197: 0$DRDDASH
[java] -- metaFileComment 4204: 0$D\SDNO.
[java] -- metaFileComment 4211: <0EMCDPARTS LIST
[java] -- metaFileComment 4220: 8,(DBDQTY REQD
[java] -- metaFileComment 4229: <04PD_BDCAGE CODE
[java] -- metaFileComment 4238: XL-DhxBDPART OR IDENTIFYING NO.
[java] -- metaFileComment 4247: `TEjBDNOMENCLATURE OR DESCRIPTION
[java] -- metaFileComment 4256: THEn_BDMATERIAL SPECIFICATION
[java] -- metaFileComment 4263: 0$ETBDZONE
[java] -- metaFileComment 4270: 0$DgEtADFIND
[java] -- metaFileComment 4277: 0$EBDNO.
[java] -- metaFileComment 4284: THOD5JDDO NOT REVISE MANUALLY
[java] -- metaFileComment 4316: 0$&DHD30'
[java] -- metaFileComment 4323: 0$$TDID.X
[java] -- metaFileComment 4344: 0$D;ID.XX
[java] -- metaFileComment 4358: 0$D;ID.03
[java] -- metaFileComment 4365: 4(TDJD.XXX
[java] -- metaFileComment 4379: 0$DJD.010
[java] -- metaFileComment 4386: D8DQODMANUFACTURING
[java] -- metaFileComment 4393: @4DQDPROJECT ENGR
[java] -- metaFileComment 4400: <0D+jSDCHIEF ENGR
[java] -- metaFileComment 4407: 0$eDGSDDATE
[java] -- metaFileComment 4414: PDjDSDADDITIONAL APPROVALS
[java] -- metaFileComment 4421: 8,DQODDESIGNER
[java] -- metaFileComment 4428: L@pDQDQUALITY ASSURANCE
[java] -- metaFileComment 4435: THcD,LDTHIRD ANGLE PROJECTION
[java] -- metaFileComment 4444: H<Dk6NDDESIGN ACTIVITY
[java] -- metaFileComment 4453: TH=EKDSHELL, CENTER SECTION
[java] -- metaFileComment 4462: 4(QEgPD402711
[java] -- metaFileComment 4469: <0ERqAREVISIONS
[java] -- metaFileComment 4478: @4E0JADESCRIPTION
[java] -- metaFileComment 4487: 0$\EADATE
[java] -- metaFileComment 4494: 8,sE0AAPPROVED
[java] -- metaFileComment 4501: 0$sDFAZONE
[java] -- metaFileComment 4508: 0$hEnAREV
[java] -- metaFileComment 4515: 4(3E+AVDCATIA
[java] -- metaFileComment 4524: 4(VDRDDWG NO
[java] -- metaFileComment 4540: 0$-DRDREV
[java] -- metaFileComment 4563: 4(%DuRD402711
[java] -- metaFileComment 4570: @@ AAICA
[java] -- metaFileComment 4610: ?A\ATHIS REPRODUCTION IS A PROPRIETARY DESIGN AND IS A CONFIDENTIAL
[java] -- metaFileComment 4619: @A<ADISCLOSURE BY SARGENT FLETCHER INC., EL MONTE, CALIFORNIA. IT IS
[java] -- metaFileComment 4626: t+AALOANED SUBJECT TO THE CONDITIONS THAT IT:
[java] -- metaFileComment 4633: PD4CA1) SHALL BE USED FOR
[java] -- metaFileComment 4640: h\ABRECORD AND REFERENCE PURPOSE;
[java] -- metaFileComment 4647: l`"uKBB2) SHALL NOT BE USED NOR CAUSED TO
[java] -- metaFileComment 4654: BA+BBE USED FOR PROCUREMENT OR IN ANY OTHER WAY PREJUDICIAL TO SARGENT
[java] -- metaFileComment 4661: H<AnABFLETCHER INC.;
[java] -- metaFileComment 4668: /JBnAB3)SHALL NOT BE REPRODUCED OR COPIED IN WHOLE OR
[java] -- metaFileComment 4675: <0AnWBIN PART;
[java] -- metaFileComment 4682: 5,=BnWB4) SHALL NOT BE USED TO PRODUCE OR MANUFACTURE ITEMS
[java] -- metaFileComment 4689: AA^mBEXCEPT WITH THE EXPRESS WRITTEN CONSENT OF SARGENT FLETCHER INC.;
[java] -- metaFileComment 4696: =AB5) SHALL NOT BE RELEASED TO A THIRD PARTY WITHOUT THE EXPRESS
[java] -- metaFileComment 4703: BA'BWRITTEN CONSENT OF SARGENT FLETCHER INC.; AND 6) SHALL BE RETURNED
[java] -- metaFileComment 4710: @4ABUPON DEMAND.
[java] -- metaFileComment 4717: @@ D[:DE[:D
[java] -- metaFileComment 4866: 8,D;:DER 4043
[java] -- metaFileComment 4873: @4hD;:DWELDING ROD
[java] -- metaFileComment 4880: H<D;:DANSI/AWS A 5.10
[java] -- metaFileComment 4887: 0$E;:D1G15
[java] -- metaFileComment 4915: 0$hD:@<DSKIN
[java] -- metaFileComment 4922: <0D:@<DSEE NOTE 3
[java] -- metaFileComment 4929: 0$E:@<D1B5
[java] -- metaFileComment 4957: 0$hD9=DSKIN
[java] -- metaFileComment 4964: <0D9=DSEE NOTE 3
[java] -- metaFileComment 4971: 0$E9=D1B7
[java] -- metaFileComment 4999: 0$hD7F?DSKIN
[java] -- metaFileComment 5006: <0D7F?DSEE NOTE 3
[java] -- metaFileComment 5013: 0$E7F?D1B9
[java] -- metaFileComment 5027: 4(15ETQD-
[java] -- metaFileComment 5036: 8,XABNOTES:
[java] -- metaFileComment 5045: \PNBBBUNLESS OTHERWISE SPECIFIED
[java] -- metaFileComment 5052: 4(XAB1.
[java] -- metaFileComment 5059: pd#BBINTERPRET DRAWING PER ASME Y14.100.
[java] -- metaFileComment 5066: 4(XAM>B2.
[java] -- metaFileComment 5073: 3BM>BDIMENSIONING AND TOLERANCING PER ASME Y14.5M -1994.
[java] -- metaFileComment 5080: 4(XAOC3
[java] -- metaFileComment 5087: 4BOCMATERIAL: .090 AL ALY SH 6061-0 PER AMS-QQ-A-250/11.
[java] -- metaFileComment 5094: 4(XAjC4.
[java] -- metaFileComment 5101: 8BjCAFTER FORMING FIND NO 1, 2, & 3, SOLUTION HEAT TREAT AND
[java] -- metaFileComment 5108: 1'&BCARTIFICIALLY AGE TO T42 CONDITION PER MIL-H-6088.
[java] -- metaFileComment 5115: 4(XAZ'C5
[java] -- metaFileComment 5122: 8BZ'CFUSION WELD PER AWS D17.1:2001 CLASS B USING FIND NO. 4.
[java] -- metaFileComment 5129: 4(XA95C6.
[java] -- metaFileComment 5136: dXB95CWELDING SYMBOLS PER AWS A2.4.
[java] -- metaFileComment 5143: 4(XABC7
[java] -- metaFileComment 5150: =BBCRUBBER STAMP OR STENCIL WITH 72429/402711 AND APPLICABLE DASH
[java] -- metaFileComment 5157: <'&BICNO PER MIL-STD-130, CHARACTER SIZE OPTIONAL. USE CONTRASTING
[java] -- metaFileComment 5164: 0'&B:PCCOLOR INK PER A-A-56032. LOCATE APPROX AS SHOWN.
[java] -- metaFileComment 5189: 4(DxOD402771
[java] -- metaFileComment 5196: 0$4DxODF-2
[java] -- metaFileComment 5203: 8,H.DJDJ. GRAF
[java] -- metaFileComment 5210: 8,\EJD03-07-12
[java] -- metaFileComment 5219: L@%EjAPRODUCTION RELEASE
[java] -- metaFileComment 5235: @@ K[CeAKCeA
[java] -- metaFileComment 5275: 4K[CRfATHIS DRAWING IS SUBJECT TO THE INTERNATIONAL TRAFFIC
[java] -- metaFileComment 5284: dXK[CiBIN ARMS REGULATIONS (ITAR).
[java] -- metaFileComment 5291: THXgCiBIT MAY NOT BE EXPORTED
[java] -- metaFileComment 5298: 2K[Cu,BFROM THE UNITED STATES OR TRANSFERRED TO A FOREIGN
[java] -- metaFileComment 5305: 0K[CGBPERSON WITHOUT THE PRIOR WRITTEN APPROVAL OF THE
[java] -- metaFileComment 5312: \PK[C)8cBU.S. DEPARTMENT OF STATE.
[java] -- metaFileComment 5319: @
[java] Reading file C:\project\Aspose.Total.Java\Aspose.Metafiles\.\{8C9BB600-6126-4050-BBAE-1CFE6B8669C6}.emf
[java] File content: com.aspose.metafiles.EmfMetafile@1d2fc36
[java] -- metaFileComment 190: @ :ac/@ C-@ :aBcC@:[BAbC@@@@
[java] -- metaFileComment 200: @@ :YB^C:YBiC
[java] -- metaFileComment 334: @ :ac/@ C-@ :aBcC@:[BAbC@@@@
[java] -- metaFileComment 344: 0$SABNOTE
[java] -- metaFileComment 352: THSAI[B1. END SHAPE OF TUBE:
[java] -- metaFileComment 359: l`"ABTOLERANCE SHALL BE APPLIED TO AREA
[java] -- metaFileComment 366: dXABWITHIN 20mm FROM CANISTER END
[java] -- metaFileComment 373: l`"AUtBWITHIN 17mm FROM TWO WAY VALVE END
[java] -- metaFileComment 380: L@SA!C2. PRINT ON TUBE:
[java] -- metaFileComment 387: t+A'CMANUFACTURER'S TRADEMARK, NOMINAL DIA., AND
[java] -- metaFileComment 394: PDAg=,CMANUFACTURED DATE
[java] -- metaFileComment 401: h\Bg=,COR LOT NUMBER (ABBREVIATION) TO
[java] -- metaFileComment 408: THA>l1CBE PRINTED REPEATEDLY.
[java] -- metaFileComment 415: XLSA6C3. MARKING OF TUBE END:
[java] -- metaFileComment 422: /A;CMARKING POSITION AND ANGLE DEPENDS ON 3D MODEL.
[java] -- metaFileComment 429: pd$A@CSHAPE AND COLOR TO BE SHOWN AS NOTE.
[java] -- metaFileComment 436: 1SA@eC4. BEND RADII SHALL BE R15 ALONG TUBE CENTERLINE.
[java] -- metaFileComment 443: `TSAojC5. ASSEMBLY CONDITION:CLAMP
[java] -- metaFileComment 450: 3AToCASSEMBLY LOCATION BEFORE DELIVERY TO BE COORDINATED
[java] -- metaFileComment 457: `TA+tCBETWEEN SUPPLIER AND PLANT.
[java] -- metaFileComment 464: 3AyCASSEMBLY LOCATION IN THIS DWG. SHOWS ACTUAL VEHICLE
[java] -- metaFileComment 471: PDA*CASSEMBLY CONDITION.
[java] -- metaFileComment 478: 0SA,C6. PROTECTION FOR TUBE ENDS AND DUST PROOF TO BE
[java] -- metaFileComment 485: xl(ADCCOORDINATED BETWEEEN SUPPLIER AND PLANT.
[java] -- metaFileComment 492: @@ gBZCBZC
[java] -- metaFileComment 502: H<gB0OCMARKING (WHITE)
[java] -- metaFileComment 525: @4DBTC3 OR EQUIV)
[java] -- metaFileComment 534: @@ ifBbCgBZC
[java] -- metaFileComment 644: 0$!5,Bl2C5.5
[java] -- metaFileComment 664: @@, @0$BOvCBO6CB. *CBJC
[java] -- metaFileComment 744: 0$B!C10.5
[java] -- metaFileComment 765: @@, @0$:YBGUC:]BGUCg*AGUCBGUC
[java] -- metaFileComment 854: 4(g*ACPCAPPRX.
[java] -- metaFileComment 861: @
Thanks.
Brandon