PDFExtractor - Other data about extracted attachments?
Last post 03-24-2009, 11:30 PM by forever. 14 replies.
03-03-2009, 3:26 PM
Hi,
I'm currently using PDFExtractor to extract data and attachments from a dynamic PDF form. There are several attachments, each mapped to a field / question on the form by an attachmentName (not the filename), i.e. via importDataObject(attachmentName).
I am unable to map the attachments to the fields after extraction because:
- It seems that PDFExtractor currently only exposes the file data (PDFExtractor.GetAttachment()) and the file names (PDFExtractor.GetAttachNames()), which doesn't help because several file names may be the same.
- There doesn't seem to be any order to the extraction of the attachments
Are there ways to get any other info about the attachments? And if there currently aren't any, do you plan on making this possible in future releases? Around when?
I'd greatly appreciate anyone's help / any other suggestions..
03-03-2009, 11:22 PM
Re: PDFExtractor - Other data about extracted attachments?
Hi,
Thank you very much for considering Aspose.
I'm not sure that I clearly understand your requirement. Could you please share the PDF file you're working with and elaborate a little on what kind of information and data you're trying to extract from it? That way I'll understand the requirement clearly and will be in a better position to help you.
We appreciate your patience.
Regards,
Shahzad Latif, Support Developer/Developer Evangelist, Aspose Sialkot Team
03-03-2009, 11:54 PM
zyraen809
Joined on 11-18-2008
Posts 23
Re: PDFExtractor - Other data about extracted attachments?
Hello,
Sorry, I'm not able to give out the PDF...
Basically, it's a dynamic PDF application form.
People who fill out the form answer several questions, some of which require that a file be attached. (We use importDataObject(attachmentName) to grab the files from them; we don't require them to use the built-in paperclip button for attaching files.) They then save the form to their computer and upload it to our server.
After we receive the PDF file, we would like to extract from it the XML data, which includes the answers to each question, and the attachments.
Aspose.PDF.Kit enables us to do this. The only problem is the extraction of attachments: we need to be able to match an attachment to its related question. I can't find a way to do this because the only information PDFExtractor seems to give is:
- byte streams of the attachments
- filenames of the attachments
^ This is insufficient for mapping the attachments to their questions because filenames of several attachments may be the same.
Because our PDF form has got hidden fields that are used to group data about attachments (e.g. attachmentName (used to map to question number), fileName, size), the workaround currently in place is to match attachments to the questions by using that xml data to match filenames and size. This would work for now, but I was hoping PDFExtractor would support the extraction of other info about the attachments, such as the attachmentName (passed in when using importDataObject(attachmentName))
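A minimal sketch of the filename-and-size workaround described above, written in JavaScript only because that is the form's own scripting language. The function name matchAttachments and the record shapes ({ fileName, size } for extracted files, { attachmentName, fileName, size } for the hidden-field data) are hypothetical and not part of Aspose.Pdf.Kit; the real server-side code would build these lists from whatever the extraction returns.

```javascript
// Hypothetical sketch: pair extracted attachments with the form's hidden-field
// records by file name and size. None of these names come from Aspose.Pdf.Kit.
function matchAttachments(extracted, records) {
  // extracted: [{ fileName, size }] in whatever order extraction produced
  // records:   [{ attachmentName, fileName, size }] read from the form's XML data
  var remaining = records.slice(); // copy, so the caller's array is untouched
  var matched = [];
  var unmatched = [];
  for (var i = 0; i < extracted.length; i++) {
    var file = extracted[i];
    var hit = -1;
    for (var j = 0; j < remaining.length; j++) {
      if (remaining[j].fileName === file.fileName && remaining[j].size === file.size) {
        hit = j;
        break;
      }
    }
    if (hit === -1) {
      unmatched.push(file);
    } else {
      // Consume the record so a second file with the same key
      // is paired with the next record rather than the same one.
      matched.push({ attachmentName: remaining[hit].attachmentName, file: file });
      remaining.splice(hit, 1);
    }
  }
  return { matched: matched, unmatched: unmatched };
}
```

Two different files that share both name and size cannot be told apart by this key, which is why access to the attachment name itself would be the more reliable solution.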
Please let me know what bits are still unclear~ And thanks for your help!
03-04-2009, 12:10 AM
Re: PDFExtractor - Other data about extracted attachments?
Hi,
Thank you very much for sharing detailed information.
I'll update you regarding your requirement as soon as possible.
Regards,
Shahzad Latif
03-04-2009, 1:47 AM
Re: PDFExtractor - Other data about extracted attachments?
Hi,
The PdfExtractor class provides a variety of methods to extract attachments. You can use the ExtractAttachment method to extract all the attachments, or the ExtractAttachment(string attachmentFileName) overload to extract an individual attachment. You can also get a list of all the attachment names using the GetAttachNames method, and the GetAttachment method gives you an array of memory streams for all the attachments in the PDF file. You can also find a detailed example on using the GetAttachNames and GetAttachment methods to get a list of the names and an array of the attachment streams.
One more thing: if you use the following overload of the CreateFileAttachment method, you can also name each attachment uniquely -
I hope this helps you achieve your requirement. If you still find any issue or have more questions, please feel free to ask.
Regards,
Shahzad Latif
03-04-2009, 2:30 PM
zyraen809
Re: PDFExtractor - Other data about extracted attachments?
Hi,
Thanks for that~ However, GetAttachment and GetAttachNames are what we currently use - GetAttachNames returns just the "attachment filenames", whereas I was hoping we could get the "attachment names".
i.e. GetAttachNames returns Data.path for each of the attachments, whereas I'm looking for Data.name (please refer to JavaScript Objects, Properties, and Methods Available in Adobe Reader - Data Object on page 22 of the Developing for Adobe Reader document)
Cheers~!
03-04-2009, 10:28 PM
Re: PDFExtractor - Other data about extracted attachments?
Hi,
I'm looking into this and will reply as soon as possible.
We're sorry for the inconvenience.
Regards,
Shahzad Latif
03-04-2009, 11:24 PM
Re: PDFExtractor - Other data about extracted attachments?
Hi,
I'm using the following code to create attachments. In this code I explicitly specify the attachment name, so after creating the attachments, GetAttachNames gives me the attachment names, not the attachment file names. Please have a look at the code snippet below:
PdfContentEditor editor = new PdfContentEditor();
string KitTestPath = common.Path("Aspose.Pdf.Kit.pdf");
string KitTestOut = common.Path("Aspose.Pdf.Kit_withFileAttached.pdf");
string attachfilepath = common.Path("1.pdf");
System.IO.FileStream attachstream = new System.IO.FileStream(attachfilepath, System.IO.FileMode.Open, System.IO.FileAccess.Read);
System.Drawing.Rectangle rect = new System.Drawing.Rectangle(50, 50, 100, 100);
editor.BindPdf(KitTestPath);
// "firstattachment" and "secondattachment" are the explicitly specified attachment names
editor.CreateFileAttachment(rect, "Test Attachment", attachstream, "firstattachment", 1, "Graph");
rect = new System.Drawing.Rectangle(50, 150, 100, 100);
editor.CreateFileAttachment(rect, "Test Attachment", attachstream, "secondattachment", 1, "Graph");
editor.Save(KitTestOut);
Try using this, and if you still find any issue, please do let us know.
Regards,
Shahzad Latif
03-09-2009, 11:14 PM
zyraen809
Re: PDFExtractor - Other data about extracted attachments?
Hi,
Thanks so much for your help, but unfortunately, I think we've still got different understandings of the requirements.
I won't be able to use CreateFileAttachment because it's the end users who attach the files to the PDF.
The end-users fill out the PDF form, attaching files if necessary. The PDF form has got several questions which require end-users to attach a file. This is implemented using javascript - on click of the button, we call importDataObject(attachmentName) which opens a file dialog allowing the user to attach a file. They then save the form to their computer, then upload to our server.
When the form gets to the server, we need to extract all data and attachments from the form. I think my main problem is that I cannot get the information I require from the attachments. According to the Adobe documentation, the information available about an attachment (Data) object is:
- creationDate
- modDate
- MIMEType
- name (I need this to map the attachments to their corresponding questions)
- path (i.e. filename)
- size
Whereas the Aspose.PDF.Kit functions I know of can give me only the contents and Data.size (GetAttachment) and the Data.path (GetAttachNames).
I was wondering whether there's more information about the attachments available somewhere, or whether such information might be made available in future releases of Aspose.Pdf.Kit?
Thanks!
03-10-2009, 1:33 AM
Re: PDFExtractor - Other data about extracted attachments?
Hi,
Thank you very much for sharing the details. I'll investigate the issue further at my end and will update you as soon as possible.
We really appreciate your patience.
Regards,
Shahzad Latif
03-11-2009, 2:08 AM
Re: PDFExtractor - Other data about extracted attachments?
Hi,
I'm sorry to inform you that, after further investigation, we have found that this feature is currently not supported. It has been logged in our issue tracking system as PDFKITNET-7832. You'll be updated via this forum once this feature is added to the component.
We appreciate your patience.
Regards,
Shahzad Latif
03-11-2009, 5:29 PM
zyraen809
Re: PDFExtractor - Other data about extracted attachments?
03-12-2009, 3:26 AM
seawolf
Joined on 04-01-2006
Posts 180
Re: PDFExtractor - Other data about extracted attachments?
Hi,
We are sorry, but we are unsure about the concept of a name in the attachment information. If it is convenient for you, please post a sample document for our investigation; we would be grateful.
Allen Wen, Developer, Aspose Changsha Team
03-12-2009, 5:55 PM
zyraen809
Re: PDFExtractor - Other data about extracted attachments?
Attachment: Present (inaccessible)
Hello,
I've attached a form that was created using Adobe LiveCycle Designer.
The code behind the Attach button is as follows:
var pdf = event.target;
pdf.importDataObject("AnAttachment");
var attachmentObject = pdf.getDataObject("AnAttachment");
DataObject.path.rawValue = attachmentObject.path; // this just sets the text field to display the file name
Using Adobe Acrobat Pro (currently I'm using v8.1.3) to open the form, you'd be able to:
1) Click Attach button to attach a random file
2) Hit CTRL+J (open Javascript Debugger)
3) Use the Javascript Debugger to evaluate this.dataObjects[0].path
4) Use the Javascript Debugger to evaluate this.dataObjects[0].name
If it works the way I think it should, this.dataObjects[0].path would be the name of the file you attached (i.e. what is returned using GetAttachNames()), and this.dataObjects[0].name would return "AnAttachment", which is what would be good to get during the extraction.
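The same check can be run for every attachment at once instead of one index at a time. This is a minimal sketch, assuming it is pasted into the Acrobat JavaScript console with the form open; describeDataObjects is a hypothetical helper name, while dataObjects, name, path and size are the Doc and Data properties already mentioned in this thread.

```javascript
// Hypothetical helper: list the name, path and size of each attachment.
// `doc` is expected to be an Acrobat Doc object (for example `this` in the console).
function describeDataObjects(doc) {
  var objs = doc.dataObjects; // treated as "no attachments" when null or missing
  var lines = [];
  if (!objs) {
    return lines;
  }
  for (var i = 0; i < objs.length; i++) {
    lines.push(objs[i].name + " -> " + objs[i].path + " (" + objs[i].size + " bytes)");
  }
  return lines;
}

// In the Acrobat console, something like:
// console.println(describeDataObjects(this).join("\n"));
```

Each line pairs the name passed to importDataObject with the file name, which is the mapping that would be useful to get back during extraction.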
Please let me know if you need more info,
Cheers~
03-24-2009, 11:30 PM
Re: PDFExtractor - Other data about extracted attachments?