Reputation: 4819
I need to extract a video file that is embedded in a PDF file. I can find the video, but it is inside an annotation, so I can't save it separately. How do I save this file?
Example: iTextSharp - how to open/read/extract a file attachment?
That question extracts an attachment in the same way I need to extract the video.
Here is my code:
string FileName = AppDomain.CurrentDomain.BaseDirectory + "raven test.pdf";
PdfReader pdfreader = new PdfReader(FileName);
PdfDictionary PageDictionary = pdfreader.GetPageN(1);
PdfArray Annots = PageDictionary.GetAsArray(PdfName.ANNOTS);
if ((Annots == null) || (Annots.Length == 0))
    return;
foreach (PdfObject oAnnot in Annots.ArrayList)
{
    PdfDictionary AnnotationDictionary = (PdfDictionary)PdfReader.GetPdfObject(oAnnot);
    if (AnnotationDictionary.Get(PdfName.SUBTYPE).Equals(PdfName.RICHMEDIA))
    {
        if (AnnotationDictionary.Keys.Contains(PdfName.RICHMEDIACONTENT))
        {
            // Here I can see the embedded video, but it is inside the annotation. How do I save this file?
            PdfDictionary oRICHContent = AnnotationDictionary.GetAsDict(PdfName.RICHMEDIACONTENT);
        }
    }
}
Upvotes: 0
Views: 1754
Reputation: 55467
For this one you'll want to consult the official spec, the Adobe Supplement to ISO 32000, BaseVersion 1.7, ExtensionLevel 3. Below is the basic code, although you'll probably want to add more null checks. See the comments for details. Note that not all embedded movies use the RichMedia format; some are just special attachments, so this won't get them all.
//FileName is the path to the PDF, as in the question
PdfReader pdfreader = new PdfReader(FileName);
PdfDictionary PageDictionary = pdfreader.GetPageN(1);
PdfArray Annots = PageDictionary.GetAsArray(PdfName.ANNOTS);
if ((Annots == null) || (Annots.ArrayList.Count == 0))
    return;
foreach (PdfObject oAnnot in Annots.ArrayList) {
    PdfDictionary AnnotationDictionary = (PdfDictionary)PdfReader.GetPdfObject(oAnnot);
    //See if the annotation is a rich media annotation
    if (AnnotationDictionary.Get(PdfName.SUBTYPE).Equals(PdfName.RICHMEDIA)) {
        //See if it has content
        if (AnnotationDictionary.Contains(PdfName.RICHMEDIACONTENT)) {
            //Get the content dictionary
            PdfDictionary RMC = AnnotationDictionary.GetAsDict(PdfName.RICHMEDIACONTENT);
            if (RMC.Contains(PdfName.ASSETS)) {
                //Get the asset sub dictionary if it exists
                PdfDictionary Assets = RMC.GetAsDict(PdfName.ASSETS);
                //Get the names sub array
                PdfArray names = Assets.GetAsArray(PdfName.NAMES);
                //Make sure it exists and has values
                if (names != null && names.ArrayList.Count > 0) {
                    //A single piece of content can have multiple assets. The array returned is in the form {name, IR, name, IR, name, IR...}
                    //so step through it two entries at a time: index i is the name, index i + 1 is the reference
                    for (int i = 0; i + 1 < names.ArrayList.Count; i += 2) {
                        //Get the IndirectReference for the current asset
                        PdfIndirectReference ir = (PdfIndirectReference)names.ArrayList[i + 1];
                        //Get the true object (the file specification) from the main PDF
                        PdfDictionary obj = (PdfDictionary)PdfReader.GetPdfObject(ir);
                        //Get the sub Embedded File object
                        PdfDictionary ef = obj.GetAsDict(PdfName.EF);
                        //Get the reference to the embedded file stream
                        PdfIndirectReference fir = (PdfIndirectReference)ef.Get(PdfName.F);
                        //Get the true file stream of the filespec
                        PRStream objStream = (PRStream)PdfReader.GetPdfObject(fir);
                        //Get the decoded bytes for the given stream
                        byte[] bytes = PdfReader.GetStreamBytes(objStream);
                        //Do something with the bytes here
                    }
                }
            }
        }
    }
}
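For the "do something with the bytes" step, here is a minimal sketch of writing one extracted asset to disk. It uses only System.IO and does not touch iTextSharp. The `AssetSaver` class, the `SaveAsset` method, and the `asset.bin` fallback name are hypothetical names made up for this illustration.

```csharp
using System;
using System.IO;

static class AssetSaver
{
    // Hypothetical helper: writes one extracted asset into outputDir
    // and returns the full path of the file it created.
    public static string SaveAsset(string outputDir, string assetName, byte[] bytes)
    {
        if (bytes == null) throw new ArgumentNullException(nameof(bytes));

        // The asset name comes from inside the PDF, so keep only the
        // file-name part to avoid writing outside outputDir.
        string safeName = Path.GetFileName(assetName ?? string.Empty);
        if (string.IsNullOrEmpty(safeName))
            safeName = "asset.bin"; // arbitrary fallback name (assumption)

        Directory.CreateDirectory(outputDir);
        string fullPath = Path.Combine(outputDir, safeName);
        File.WriteAllBytes(fullPath, bytes);
        return fullPath;
    }
}
```

Inside the loop you could call something like `AssetSaver.SaveAsset(outputDir, assetName, bytes)`, where `assetName` would presumably be taken from the name entry that precedes each reference in the names array, and `outputDir` is a folder of your choosing.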
Upvotes: 1