Reputation: 6403
I managed to run the shell script below from AppleScript, and the output was generated correctly.
Shell script:
do shell script "mdls 'UAT.pdf' | grep 'kMDItemPageWidth\\|kMDItemPageHeight' | awk '/ = / {for (i = 3; i <= NF; i++) printf \"%s \", $i/72; printf \"\\n\";}'"
Basically, the script reads the PDF document's properties.
How do I run this in AutoIt? Any reference or advice is highly appreciated.
Upvotes: 0
Views: 538
Reputation: 3429
Now we're getting somewhere! @chinna_82, this is what I advise, meaning this is the way I would do it. (I have ported a couple of scripts to AutoIt and built an installer with it. It's pretty great.)
1) The least portable part of this is the mdls command, which exists only on OS X. There is no Windows version, but there may be Windows alternatives. More on that in a bit.
2) All grep and awk do here is parse the output that mdls returns. That can be done in a number of ways, including coding it directly in AutoIt. I could look for Windows versions of, or alternatives to, grep and awk (Windows Grep and gawk, for example), but they are doing relatively simple things here, so it's probably not worth the effort. I would just use AutoIt's own language for that (to learn it, see http://www.autoitscript.com/autoit3/docs/).
Anyway, mdls grabs all of the PDF's metadata. The output is text like:
kMDItemAuthors = (
"Flint, Herbert L. (Herbert Longley), b. 1862"
)
kMDItemContentCreationDate = 2013-10-16 16:07:51 -0700
kMDItemContentModificationDate = 2013-10-16 16:07:52 -0700
kMDItemContentType = "com.adobe.pdf"
kMDItemContentTypeTree = (
"com.adobe.pdf",
"public.data",
"public.item",
"public.composite-content",
"public.content"
)
kMDItemCreator = "Digitized by the Internet Archive"
kMDItemDisplayName = "practicalinstruc00flin.pdf"
kMDItemEncodingApplications = (
"Recoded by LuraDocument PDF v2.53"
)
kMDItemFSContentChangeDate = 2013-10-16 16:07:52 -0700
kMDItemFSCreationDate = 2013-10-16 16:07:51 -0700
kMDItemFSCreatorCode = ""
kMDItemFSFinderFlags = 0
kMDItemFSHasCustomIcon = 0
kMDItemFSInvisible = 0
kMDItemFSIsExtensionHidden = 0
kMDItemFSIsStationery = 0
kMDItemFSLabel = 0
kMDItemFSName = "practicalinstruc00flin.pdf"
kMDItemFSNodeCount = 0
kMDItemFSOwnerGroupID = 20
kMDItemFSOwnerUserID = 501
kMDItemFSSize = 12668272
kMDItemFSTypeCode = ""
kMDItemKeywords = (
"http://archive.org/details/practicalinstruc00flin"
)
kMDItemKind = "PDF"
kMDItemLastUsedDate = 2013-10-30 20:00:54 -0700
kMDItemNumberOfPages = 210
kMDItemPageHeight = 532
kMDItemPageWidth = 370
kMDItemSecurityMethod = "None"
kMDItemTitle = "Practical instruction in hypnotism and suggestion"
kMDItemUsedDates = (
"2013-10-16 00:00:00 -0700",
"2013-10-17 00:00:00 -0700",
"2013-10-30 00:00:00 -0700"
)
kMDItemVersion = "1.5"
In the OS X code, grep returns the lines containing the height and width, and awk then divides each value by 72 and prints the result. All of this could be done in AppleScript or AutoIt; you don't really need grep and awk for it. You just need AutoIt's string functions, such as StringSplit: http://www.autoitscript.com/autoit3/docs/functions/StringSplit.htm
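To make the parsing step concrete before porting it, here is a minimal sketch of the same logic as a single awk call. It runs against a few sample lines copied from the output above, so it works on any system with a POSIX shell and awk, with no mdls needed. The function name `page_size_inches` is hypothetical, and the division by 72 assumes the values are PDF points (72 per inch), which is what the `/72` in the original script suggests.

```shell
#!/bin/sh
# Hypothetical helper: reads mdls-style "key = value" lines on stdin and
# prints the page width/height converted from points to inches (value / 72).
page_size_inches() {
    # Split each line on "=" plus any surrounding spaces or tabs,
    # so $1 is the key and $2 is the value.
    awk -F '[ \t]*=[ \t]*' '
        $1 == "kMDItemPageWidth" || $1 == "kMDItemPageHeight" {
            printf "%s %.2f\n", $1, $2 / 72
        }'
}

# Sample input taken from the mdls output shown above.
printf '%s\n' \
    'kMDItemNumberOfPages = 210' \
    'kMDItemPageHeight = 532' \
    'kMDItemPageWidth = 370' | page_size_inches
# prints:
# kMDItemPageHeight 7.39
# kMDItemPageWidth 5.14
```

The AutoIt port would follow the same three steps: split the text into lines, keep the two lines whose key matches, then split each on "=" and divide the number by 72.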
So, how do you get the initial metadata on Windows? Maybe pdftk Server would work (there may be other tools, but I'm in a rush right now): http://www.pdflabs.com/tools/pdftk-server/
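If pdftk does turn out to fit, the parsing would look much the same. The sketch below assumes a pdftk version whose `dump_data` report includes `PageMediaNumber:` and `PageMediaDimensions:` lines with sizes in points; check the output of your own install before relying on that. The input lines here are illustrative rather than captured from a real run, and the function name `pdftk_page_inches` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads pdftk dump_data-style text on stdin and prints
# "page width_in height_in" for each page, assuming sizes are in points.
pdftk_page_inches() {
    awk '
        $1 == "PageMediaNumber:"     { page = $2 }
        $1 == "PageMediaDimensions:" {
            printf "%s %.2f %.2f\n", page, $2 / 72, $3 / 72
        }'
}

# Illustrative input (assumed format, not real tool output).
printf '%s\n' \
    'PageMediaBegin' \
    'PageMediaNumber: 1' \
    'PageMediaRotation: 0' \
    'PageMediaRect: 0 0 612 792' \
    'PageMediaDimensions: 612 792' | pdftk_page_inches
# prints: 1 8.50 11.00
```

With a real file, the input would come from something like `pdftk UAT.pdf dump_data` instead of the sample lines.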
Upvotes: 1