Extracting ISBN numbers from PDFs

This is really just an example of how to read text out of PDFs without much overhead (assuming gs is installed). Maybe someone knows an even simpler solution:

-- 30.04.2011 hubionmac.com

-- quick & dirty script to extract the first ISBN number of a PDF file... just uses grep, maybe some nice regexp would do a better job!

set myselection to choose file of type {"pdf"} with multiple selections allowed

set myoutput to ""

repeat with pdf_file in myselection

tell application "Finder" to set pdfname to name of (pdf_file as alias)

set pdf_file_posix to quoted form of POSIX path of (pdf_file as alias)

do shell script ""

try

-- first add the path, otherwise ps2ascii will fail since it cannot find Ghostscript (gs), which is also installed in /usr/local/bin (I think by MacPorts)

set ISBN_String to do shell script "PATH=\"$PATH:/usr/local/bin\"; /usr/local/bin/ps2ascii " & pdf_file_posix & " | grep -m 1 ISBN"

set foundLine to true

on error

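display dialog "maybe \"" & pdfname & "\" does not contain an ISBN at all"

set foundLine to false

-- reset s so the output line further down neither fails nor reuses the previous file's value

set s to ""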

end try

if foundLine is true then

repeat with s in every word of ISBN_String

try

get s as integer

set s to s as text

set foundisbn to true

exit repeat

on error

set s to ""

end try

end repeat

end if

set myoutput to myoutput & pdfname & tab & s & return

end repeat

tell application "TextEdit"

activate

set a to make new document

set text of a to myoutput as text

end tell
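
The comment at the top of the script wonders whether a regexp would do a better job than plain grep plus word splitting. Here is a rough sketch of that idea, not a tested replacement. The function name `extract_isbn` is made up for this example. It assumes the number follows the literal text "ISBN" (optionally "ISBN-10" or "ISBN-13") on the same line and uses only hyphens as separators. It does not verify the check digit.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original script.
# Reads plain text on stdin and prints the first ISBN-looking number
# with the "ISBN" label and all hyphens stripped.
extract_isbn() {
    # -E: extended regexp, -o: print only the matching part of each line
    grep -Eo 'ISBN(-1[03])?:?[[:space:]]*[0-9][0-9-]{8,15}[0-9Xx]' \
        | head -n 1 \
        | sed -E 's/^ISBN(-1[03])?:?[[:space:]]*//; s/-//g'
}
```

With Ghostscript installed, it could be fed like this (book.pdf is a placeholder): `ps2ascii book.pdf | extract_isbn`. If nothing matches, the output is simply empty, so the caller still has to handle that case.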
