PDF with XMP Metadata

On 7 November 2008, I received an email from my Department Manager asking me to look into the possibility of adding XMP metadata to the PDF files generated from 3B2 source files. This requirement had come to us from one of our customers.

What is XMP?

“The Adobe Extensible Metadata Platform (XMP) is a standard, created by Adobe Systems Inc., for processing and storing standardized and proprietary information relating to the contents of a file.
XMP standardizes the definition, creation, and processing of extensible metadata. Serialized XMP can be embedded into a significant number of popular file formats, without breaking their readability by non-XMP-aware applications. Embedding metadata (“the truth is in the file”) avoids many problems that occur when metadata is stored separately. XMP is used in PDF, photography and photo editing applications.”
From: http://en.wikipedia.org/wiki/Extensible_Metadata_Platform

3B2 and XMP

At that time XMP was a new concept, and to my knowledge none of the typesetting software supported it when creating a PDF from its source files. I checked this in Adobe InDesign as well.

To confirm this with the makers of 3B2, I logged a call with PTC on 14 November 2008:

Hi,

we have a specific requirement from one of our Customer. Customer states that:

“We are looking to add XMP metadata across all PDFs and I wanted to make an initial contact with you to see what *** can do for us. (We want the XMP metadata to be semantically relevant – rather than the default properties that tend to get shipped – so that third party applications will be able to use this data intelligently.)
I’ve attached a supplier spec which should give you an idea of what we are aiming for.
I imagine there will be questions and would happy to discuss further with you.”

I have tried our 3B2 version “Document Properties”, but I am not able to load all the details into the pdf xmp metadata. Also for the “Document Properties” I am not able to find detailed Documentation with sample (apart from one page eagle eye document on Help Center).

I have attached customers sample “xmp” file along with this mail. Please advice us whether it is possible to achieve the complete requirement through 3B2? Also if there is any complete Documentation with sample please inform or supply to us.

Regards,
Srikrishnan

Their final reply was as follows:

Dear Srikrishnan,

I have had confirmation from the development team that we do not currently support XMP metadata.
There is a way on the PTC Technical Support page for you to log an ‘Enhancement Request’, if you log the request for this functionality via this procedure the Product Manager will see the request and determine if it will be implemented in the future.

Perl and XMP

While the above discussions were going on, I looked into alternative ways of achieving the customer’s requirement. While searching the web I found two options:
1. Adobe XMP Toolkit
2. Perl module “ExifTool”
The first option would have required me to learn other programming languages that I did not know. For the second option, I was confident that I could write a Perl script using the Perl module “ExifTool”.

I have been working in Perl since 2005. I never learned Perl formally; I taught myself in practice, as my requirements came up. I purchased two books: “Perl for Beginners”, which I read from start to finish, and the “Black Book”, which I used as a reference, reading the relevant chapters whenever required. But I learned Perl mainly online. With the help of Google I searched and learned many things, and a very active forum, “Perl Monks”, helped me a lot. Without “Perl Monks” I would have become frustrated and given up on Perl long ago. In many cases I got stuck in the middle of a program, not knowing how to get the output I needed and proceed further, and at those junctures the help from Google and Perl Monks was immense.

The other main source of my Perl learning is my colleagues. In my office we have a Perl Department, where many interesting people have worked and still work. Interestingly, all of them are self-taught experts who have not read a single book on Perl. Their coding style and methods are entirely different from mine, but they share some common traits among themselves, because they learned Perl by studying, modifying and reusing their predecessors’ scripts.

So I chose the second option for creating a tool to add XMP metadata to the PDF file generated from a 3d file (APP, formerly 3B2). My plan was for the tool to run after the PDF file has been created. When the user runs my Perl program (I have developed most of my Perl programs with a Tk graphical user interface), it prompts for the PDF and its corresponding 3d file. Once both are selected, the program collects the necessary details from the 3d file, such as article title, author names, page numbers, DOI number and ISSN numbers, creates the XMP metadata from the collected details, and adds it to the PDF through the ExifTool module.
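The middle step of that plan, turning the collected details into XMP, can be sketched roughly as follows. This is not the original Perl/Tk tool: it is a minimal Python illustration using only the standard library. The choice of properties (dc:title, dc:creator, prism:doi, prism:issn, prism:startingPage, prism:endingPage), the PRISM namespace URI and all sample values are assumptions, since the customer’s spec is not reproduced here. Reading the 3d file and embedding the result in the PDF are left out.

```python
import xml.etree.ElementTree as ET

# Namespace URIs; the PRISM one is an assumption (PRISM 2.0 basic namespace).
NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "prism": "http://prismstandard.org/namespaces/basic/2.0/",
}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, local):
    """Return the qualified ElementTree name for prefix:local."""
    return "{%s}%s" % (NS[prefix], local)


def build_xmp(details):
    """Build an x:xmpmeta document (as text) from a dict of article details."""
    xmpmeta = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(xmpmeta, q("rdf", "RDF"))
    desc = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})

    # dc:title is a language alternative: rdf:Alt with an x-default entry.
    alt = ET.SubElement(ET.SubElement(desc, q("dc", "title")), q("rdf", "Alt"))
    ET.SubElement(alt, q("rdf", "li"), {XML_LANG: "x-default"}).text = details["title"]

    # dc:creator is an ordered list of names: rdf:Seq.
    seq = ET.SubElement(ET.SubElement(desc, q("dc", "creator")), q("rdf", "Seq"))
    for author in details["authors"]:
        ET.SubElement(seq, q("rdf", "li")).text = author

    # Simple text properties, assumed here to live in the PRISM namespace.
    for name in ("doi", "issn", "startingPage", "endingPage"):
        ET.SubElement(desc, q("prism", name)).text = details[name]

    return ET.tostring(xmpmeta, encoding="unicode")


# Hypothetical details, standing in for what would be read from the 3d file.
details = {
    "title": "An Example Article",
    "authors": ["A. Author", "B. Author"],
    "doi": "10.1000/example",
    "issn": "0000-0000",
    "startingPage": "1",
    "endingPage": "12",
}
packet = build_xmp(details)
print(packet)
```

Keeping the details in a plain dictionary separates “what was found in the source file” from “how it is serialized”, so a property the customer wants renamed or moved to another namespace only touches `build_xmp`.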

For this plan, I read up on XMP and on the ExifTool module, which was created by Phil Harvey, and then started writing the Perl program. Things went as well as I had expected, thanks to my in-depth analysis, a clear understanding of the requirements, and concrete ideas about the tool and how it should work. My very first attempt fulfilled about 90% of the requirement.

But certain points remained open. The customer was very specific about the namespace “XMP”, whereas my initial PDFs showed the namespace “XAP”. I was also unable to set XMP:identifier and Prism:url on my own.

At this point I could not get an answer from anybody, not even on the “Perl Monks” forum, so I decided to contact the author, Phil Harvey. His email id was not mentioned anywhere on the CPAN page or in the documentation, so I struggled a lot to find it. On 1 December 2008 I sent him my doubts along with my Perl program. Initially I had no faith in my mail, because I felt that such a great person would never read, or spare time to answer, silly questions from a beginner like me. To my great surprise, he replied to my doubts the same day: I sent my mail at 2:19:00 PM and he replied at 6:13:52 PM.

His mail gave me a great morale boost. It made me feel that such a great Perl personality had recognized me, and it gave me more confidence in Perl and more interest in it. After that I certainly learned many things in Perl much more easily, because of the confidence I gained from this one instance. After that first mail I sent six more mails about various problems I faced in the next steps of my XMP project. He answered me sincerely and with great patience, and I thanked him wholeheartedly in my mails. I learned many things from him. Apart from answering my questions, he suggested many things and gave me samples.

Across those six mails he answered the following questions:

1. How to use the XMP namespace instead of the XAP namespace?
2. How to add XMP:identifier?
3. How to add Prism:url?
4. How to create an exe file using the PAR module?
5. How to linearize the XMP metadata using ExifTool?
6. How to add UTF-8 characters?

I got solutions for most of my questions. For changing the namespace from XAP to XMP, however, I could not get the method Phil Harvey suggested to work, so I modified the PDF with a find-and-replace in Perl, taking care not to corrupt the PDF. As for the linearization problem, Phil himself said that it is not possible through ExifTool. I reported this to the customer, and they accepted it for the time being.
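The Perl find-and-replace itself is not shown here. One reason such an edit can be done without corrupting the PDF is that “xap” and “xmp” have the same length, so no byte offset in the file moves. The sketch below illustrates that idea in Python, not Perl, under several assumptions: the XMP packet is stored uncompressed and unencrypted inside the PDF, only the prefix needs renaming, and the namespace URI http://ns.adobe.com/xap/1.0/ must be left alone. The function name and the sample bytes are made up for illustration.

```python
import re

# Match only the prefix: "xmlns:xap=" declarations and "xap:" qualified names.
# "xap" inside the namespace URI is followed by "/" and is therefore skipped.
_PREFIX = re.compile(rb'(?<=xmlns:)xap(?==)|(?<![\w.-])xap(?=:)')

# An XMP packet runs from the "xpacket begin" instruction to "xpacket end".
_PACKET = re.compile(rb'<\?xpacket begin=.*?<\?xpacket end=.*?\?>', re.DOTALL)


def rename_xap_prefix(pdf_bytes):
    """Rename the xap prefix to xmp inside each XMP packet, keeping the length."""
    def fix_packet(match):
        return _PREFIX.sub(b'xmp', match.group(0))

    patched = _PACKET.sub(fix_packet, pdf_bytes)
    # Same length means the offsets in the PDF cross-reference table stay valid.
    assert len(patched) == len(pdf_bytes)
    return patched


# Made-up bytes standing in for a PDF that contains an XMP packet.
sample = (
    b'%PDF-1.4\n'
    b'<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>'
    b'<rdf:Description xmlns:xap="http://ns.adobe.com/xap/1.0/">'
    b'<xap:CreatorTool>Example Tool</xap:CreatorTool>'
    b'</rdf:Description>'
    b'<?xpacket end="w"?>\n'
    b'%%EOF\n'
)
patched = rename_xap_prefix(sample)
print(patched.decode('ascii'))
```

Restricting the replacement to the packet, rather than the whole file, keeps the edit away from page content and other streams where the three bytes “xap” could occur by coincidence.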

After a lot of fixes, I made my final release on 14 August 2009. Since that day the users have not reported any bug or problem, and the tool has been running smoothly.
