Reading JPG file's XMP metadata

Ruuhkis · Apr 23, 2014

I am developing an Android application that is supposed to make use of Google Camera's new depth map generation feature.

Google has described the metadata it uses here.

I can access most of the metadata, but unfortunately the most important data is encoded as ExtendedXMP, and I can't get any XMP parsing library to parse it correctly!

I've tried Commons-Imaging, metadata-extractor and, most recently, Adobe's XMPCore.

XMPCore might be able to handle the extended version, but there is no documentation on how to get it to parse the data from a JPG file; it assumes that raw XMP data is passed in.

Is there a correct implementation of XMP parsing that includes the extended parts of JPG files, or am I just doing something wrong?

Here are my attempts:

With Commons-Imaging:

                try {
                    // getXmpXml returns the XMP packet as an XML string
                    String xmpXml = new JpegImageParser().getXmpXml(new ByteSourceInputStream(imageStream, "img.jpg"), new HashMap<String, Object>());

                    Log.v(TAG, xmpXml);

                } catch (ImageReadException e1) {
                    e1.printStackTrace();
                }

With metadata-extractor:

                Metadata metadata = ImageMetadataReader.readMetadata(
                        new BufferedInputStream(imageStream), false);

                XmpDirectory xmp = metadata
                        .getDirectory(XmpDirectory.class);
                XMPMeta xmpMeta = xmp.getXMPMeta();

                String uri = "http://ns.google.com/photos/1.0/depthmap/";

                Log.v(TAG, xmpMeta.doesPropertyExist(uri, "GDepth:Format") + " " );

                try {
                    XMPProperty hasExtendedXMP = xmpMeta.getProperty("http://ns.adobe.com/xmp/note/", "xmpNote:HasExtendedXMP");

                    Log.v(TAG, hasExtendedXMP.getValue().toString() + " " + new String(Base64.decode(hasExtendedXMP.getValue().toString(), Base64.DEFAULT)));

                } catch (XMPException e) {
                    e.printStackTrace();
                }

Answer

dragon66 · Mar 1, 2015

Initially, Adobe did not expect XMP data to exceed the size limit of a single JPEG segment (about 64K), and the XMP specification stated that the XMP data must fit into one segment. When it later turned out that a single JPEG APP1 segment was not large enough, the specification was changed to allow the XMP data to span multiple APP1 segments. The data is split into two parts: the standard XMP and the ExtendedXMP. The standard XMP part is a "normal" XMP structure with a packet wrapper, while the ExtendedXMP part has no packet wrapper. The ExtendedXMP data can be further divided to fit into multiple APP1 segments.
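
To make the segment layout concrete, here is a minimal sketch of how the APP1 payloads could be collected from a JPEG that has already been read into a byte array. The class and method names (App1Scanner, readApp1Payloads) are made up for illustration and are not part of any of the libraries mentioned. The sketch assumes a well-formed file in which every marker before the scan data carries a two-byte big-endian length, and it stops at the first SOS or EOI marker.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Commons-Imaging, metadata-extractor, XMPCore or icafe.
final class App1Scanner {
    private static final int MARKER_PREFIX = 0xFF;
    private static final int SOI = 0xD8;   // start of image
    private static final int APP1 = 0xE1;  // carries Exif, standard XMP and ExtendedXMP
    private static final int SOS = 0xDA;   // start of scan: compressed image data follows
    private static final int EOI = 0xD9;   // end of image

    /** Returns the payload (without marker and length bytes) of each APP1 segment, in file order. */
    static List<byte[]> readApp1Payloads(byte[] jpeg) {
        if (jpeg.length < 2 || (jpeg[0] & 0xFF) != MARKER_PREFIX || (jpeg[1] & 0xFF) != SOI) {
            throw new IllegalArgumentException("Not a JPEG: missing SOI marker");
        }
        List<byte[]> payloads = new ArrayList<>();
        int pos = 2;
        while (pos + 4 <= jpeg.length) {
            if ((jpeg[pos] & 0xFF) != MARKER_PREFIX) {
                throw new IllegalArgumentException("Expected a marker at offset " + pos);
            }
            int marker = jpeg[pos + 1] & 0xFF;
            if (marker == MARKER_PREFIX) { // fill byte before the real marker
                pos++;
                continue;
            }
            if (marker == SOS || marker == EOI) {
                break; // metadata segments come before the scan data
            }
            // The length field is big-endian and counts its own two bytes.
            int length = ((jpeg[pos + 2] & 0xFF) << 8) | (jpeg[pos + 3] & 0xFF);
            int end = pos + 2 + length;
            if (length < 2 || end > jpeg.length) {
                throw new IllegalArgumentException("Truncated segment at offset " + pos);
            }
            if (marker == APP1) {
                payloads.add(Arrays.copyOfRange(jpeg, pos + 4, end));
            }
            pos = end;
        }
        return payloads;
    }
}
```

Each returned payload still starts with its identifying signature string, so the caller has to inspect that prefix to tell Exif, standard XMP and ExtendedXMP segments apart.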

The following quote is from the Adobe XMP Specification Part 3, on ExtendedXMP chunks stored as JPEG APP1 segments:

Each chunk is written into the JPEG file within a separate APP1 marker segment. Each ExtendedXMP marker segment contains:

  • A null-terminated signature string of "http://ns.adobe.com/xmp/extension/".
  • A 128-bit GUID stored as a 32-byte ASCII hex string, capital A-F, no null termination. The GUID is a 128-bit MD5 digest of the full ExtendedXMP serialization.
  • The full length of the ExtendedXMP serialization as a 32-bit unsigned integer
  • The offset of this portion as a 32-bit unsigned integer.
  • The portion of the ExtendedXMP

Besides the null-terminated string that identifies the ExtendedXMP data, there is also a GUID, which should match the value found in the standard XMP part. The offset is used to join the different portions of the ExtendedXMP, so the ExtendedXMP APP1 segments do not even have to appear in order. Then comes the actual data portion, which is why @Matt's answer needs some way to fix the string. The remaining value, the full length of the ExtendedXMP serialization, serves two purposes: it lets us check the integrity of the data, and it provides the buffer size for joining the data.
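
The field layout quoted above can be read with a few lines of code. The following is only a sketch: the ExtendedXmpChunk class and its parse method are hypothetical names, and the two 32-bit integers are assumed to be big-endian, as is usual for JPEG. The input is the payload of one APP1 segment, without the marker and length bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical value class for one ExtendedXMP APP1 payload; all names are illustrative.
final class ExtendedXmpChunk {
    // Signature string including its terminating NUL byte (35 bytes in total).
    static final byte[] SIGNATURE =
            "http://ns.adobe.com/xmp/extension/\0".getBytes(StandardCharsets.US_ASCII);
    private static final int GUID_LENGTH = 32; // ASCII hex digits, no terminator
    private static final int HEADER_LENGTH = SIGNATURE.length + GUID_LENGTH + 4 + 4;

    final String guid;      // MD5 digest of the full ExtendedXMP serialization, as hex
    final long fullLength;  // length of the full ExtendedXMP serialization
    final long offset;      // where this portion belongs within the full serialization
    final byte[] data;      // this portion of the ExtendedXMP

    private ExtendedXmpChunk(String guid, long fullLength, long offset, byte[] data) {
        this.guid = guid;
        this.fullLength = fullLength;
        this.offset = offset;
        this.data = data;
    }

    /** Returns the parsed chunk, or null if the payload is not an ExtendedXMP segment. */
    static ExtendedXmpChunk parse(byte[] app1Payload) {
        if (app1Payload.length < HEADER_LENGTH) {
            return null;
        }
        for (int i = 0; i < SIGNATURE.length; i++) {
            if (app1Payload[i] != SIGNATURE[i]) {
                return null;
            }
        }
        ByteBuffer buffer = ByteBuffer.wrap(app1Payload); // big-endian by default
        buffer.position(SIGNATURE.length);
        byte[] guidBytes = new byte[GUID_LENGTH];
        buffer.get(guidBytes);
        long fullLength = buffer.getInt() & 0xFFFFFFFFL; // treat as unsigned 32-bit
        long offset = buffer.getInt() & 0xFFFFFFFFL;
        byte[] data = new byte[buffer.remaining()];
        buffer.get(data);
        return new ExtendedXmpChunk(
                new String(guidBytes, StandardCharsets.US_ASCII), fullLength, offset, data);
    }
}
```

Returning null for non-matching payloads makes it easy to run the parser over every APP1 segment in a file and keep only the ExtendedXMP ones.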

When we find an ExtendedXMP segment, we need to join its data with that of the other ExtendedXMP segments to obtain the whole ExtendedXMP data. We then join the two XML trees together (removing the GUID from the standard XMP part as well) to retrieve the entire XMP data.
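
Joining the portions could look roughly like the sketch below. It is not how icafe does it; ExtendedXmpAssembler, Chunk and md5Hex are names invented for this example. It copies each portion to its offset, checks that the portions add up to the declared full length, and compares the MD5 digest of the result with the GUID from the standard XMP part. Merging the two XML trees is left out.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

// Hypothetical helper that joins ExtendedXMP portions; all names are illustrative.
final class ExtendedXmpAssembler {

    /** One portion of the ExtendedXMP together with its offset in the full serialization. */
    static final class Chunk {
        final int offset;
        final byte[] data;

        Chunk(int offset, byte[] data) {
            this.offset = offset;
            this.data = data;
        }
    }

    /** Joins the portions (in any order) and verifies them against the length and GUID. */
    static byte[] assemble(List<Chunk> chunks, int fullLength, String expectedGuid) {
        byte[] full = new byte[fullLength];
        int copied = 0;
        for (Chunk chunk : chunks) {
            if (chunk.offset < 0 || chunk.offset + chunk.data.length > fullLength) {
                throw new IllegalArgumentException("Chunk does not fit the declared full length");
            }
            System.arraycopy(chunk.data, 0, full, chunk.offset, chunk.data.length);
            copied += chunk.data.length;
        }
        if (copied != fullLength) {
            throw new IllegalArgumentException("Missing or duplicated ExtendedXMP portions");
        }
        if (!md5Hex(full).equals(expectedGuid)) {
            throw new IllegalArgumentException("MD5 digest does not match the GUID");
        }
        return full;
    }

    /** Returns the MD5 digest as 32 hex digits with capital A-F, the form used for the GUID. */
    static String md5Hex(byte[] bytes) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(bytes);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02X", b & 0xFF));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }
}
```

The byte-count check is deliberately simple: it catches missing portions, but overlapping portions that happen to sum to the right length are only caught by the digest comparison.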

I have written a Java library, icafe, which can extract and insert XMP as well as ExtendedXMP. One use case for ExtendedXMP is Google's depth map data, which is in fact a grayscale image hidden inside the actual image as metadata and, in the case of JPEG, as XMP data. The depth map image can be used, for example, to blur the original image. The depth map data is usually large and has to be split into standard and extended XMP parts. The whole data is Base64 encoded and could be in PNG format.
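
Once the Base64 text of the depth map has been pulled out of the joined XMP, turning it back into image bytes is straightforward. The sketch below is an assumption-laden illustration, not icafe code: DepthMapDecoder and its methods are hypothetical, and it uses java.util.Base64, which needs Java 8 (on older Android versions android.util.Base64 would be the equivalent). It only checks for the PNG signature, since the data could also be in another format.

```java
import java.util.Base64;

// Hypothetical helper for the Base64-encoded depth map payload; names are illustrative.
final class DepthMapDecoder {
    // The fixed eight-byte signature at the start of every PNG file.
    private static final byte[] PNG_SIGNATURE =
            {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};

    /** Decodes the Base64 text; the MIME decoder tolerates embedded line breaks. */
    static byte[] decode(String base64Data) {
        return Base64.getMimeDecoder().decode(base64Data);
    }

    /** Returns true if the bytes start with the PNG file signature. */
    static boolean looksLikePng(byte[] bytes) {
        if (bytes.length < PNG_SIGNATURE.length) {
            return false;
        }
        for (int i = 0; i < PNG_SIGNATURE.length; i++) {
            if (bytes[i] != PNG_SIGNATURE[i]) {
                return false;
            }
        }
        return true;
    }
}
```

If looksLikePng returns true, the decoded bytes can be written to a .png file or handed to an image decoder as they are.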

The following is an example image and the extracted depth map:

[Image: example image and the extracted depth map]

The original image comes from here.

Note: I recently found another website discussing the Google Cardboard Camera app, which can take advantage of both the image and the audio embedded in the JPEG XMP data. ICAFE now supports both image and audio extraction from such images. Example usage can be found here, with the following call: JPEGTweaker.extractDepthMap()

Here is the image extracted by ICAFE from the original image on the website discussing the Google Cardboard Camera app:

[Image: image extracted by ICAFE]

Unfortunately, I can't find a way to insert the MP4 audio here.