{"id":1651,"date":"2013-05-08T00:00:00","date_gmt":"2013-05-07T22:00:00","guid":{"rendered":"https:\/\/wwwneu.strehle.de\/tim\/weblog\/archives\/2013\/05\/08\/1602\/"},"modified":"2013-05-08T00:00:00","modified_gmt":"2013-05-07T22:00:00","slug":"1602","status":"publish","type":"post","link":"https:\/\/www.strehle.de\/tim\/weblog\/archives\/2013\/05\/08\/1602\/","title":{"rendered":"Image metadata on the Web: URL as identifier"},"content":{"rendered":"<p>Before you start thinking about common metadata for your images (creator, date created, caption, license), first consider what I think is the most important piece of metadata: A <strong>unique identifier<\/strong> for your image. And please <strong>make it a URL<\/strong>. Why?<\/p>\n<p>First, you want to <strong>avoid duplicates in search engine results<\/strong>. You\u2019ll be using the same image on different Web pages, possibly with slight variations: Different sizes, file formats, or cropping. Which means that the URL to the image file is not the same. A unique identifier makes sure others can find out these are just renditions or variations of the same image. (Current image search engines often show lots of duplicates. If they don\u2019t make use of our nice identifiers once we add them, we can always roll our own search engine\u2026 \u263a Yes, I\u2019m serious.)<\/p>\n<p>Second reason: A well-groomed image will have lots of metadata. Temporal, geographical, creator and licensor related, subject descriptions, licensing terms. You don\u2019t want to add all this baggage to each Web page the image is used on, so you need a <strong>separate place to publish all the metadata<\/strong> for that image. 
And once you have it, it makes perfect sense to use that place as the permanent home for your image and use its URL as the image\u2019s unique identifier.<\/p>\n<p>Suppose that you\u2019re using that URL\/identifier whenever you publish or distribute the image: You put it into your HTML, embed it into the image files, and make sure it doesn\u2019t get lost if you register the image with a registry like <a href=\"https:\/\/www.plusregistry.org\/\">PLUS<\/a> or distribute it through third parties like Flickr or Getty Images. What have you just gained? Well, now you can <strong>remain the authoritative source<\/strong> of your image\u2019s metadata! You can fix mistakes, add renditions, links, or legal notes, and change licensing terms at will because you\u2019re in control of that URL. (Third parties probably won\u2019t recognize your self-hosted metadata yet, but let\u2019s move in that direction.)<\/p>\n<p>To <strong>practice what I preach<\/strong>, I have added an <a href=\"http:\/\/www.w3.org\/TR\/2012\/REC-rdfa-core-20120607\/#using--resource-to-set-the-object\">RDFa resource attribute<\/a> to the HTML div containing the blog post\u2019s photo (you might want to view the HTML source code of <a href=\"\/tim\/weblog\/archives\/2013\/05\/02\/1601\">the previous post<\/a>). An example:<\/p>\n<p>&lt;div resource=&quot;http:\/\/www.strehle.de\/tim\/data\/document\/doc69wpi6bms01kix6470q&quot; typeof=&quot;schema:ImageObject&quot;&gt;<br \/>\n&lt;img src=&quot;\/device_strehle\/dev1\/2013\/05-02\/72\/65\/file69wpi6cfox11c7cgw70q.jpg&quot; \/&gt;<br \/>\n&lt;\/div&gt;<\/p>\n<p>With this HTML markup, I\u2019m also telling search engines that the referenced URL is about an image, using the <a href=\"http:\/\/schema.org\/ImageObject\">schema.org ImageObject<\/a> type. (I\u2019m a newbie re schema.org and RDFa, so suggestions for improvement are welcome!)<\/p>\n<p>What if someone just downloads the image file, ignoring my lovingly crafted HTML markup? 
I want them to see my URL as well. So I\u2019m embedding it in the XMP-plus:ImageSupplierImageID metadata field of the JPEG file using <a href=\"http:\/\/www.sno.phy.queensu.ca\/~phil\/exiftool\/#writing\">ExifTool<\/a>:<\/p>\n<p>exiftool -XMP-plus:ImageSupplierImageID=http:\/\/www.strehle.de\/tim\/data\/document\/doc69wpi6bms01kix6470q IMG_1980.jpg<\/p>\n<p>(This is just a first try; there are probably other metadata fields I should write it to. I\u2019m choosing this field for now because you can see and modify it in Photoshop: File \/ File Info\u2026 \/ IPTC Extension \/ Supplier\u2019s Image ID.)<\/p>\n<p>Note that the URL I\u2019m pointing to doesn\u2019t exist yet: I\u2019ll create that page in the next step. For now, I have just added a unique identifier that looks like a URL (so the correct name is probably URI or IRI; I can\u2019t get used to that).<\/p>\n<p>For reference, here are a few other places that I don\u2019t fully understand yet, but that look like they should also contain the URL\/identifier if the image gets distributed in a suitable format:<\/p>\n<p><a href=\"http:\/\/www.sno.phy.queensu.ca\/~phil\/exiftool\/TagNames\/EXIF.html\">EXIF<\/a> ImageUniqueID. <a href=\"http:\/\/www.useplus.com\/useplus\/license.asp\">PLUS LDF<\/a> Terms and Conditions URL \/ Licensor Image ID \/ Copyright Owner Image ID \/ Image Creator Image ID. <a href=\"http:\/\/www.w3.org\/community\/odrl\/two\/model\/#section-22\">ODRL Asset uid<\/a>. <a href=\"http:\/\/schema.org\/Thing\">schema.org url property<\/a>. <a href=\"http:\/\/www.iptc.org\/site\/News_Exchange_Formats\/NewsML-G2\/Specification\/\">IPTC NewsML G2<\/a> newsItem guid attribute \/ web (Web address) element. <a href=\"http:\/\/www.prismstandard.org\/specifications\/\">PRISM<\/a> url element. <a href=\"http:\/\/www.adobe.com\/products\/xmp\/\">XMP<\/a> xmp:Identifier \/ xmpRights:WebStatement \/ xmpMM:DocumentID. 
<a href=\"http:\/\/www.dublincore.org\/documents\/dces\/\">Dublin Core Metadata Element Set<\/a> identifier.<\/p>\n<p>(I\u2019m sure there\u2019s more. Yes, this makes my head explode as well. Please tell me that it\u2019s much simpler than that.)<\/p>\n<p>What do you think? I\u2019d love to hear your feedback (<a href=\"https:\/\/twitter.com\/tistre\">@tistre on Twitter<\/a>; for e-mail addresses see my <a href=\"\/tim\/\">home page<\/a>).<\/p>\n<p><em>Update (2018-09-06):<\/em> Five years later, I still don\u2019t know\u2026 There\u2019s also <a href=\"http:\/\/ns.useplus.org\/LDF\/ldf-XMPSpecification\">plus:licensorImageID<\/a>. See also Christian Weiske \u2013 <a href=\"https:\/\/cweiske.de\/tagebuch\/exif-url.htm\">Adding the source URL to an image&#8217;s meta data<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Before you start thinking about common metadata for your images (creator, date created, caption, license), first consider what I think is the most important piece of metadata: A unique identifier for your image. And please make it a URL. Why? First, you want to avoid duplicates in search engine results. 
You\u2019ll be using the same [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","_share_on_mastodon":"0"},"categories":[1],"tags":[],"class_list":["post-1651","post","type-post","status-publish","format-standard","hentry","category-weblog"],"share_on_mastodon":{"url":"","error":""},"_links":{"self":[{"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/posts\/1651","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/comments?post=1651"}],"version-history":[{"count":0,"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/posts\/1651\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/media?parent=1651"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/categories?post=1651"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/tags?post=1651"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}