{"id":1796,"date":"2015-04-13T00:00:00","date_gmt":"2015-04-12T22:00:00","guid":{"rendered":"https:\/\/wwwneu.strehle.de\/tim\/weblog\/archives\/2015\/04\/13\/1758\/"},"modified":"2015-04-13T00:00:00","modified_gmt":"2015-04-12T22:00:00","slug":"1758","status":"publish","type":"post","link":"https:\/\/www.strehle.de\/tim\/weblog\/archives\/2015\/04\/13\/1758\/","title":{"rendered":"Is there an XML standard for digital magazine replicas?"},"content":{"rendered":"<p>Many printed newspapers and magazines offer <strong>digital replicas<\/strong> to their subscribers \u2013 Web or mobile apps that let readers browse the publication in the exact print layout. Often with added functionality, like fulltext search, PDF download or an optional HTML-formatted article view for better readability. You\u2019ll find lots of examples in the Apple Newsstand or Google Play Kiosk. In Germany, these digital replicas are called \u201cePaper\u201d and are a <strong>must-have<\/strong> for publishers because they count towards the official print circulation figures <a href=\"http:\/\/www.ivw.eu\/print\/epaper\/epaper\">tracked by the IVW<\/a>.<\/p>\n<p>Technically, replica editions are usually built from <strong>PDF files<\/strong> of the printed pages. A decent editorial system will also provide <strong>articles and images<\/strong> with structured metadata separately, which means better quality for added functionality compared to content extracted from the PDF. Really good systems can provide <strong>page coordinates<\/strong> for articles and images, so that a tap or click on the page can send the reader to the right article or image. (Remember the good old HTML image map?) 
Companies like <a href=\"http:\/\/www.visiolink.com\/\">Visiolink<\/a>, <a href=\"http:\/\/www.1000grad-epaper.de\/\">1000\u00b0ePaper<\/a>, <a href=\"http:\/\/iapps-technologies.com\/\">iApps<\/a> or <a href=\"http:\/\/paperlit.com\/\">Paperlit<\/a> help publishers create and publish replicas.<\/p>\n<p>Since our DC-X DAM is used as a PDF, article and image archive by many newspaper and magazine publishers, we often have to make it interoperable with \u201cePaper systems\u201d. (We even built one or two of these systems ourselves.) The main work is in <strong>formatting and packaging<\/strong> page, article and image contents and metadata the way the ePaper system needs it. And sometimes we\u2019re on the receiving end, having to ingest such a feed into the DAM.<\/p>\n<p>To clarify, here\u2019s the information that needs to be transported:<\/p>\n<ul>\n<li>\n<strong>Edition<\/strong>\/issue level: One object per printed edition of a newspaper or magazine. Properties: Edition name (\u201cMy magazine\u201d), publication date (or month, or issue number), page count<\/li>\n<li>\n<strong>Page<\/strong> level: One object per page (or spread, i.e. two adjacent pages), linked to the printed edition it\u2019s in (see above). Properties: Reference to PDF file, page number, size \/ physical dimensions, section<\/li>\n<li>\n<strong>Article<\/strong> level: One object per article, linked to the pages it appears on. Properties: Title (for table of contents), formatted text (ideally HTML or XHTML), coordinates on the page<\/li>\n<li>\n<strong>Media<\/strong> level: One object per image, linked to the articles it appears in. 
Properties: Reference to media file, title, dimensions, content type<\/li>\n<\/ul>\n<p>What\u2019s really painful is that we\u2019ve been doing this kind of integration work for almost two decades, and we keep writing <strong>customer-specific code<\/strong> from scratch every time because there seems to be <strong>no standardized exchange format<\/strong> for this kind of data. The <a href=\"http:\/\/www.idealliance.org\/specifications\/prism-metadata-initiative\">PRISM standards<\/a> come very close, and <a href=\"https:\/\/iptc.org\/standards\/newsml-g2\/\">IPTC NewsML G2<\/a> might also work \u2013 but both seem to lack the edition-level and page-level information. Or am I missing something? What would you recommend? I\u2019d love to <a href=\"\/tim\/\">hear from you<\/a>!<\/p>\n<p><em>Update:<\/em> I found the EPUB derivative <a href=\"http:\/\/www.idealliance.org\/specifications\/openeft\">OpenEFT<\/a>, an Idealliance standard. It seems to match the use case almost perfectly. But I haven\u2019t found anyone implementing it so far. ePaper vendors don\u2019t seem to care about standardization at all. Maybe it\u2019s because US newspapers aren\u2019t into ePapers, as Mario R. Garc\u00eda writes in <a href=\"http:\/\/www.garciamedia.com\/blog\/those_european_e_papers_are_hot\">Those European e-papers are hot<\/a>?<\/p>\n<p><em>Update (2016-10-28):<\/em> Related, by Stefan Boddie \u2013 <a href=\"http:\/\/www.veridiansoftware.com\/knowledge-base\/metsalto\/\">What is METS\/ALTO?<\/a>: \u201cThe combination of METS and ALTO (often written METS\/ALTO) is the current industry standard for newspaper digitization.\u201d<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Many printed newspapers and magazines offer digital replicas to their subscribers \u2013 Web or mobile apps that let readers browse the publication in the exact print layout. Often with added functionality, like fulltext search, PDF download or an optional HTML-formatted article view for better readability. 
You\u2019ll find lots of examples in the Apple Newsstand or [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","_share_on_mastodon":"0"},"categories":[1],"tags":[],"class_list":["post-1796","post","type-post","status-publish","format-standard","hentry","category-weblog"],"share_on_mastodon":{"url":"","error":""},"_links":{"self":[{"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/posts\/1796","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/comments?post=1796"}],"version-history":[{"count":0,"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/posts\/1796\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/media?parent=1796"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/categories?post=1796"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/tags?post=1796"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}