{"id":1757,"date":"2014-05-04T00:00:00","date_gmt":"2014-05-03T22:00:00","guid":{"rendered":"https:\/\/wwwneu.strehle.de\/tim\/weblog\/archives\/2014\/05\/04\/1714\/"},"modified":"2025-07-31T22:03:36","modified_gmt":"2025-07-31T20:03:36","slug":"1714","status":"publish","type":"post","link":"https:\/\/www.strehle.de\/tim\/weblog\/archives\/2014\/05\/04\/1714\/","title":{"rendered":"How I archive Web pages in the DAM (screencast)"},"content":{"rendered":"\n<p>Since July 2011, I\u2019ve been archiving interesting Web pages in my personal instance of DC-X (the Digital Asset Management system our company is building). My archive contains 12,300 pages already and is growing daily.<\/p>\n\n\n\n<p>I\u2019m totally in love with this feature: It\u2019s my \u201cprivate file and library\u201d (a quote from Vannevar Bush\u2019s 1945 <a href=\"http:\/\/www.theatlantic.com\/magazine\/archive\/1945\/07\/as-we-may-think\/303881\/4\/\">As We May Think<\/a>) \u2013 a highly relevant, searchable pool of content I might want to revisit or read later. In an instant, I get back to that great or helpful article when I need it. It\u2019s also a tool for <a href=\"\/tim\/weblog\/archives\/2012\/09\/05\/1524\">curating the links<\/a> I\u2019m publishing here. And finally, a backup for the day when these articles vanish from the Web or the links to them break (sooner or later, this happens to most of them).<\/p>\n\n\n\n<p>The alternatives don\u2019t cut it for me: Browser bookmarks or Safari\u2019s \u201creading list\u201d don\u2019t scale well to 10,000 pages, and have very limited search\/browse functionality. Services like Delicious or Pinterest can\u2019t be trusted with an archive (which I expect to last for decades). 
And software that does the archiving from a server process doesn\u2019t see the page exactly as I\u2019m seeing it, and fails at sites that require authentication.<\/p>\n\n\n\n<p>I couldn\u2019t build up this archive if the process weren\u2019t quick and easy (no metadata entry required). It requires a small Firefox add-on that I custom-built for myself (no customers are using this feature yet). The browser add-on takes a screenshot of the currently displayed page and posts it, along with the HTML source code, to the DAM in a new browser tab. The DC-X DAM asks me to log in (only once per day), creates an import job and waits for its completion. Then I\u2019m redirected to the details page of the \u201carchived Web page\u201d document that was just created. Here\u2019s <a href=\"http:\/\/vimeo.com\/93887839\">a screencast<\/a>:<\/p>\n\n\n\n<p><iframe loading=\"lazy\" allowfullscreen=\"allowfullscreen\" frameborder=\"0\" height=\"282\" mozallowfullscreen=\"mozallowfullscreen\" src=\"\/\/player.vimeo.com\/video\/93887839\" webkitallowfullscreen=\"webkitallowfullscreen\" width=\"500\">&#8230;<\/iframe><\/p>\n\n\n\n<p>How are you keeping track of important Web pages? What\u2019s your personal digital archiving workflow?<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Since July 2011, I\u2019ve been archiving interesting Web pages in my personal instance of DC-X (the Digital Asset Management system our company is building). My archive contains 12,300 pages already and is growing daily. 
I\u2019m totally in love with this feature: It\u2019s my \u201cprivate file and library\u201d (a quote from Vannevar Bush\u2019s 1945 As We [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","_share_on_mastodon":"0"},"categories":[1],"tags":[],"class_list":["post-1757","post","type-post","status-publish","format-standard","hentry","category-weblog"],"share_on_mastodon":{"url":"","error":""},"_links":{"self":[{"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/posts\/1757","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/comments?post=1757"}],"version-history":[{"count":1,"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/posts\/1757\/revisions"}],"predecessor-version":[{"id":1925,"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/posts\/1757\/revisions\/1925"}],"wp:attachment":[{"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/media?parent=1757"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/categories?post=1757"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.strehle.de\/tim\/wp-json\/wp\/v2\/tags?post=1757"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}