<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule"
	>
<channel>
	<title>Comments on: PDF, The Waste Land, and Monica&#8217;s Blue Dress</title>
	<atom:link href="http://www.robweir.com/blog/2007/11/pdf-waste-land-and-monicas-blue-dress.html/feed" rel="self" type="application/rss+xml" />
	<link>http://www.robweir.com/blog/2007/11/pdf-waste-land-and-monicas-blue-dress.html</link>
	<description>Thinking the unthinkable, pondering the imponderable, effing the ineffable and scruting the inscrutable</description>
	<lastBuildDate>Wed, 17 Mar 2010 16:12:36 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Anonymous</title>
		<link>http://www.robweir.com/blog/2007/11/pdf-waste-land-and-monicas-blue-dress.html#comment-1312</link>
		<dc:creator>Anonymous</dc:creator>
		<pubDate>Wed, 05 Dec 2007 23:14:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.robweir.com/blog/2007/11/pdf-the-waste-land-and-monicas-blue-dress.html#comment-1312</guid>
		<description>&gt;&gt; What do you mean it doesn&#039;t look like the document? &lt;&lt;&lt;br/&gt;Exactly what I said. Every now and again some objects are lost during transformation to PDF.&lt;br/&gt;&lt;br/&gt;&gt;&gt; Rob&#039;s idea is to have the same program store a PDF of its own output. &lt;&lt;&lt;br/&gt;For one thing, &lt;i&gt;I was talking primarily about James King&#039;s ideas&lt;/i&gt;, not Rob&#039;s. Rob&#039;s approach is akin to caching: all data are preserved, and a quick read-only PDF is available. That is quite acceptable if one can afford the space, etc.&lt;br/&gt;&lt;br/&gt;&gt;&gt; If those two weren&#039;t the same, it&#039;d be a bug. But I don&#039;t see how it could. &lt;&lt;&lt;br/&gt;What?! You&#039;ve never seen a bug? Lucky you! Will anybody be happier to learn that data was lost because of a bug?&lt;br/&gt;&lt;br/&gt;How a bug can happen:&lt;br/&gt;&lt;br/&gt;* Many things change, and some of them are external to the format, such as the font engine. Adobe Acrobat, for example, carries its own. Is there any guarantee that three centuries from now somebody will know all the bugs and workarounds of closed-source software?&lt;br/&gt;&lt;br/&gt;* A printscreen is not a 100% valid copy of the screen. Try to take one of a media player. Anyway, it is quite impractical: plainly too big and not scalable.&lt;br/&gt;&lt;br/&gt;* I have seen encoding bugs in programs, where the wrong translation was used in some specific situation. I&#039;ve seen it happen during PDF conversion as well.&lt;br/&gt;&lt;br/&gt;* Image and graphics transformation might be lossy. This can be because of the underlying graphics model (PDF does not support circular arcs, does it? But that is beside the point: it does not support 5th-order splines), or because of the color space, or something else. This is an acceptable limitation for PDF; after all, it is a &lt;i&gt;presentation&lt;/i&gt; format. But I can easily imagine a bug in the transformation algorithms, especially one connected to overflows (so it would rarely show up).</description>
		<content:encoded><![CDATA[<p>&gt;&gt; What do you mean it doesn&#8217;t look like the document? &lt;&lt;<br/>Exactly what I said. Every now and again some objects are lost during transformation to PDF.</p>
<p>&gt;&gt; Rob&#8217;s idea is to have the same program store a PDF of its own output. &lt;&lt;<br/>For one thing, <i>I was talking primarily about James King&#8217;s ideas</i>, not Rob&#8217;s. Rob&#8217;s approach is akin to caching: all data are preserved, and a quick read-only PDF is available. That is quite acceptable if one can afford the space, etc.</p>
<p>&gt;&gt; If those two weren&#8217;t the same, it&#8217;d be a bug. But I don&#8217;t see how it could. &lt;&lt;<br/>What?! You&#8217;ve never seen a bug? Lucky you! Will anybody be happier to learn that data was lost because of a bug?</p>
<p>How a bug can happen:</p>
<p>* Many things change, and some of them are external to the format, such as the font engine. Adobe Acrobat, for example, carries its own. Is there any guarantee that three centuries from now somebody will know all the bugs and workarounds of closed-source software?</p>
<p>* A printscreen is not a 100% valid copy of the screen. Try to take one of a media player. Anyway, it is quite impractical: plainly too big and not scalable.</p>
<p>* I have seen encoding bugs in programs, where the wrong translation was used in some specific situation. I&#8217;ve seen it happen during PDF conversion as well.</p>
<p>* Image and graphics transformation might be lossy. This can be because of the underlying graphics model (PDF does not support circular arcs, does it? But that is beside the point: it does not support 5th-order splines), or because of the color space, or something else. This is an acceptable limitation for PDF; after all, it is a <i>presentation</i> format. But I can easily imagine a bug in the transformation algorithms, especially one connected to overflows (so it would rarely show up).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anonymous</title>
		<link>http://www.robweir.com/blog/2007/11/pdf-waste-land-and-monicas-blue-dress.html#comment-1280</link>
		<dc:creator>Anonymous</dc:creator>
		<pubDate>Tue, 27 Nov 2007 00:08:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.robweir.com/blog/2007/11/pdf-the-waste-land-and-monicas-blue-dress.html#comment-1280</guid>
		<description>What do you mean it doesn&#039;t look like the document?  Rob&#039;s idea is to have the same program store a PDF of its own output.&lt;br/&gt;&lt;br/&gt;In other words, the same program would be embedding the PDF and displaying the data.  If those two weren&#039;t the same, it&#039;d be a bug.  But I don&#039;t see how it could.  After all, all it would have to do is a glorified printscreen of itself and stuff that into a PDF.</description>
		<content:encoded><![CDATA[<p>What do you mean it doesn&#8217;t look like the document?  Rob&#8217;s idea is to have the same program store a PDF of its own output.</p>
<p>In other words, the same program would be embedding the PDF and displaying the data.  If those two weren&#8217;t the same, it&#8217;d be a bug.  But I don&#8217;t see how it could.  After all, all it would have to do is a glorified printscreen of itself and stuff that into a PDF.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anonymous</title>
		<link>http://www.robweir.com/blog/2007/11/pdf-waste-land-and-monicas-blue-dress.html#comment-1278</link>
		<dc:creator>Anonymous</dc:creator>
		<pubDate>Mon, 26 Nov 2007 15:56:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.robweir.com/blog/2007/11/pdf-the-waste-land-and-monicas-blue-dress.html#comment-1278</guid>
		<description>To me the idea of storing a PDF is just weird. Working with &lt;i&gt;data archiving&lt;/i&gt; for process control, I learned the hard way that the only acceptable way to archive is to store the original data. Nobody can guarantee the fidelity of the transformation. You give many examples of data loss, but the most general point holds: there is no way to guarantee that the PDF &lt;i&gt;even&lt;/i&gt; looks exactly like what the person had in the document.</description>
		<content:encoded><![CDATA[<p>To me the idea of storing a PDF is just weird. Working with <i>data archiving</i> for process control, I learned the hard way that the only acceptable way to archive is to store the original data. Nobody can guarantee the fidelity of the transformation. You give many examples of data loss, but the most general point holds: there is no way to guarantee that the PDF <i>even</i> looks exactly like what the person had in the document.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Thomas Downing</title>
		<link>http://www.robweir.com/blog/2007/11/pdf-waste-land-and-monicas-blue-dress.html#comment-1277</link>
		<dc:creator>Thomas Downing</dc:creator>
		<pubDate>Mon, 26 Nov 2007 12:30:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.robweir.com/blog/2007/11/pdf-the-waste-land-and-monicas-blue-dress.html#comment-1277</guid>
		<description>Although I pay the bills with technical endeavour, my personal bent is historical.  This post resonates strongly with me; further, I think it paints a strong picture of one of the strengths of ODF as a standard.&lt;br/&gt;&lt;br/&gt;The idea of storing the &#039;as published&#039; form of the document using PDF as a part of the larger &#039;document as work&#039; is a crucial feature to ODF as an archival tool.  Great suggestion!  I hope it gets added to the ODF track soon.</description>
		<content:encoded><![CDATA[<p>Although I pay the bills with technical endeavour, my personal bent is historical.  This post resonates strongly with me; further, I think it paints a strong picture of one of the strengths of ODF as a standard.</p>
<p>The idea of storing the &#8216;as published&#8217; form of the document using PDF as a part of the larger &#8216;document as work&#8217; is a crucial feature to ODF as an archival tool.  Great suggestion!  I hope it gets added to the ODF track soon.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Lucas</title>
		<link>http://www.robweir.com/blog/2007/11/pdf-waste-land-and-monicas-blue-dress.html#comment-1276</link>
		<dc:creator>Lucas</dc:creator>
		<pubDate>Sat, 24 Nov 2007 07:20:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.robweir.com/blog/2007/11/pdf-the-waste-land-and-monicas-blue-dress.html#comment-1276</guid>
		<description>Not that ODF vs. OOXML is in question here, but could one not do the same thing with OOXML?&lt;br/&gt;&lt;br/&gt;There are so many issues at work when it comes to &quot;archiving&quot;, I can&#039;t imagine any file format solution that is clearly any better than others. For example, are there any content recovery &quot;mechanisms&quot; inherent in any file formats? Suppose a file is saved/archived incorrectly because of a software bug, how easily can the content be recovered?&lt;br/&gt;&lt;br/&gt;How can scanned elements and the text documents be &quot;archived&quot; without keeping two copies? Legal documents with signatures  may need to be preserved with hand-written signatures but keep the text of the document in original form so as to remain searchable. One might argue, one can simply scan the signature page while keeping the original text document. However, this introduces: 1) two files, 2) legal question about whether signature corresponds to the legal text.</description>
		<content:encoded><![CDATA[<p>Not that ODF vs. OOXML is in question here, but could one not do the same thing with OOXML?</p>
<p>There are so many issues at work when it comes to &#8220;archiving&#8221;, I can&#8217;t imagine any file format solution that is clearly any better than others. For example, are there any content recovery &#8220;mechanisms&#8221; inherent in any file formats? Suppose a file is saved/archived incorrectly because of a software bug, how easily can the content be recovered?</p>
<p>How can scanned elements and the text documents be &#8220;archived&#8221; without keeping two copies? Legal documents with signatures  may need to be preserved with hand-written signatures but keep the text of the document in original form so as to remain searchable. One might argue, one can simply scan the signature page while keeping the original text document. However, this introduces: 1) two files, 2) legal question about whether signature corresponds to the legal text.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Uri</title>
		<link>http://www.robweir.com/blog/2007/11/pdf-waste-land-and-monicas-blue-dress.html#comment-1275</link>
		<dc:creator>Uri</dc:creator>
		<pubDate>Thu, 22 Nov 2007 20:58:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.robweir.com/blog/2007/11/pdf-the-waste-land-and-monicas-blue-dress.html#comment-1275</guid>
		<description>I can&#039;t really see an argument with two sides here. Using PDF exclusively for archiving sounds to me like a cookbook with only pictures of the finished dishes and no recipes, or an exhibition of photocopies of paintings. It&#039;s substituting a facsimile of a thing for the thing itself.&lt;br/&gt;If I need archived financial data, I expect it to be in a format that can be used for financial calculations. Archived computer source code should be readable by a compiler. If I need a graph from an academic paper, I&#039;d want to extract the full-resolution one from the original document, not the miniaturized version created for printing. A printed version of data is almost always useless compared to the data in its original format and container.&lt;br/&gt;No one is seriously suggesting that graphic design studios start to archive their Photoshop projects as JPEG images. Why should office documents be treated differently?</description>
		<content:encoded><![CDATA[<p>I can&#8217;t really see an argument with two sides here. Using PDF exclusively for archiving sounds to me like a cookbook with only pictures of the finished dishes and no recipes, or an exhibition of photocopies of paintings. It&#8217;s substituting a facsimile of a thing for the thing itself.<br />If I need archived financial data, I expect it to be in a format that can be used for financial calculations. Archived computer source code should be readable by a compiler. If I need a graph from an academic paper, I&#8217;d want to extract the full-resolution one from the original document, not the miniaturized version created for printing. A printed version of data is almost always useless compared to the data in its original format and container.<br />No one is seriously suggesting that graphic design studios start to archive their Photoshop projects as JPEG images. Why should office documents be treated differently?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rob</title>
		<link>http://www.robweir.com/blog/2007/11/pdf-waste-land-and-monicas-blue-dress.html#comment-1274</link>
		<dc:creator>Rob</dc:creator>
		<pubDate>Thu, 22 Nov 2007 15:39:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.robweir.com/blog/2007/11/pdf-the-waste-land-and-monicas-blue-dress.html#comment-1274</guid>
		<description>The idea would be that your &quot;original original&quot; would be a compound document that had both ODF and PDF markups.&lt;br/&gt;&lt;br/&gt;Also, the interest here could be beyond archiving.  It would be a document that could be viewed anywhere (presumption is PDF readers are free and ubiquitous) with perfect fidelity, as well as edited anywhere (assumption that ODF editors are free and available everywhere).</description>
		<content:encoded><![CDATA[<p>The idea would be that your &#8220;original original&#8221; would be a compound document that had both ODF and PDF markups.</p>
<p>Also, the interest here could be beyond archiving.  It would be a document that could be viewed anywhere (presumption is PDF readers are free and ubiquitous) with perfect fidelity, as well as edited anywhere (assumption that ODF editors are free and available everywhere).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anonymous</title>
		<link>http://www.robweir.com/blog/2007/11/pdf-waste-land-and-monicas-blue-dress.html#comment-1273</link>
		<dc:creator>Anonymous</dc:creator>
		<pubDate>Thu, 22 Nov 2007 12:04:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.robweir.com/blog/2007/11/pdf-the-waste-land-and-monicas-blue-dress.html#comment-1273</guid>
		<description>That won&#039;t work. When you archive originals, the legal requirement is often to keep the *original* original, and you can&#039;t expect that all the files you get handed to archive will have been prepared for you specially. Hence the pdf/a, odf pair would be kept separately.&lt;br/&gt;&lt;br/&gt;In fact sometimes more formats need to be kept: the original, the rendered document (maybe pdf, maybe open standards for non-paper content like audio, video), and occasionally an additional searchable format is needed (plain text ocr&#039;d from a tiff; a timestamped transcript of a video; etc).&lt;br/&gt;&lt;br/&gt;ODF is an improvement on the past, where we had to archive not just the document but the proprietary application that read it; but it&#039;s not going to solve this problem in the large. PDF/A isn&#039;t a panacea either, but it&#039;s a reasonable alternative to paper.</description>
		<content:encoded><![CDATA[<p>That won&#8217;t work. When you archive originals, the legal requirement is often to keep the *original* original, and you can&#8217;t expect that all the files you get handed to archive will have been prepared for you specially. Hence the pdf/a, odf pair would be kept separately.</p>
<p>In fact sometimes more formats need to be kept: the original, the rendered document (maybe pdf, maybe open standards for non-paper content like audio, video), and occasionally an additional searchable format is needed (plain text ocr&#8217;d from a tiff; a timestamped transcript of a video; etc).</p>
<p>ODF is an improvement on the past, where we had to archive not just the document but the proprietary application that read it; but it&#8217;s not going to solve this problem in the large. PDF/A isn&#8217;t a panacea either, but it&#8217;s a reasonable alternative to paper.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

