<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule"
	>
<channel>
	<title>Comments on: OOXML&#8217;s (Out of) Control Characters</title>
	<atom:link href="http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html/feed" rel="self" type="application/rss+xml" />
	<link>http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html</link>
	<description>Thinking the unthinkable, pondering the imponderable, effing the ineffable and scruting the inscrutable</description>
	<lastBuildDate>Wed, 10 Mar 2010 01:51:22 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Rob</title>
		<link>http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1878</link>
		<dc:creator>Rob</dc:creator>
		<pubDate>Tue, 06 May 2008 13:52:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1878</guid>
		<description>A word of advice.  If you find that using XML is &quot;mindboggling hard to use with completely arbitrary data packets&quot;, then maybe you shouldn&#039;t be using XML for that task. &lt;br/&gt;&lt;br/&gt;Just a thought.</description>
		<content:encoded><![CDATA[<p>A word of advice.  If you find that using XML is &#8220;mindboggling hard to use with completely arbitrary data packets&#8221;, then maybe you shouldn&#8217;t be using XML for that task. </p>
<p>Just a thought.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anonymous</title>
		<link>http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1874</link>
		<dc:creator>Anonymous</dc:creator>
		<pubDate>Tue, 06 May 2008 12:48:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1874</guid>
		<description>I wish you wouldn&#039;t repeat propaganda lines like the following:&lt;br/&gt;&lt;br/&gt;&quot;There is a reason XML excludes these dumb terminal control codes. They are neither desired nor necessary in XML.&quot;&lt;br/&gt;&lt;br/&gt;The inability of XML at a native level to store ALL characters (and NO, CDATA sections don&#039;t solve the problem) is a serious, serious weakness in XML that has resulted in a) lower adoption than it should have, b) multiple other standards being used instead, and c) multiple reimplementations of differing and incompatible ways to encode an arbitrary binary data packet at the application layer so it can be stored in XML. This is an absolute nightmare for a serialization and data exchange format.&lt;br/&gt;&lt;br/&gt;One of the most common frustrations I see in the development community relating to XML is that XML is mindboggling hard to use with completely arbitrary data packets. CDATA doesn&#039;t allow nesting, and XML as a whole doesn&#039;t allow a range of highly useful and important characters to be used in any way.&lt;br/&gt;&lt;br/&gt;Now, while I agree that the way MS has done this is pretty brain damaged, the claim that XML neither needs nor wants to support arbitrary chars is just BS that does not in any way match what I hear from developers wanting to use XML.&lt;br/&gt;&lt;br/&gt;An example: a simple website wants to produce an XML feed of notes entered by its users. Since these notes are user-entered, they can contain pretty much anything. There is NO safe way to deliver this data faithfully using XML without additional application logic to handle encoding special content.&lt;br/&gt;&lt;br/&gt;On this level XML is broken. And I think you undermine your basic argument by repeating such nonsense. And your basic argument is valid. So undermining it isn&#039;t all that clever.&lt;br/&gt;&lt;br/&gt;And YES, I do know the arguments against leaving them out. They were specious then and are specious now.</description>
		<content:encoded><![CDATA[<p>I wish you wouldn&#8217;t repeat propaganda lines like the following:</p>
<p>&#8220;There is a reason XML excludes these dumb terminal control codes. They are neither desired nor necessary in XML.&#8221;</p>
<p>The inability of XML at a native level to store ALL characters (and NO, CDATA sections don&#8217;t solve the problem) is a serious, serious weakness in XML that has resulted in a) lower adoption than it should have, b) multiple other standards being used instead, and c) multiple reimplementations of differing and incompatible ways to encode an arbitrary binary data packet at the application layer so it can be stored in XML. This is an absolute nightmare for a serialization and data exchange format.</p>
<p>One of the most common frustrations I see in the development community relating to XML is that XML is mindboggling hard to use with completely arbitrary data packets. CDATA doesn&#8217;t allow nesting, and XML as a whole doesn&#8217;t allow a range of highly useful and important characters to be used in any way.</p>
<p>Now, while I agree that the way MS has done this is pretty brain damaged, the claim that XML neither needs nor wants to support arbitrary chars is just BS that does not in any way match what I hear from developers wanting to use XML.</p>
<p>An example: a simple website wants to produce an XML feed of notes entered by its users. Since these notes are user-entered, they can contain pretty much anything. There is NO safe way to deliver this data faithfully using XML without additional application logic to handle encoding special content.</p>
<p>On this level XML is broken. And I think you undermine your basic argument by repeating such nonsense. And your basic argument is valid. So undermining it isn&#8217;t all that clever.</p>
<p>And YES, I do know the arguments against leaving them out. They were specious then and are specious now.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: cwitty</title>
		<link>http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1757</link>
		<dc:creator>cwitty</dc:creator>
		<pubDate>Thu, 27 Mar 2008 20:20:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1757</guid>
		<description>@anonymous: &lt;i&gt;Aren&#039;t some of these C0 characters printing characters in some fonts? I know they used to be in the original PCs. Maybe it&#039;s not so far-fetched that someone would use \u0008 in a document.&lt;/i&gt;&lt;br/&gt;&lt;br/&gt;If somebody has a document with an 08 in it, where the 08 is supposed to represent &quot;INVERSE BULLET&quot;, then according to &lt;a href=&quot;http://unicode.org/Public/MAPPINGS/VENDORS/MISC/IBMGRAPH.TXT&quot; rel=&quot;nofollow&quot;&gt;this table&lt;/a&gt;, this should be mapped to the Unicode character U+25D8.</description>
		<content:encoded><![CDATA[<p>@anonymous: <i>Aren&#8217;t some of these C0 characters printing characters in some fonts? I know they used to be in the original PCs. Maybe it&#8217;s not so far-fetched that someone would use \u0008 in a document.</i></p>
<p>If somebody has a document with an 08 in it, where the 08 is supposed to represent &#8220;INVERSE BULLET&#8221;, then according to <a href="http://unicode.org/Public/MAPPINGS/VENDORS/MISC/IBMGRAPH.TXT" rel="nofollow">this table</a>, this should be mapped to the Unicode character U+25D8.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anonymous</title>
		<link>http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1750</link>
		<dc:creator>Anonymous</dc:creator>
		<pubDate>Thu, 27 Mar 2008 08:00:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1750</guid>
		<description>&lt;i&gt;OOXML Obviously ain&#039;t XML&lt;/i&gt;&lt;br/&gt;&lt;br/&gt;Indeed, it should be called &lt;i&gt;OpenBIFF&lt;/i&gt;.&lt;br/&gt;&lt;br/&gt;Winter</description>
		<content:encoded><![CDATA[<p><i>OOXML Obviously ain&#8217;t XML</i></p>
<p>Indeed, it should be called <i>OpenBIFF</i>.</p>
<p>Winter</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anonymous</title>
		<link>http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1749</link>
		<dc:creator>Anonymous</dc:creator>
		<pubDate>Wed, 26 Mar 2008 22:50:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1749</guid>
		<description>&lt;i&gt;I was wrong. Microsoft has come up with yet another non-standard escape!&lt;/i&gt;&lt;br/&gt;There! You see how we innovate!&lt;br/&gt;&lt;br/&gt;Aren&#039;t some of these C0 characters printing characters in some fonts? I know they used to be in the original PCs. Maybe it&#039;s not so far-fetched that someone would use \u0008 in a document.</description>
		<content:encoded><![CDATA[<p><i>I was wrong. Microsoft has come up with yet another non-standard escape!</i><br />There! You see how we innovate!</p>
<p>Aren&#8217;t some of these C0 characters printing characters in some fonts? I know they used to be in the original PCs. Maybe it&#8217;s not so far-fetched that someone would use \u0008 in a document.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael S Collins</title>
		<link>http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1742</link>
		<dc:creator>Michael S Collins</dc:creator>
		<pubDate>Wed, 26 Mar 2008 01:56:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1742</guid>
		<description>Rob,&lt;br/&gt;Once again you&#039;ve done a great job of highlighting the &lt;i&gt;technical&lt;/i&gt; deficiencies of OOXML.  This is so much more effective than the pro/anti OOXML zealotry that is pervasive in both camps. The holes in the DIS 29500 spec that you&#039;ve brought out in the open ought to encourage all the NBs to disapprove this proposed standard until such time as MS/ECMA fix these glaring flaws. &lt;br/&gt;I used to wonder why MS was not content to let OOXML stand or fall on its technical merits - no more! Without the lobbying, ballot-stuffing, and subverting of the ISO processes, this &quot;standard&quot; wouldn&#039;t have made it out of Redmond. &lt;br/&gt;Keep up the good work!&lt;br/&gt;-MC</description>
		<content:encoded><![CDATA[<p>Rob,<br />Once again you&#8217;ve done a great job of highlighting the <i>technical</i> deficiencies of OOXML.  This is so much more effective than the pro/anti OOXML zealotry that is pervasive in both camps. The holes in the DIS 29500 spec that you&#8217;ve brought out in the open ought to encourage all the NBs to disapprove this proposed standard until such time as MS/ECMA fix these glaring flaws. <br />I used to wonder why MS was not content to let OOXML stand or fall on its technical merits &#8211; no more! Without the lobbying, ballot-stuffing, and subverting of the ISO processes, this &#8220;standard&#8221; wouldn&#8217;t have made it out of Redmond. <br />Keep up the good work!<br />-MC</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anonymous</title>
		<link>http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1741</link>
		<dc:creator>Anonymous</dc:creator>
		<pubDate>Tue, 25 Mar 2008 23:28:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1741</guid>
		<description>MS is trying to outdo Houdini in escape artistry.  OOXML should just be binary blobs between an open and close tag and spare us this nonsense.  They ARE blobs as this post shows.</description>
		<content:encoded><![CDATA[<p>MS is trying to outdo Houdini in escape artistry.  OOXML should just be binary blobs between an open and close tag and spare us this nonsense.  They ARE blobs as this post shows.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: billposer</title>
		<link>http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1740</link>
		<dc:creator>billposer</dc:creator>
		<pubDate>Tue, 25 Mar 2008 20:41:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1740</guid>
		<description>This is &lt;b&gt;soooo&lt;/b&gt; typical of Microsoft. Having become somewhat of a collector of ASCII escapes through the continued expansion of my programs uni2ascii/ascii2uni, which currently support 29 escape mechanisms, I thought I had seen them all. I was wrong. Microsoft has come up with yet another non-standard escape! They couldn&#039;t use U+XXXX like the Unicode Consortium, or &amp;#xXXXX; as in HTML, or \uXXXX as in several programming languages? They just had to add yet another one?!</description>
		<content:encoded><![CDATA[<p>This is <b>soooo</b> typical of Microsoft. Having become somewhat of a collector of ASCII escapes through the continued expansion of my programs uni2ascii/ascii2uni, which currently support 29 escape mechanisms, I thought I had seen them all. I was wrong. Microsoft has come up with yet another non-standard escape! They couldn&#8217;t use U+XXXX like the Unicode Consortium, or &amp;#xXXXX; as in HTML, or \uXXXX as in several programming languages? They just had to add yet another one?!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Vexorian</title>
		<link>http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1739</link>
		<dc:creator>Vexorian</dc:creator>
		<pubDate>Tue, 25 Mar 2008 15:58:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1739</guid>
		<description>I thought OOXML stood for &quot;optionally open XML&quot; but it looks to me it actually is a recursive acronym:&lt;br/&gt;&lt;br/&gt;OOXML Obviously ain&#039;t XML</description>
		<content:encoded><![CDATA[<p>I thought OOXML stood for &#8220;optionally open XML&#8221; but it looks to me it actually is a recursive acronym:</p>
<p>OOXML Obviously ain&#8217;t XML</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anonymous</title>
		<link>http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1735</link>
		<dc:creator>Anonymous</dc:creator>
		<pubDate>Tue, 25 Mar 2008 01:43:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1735</guid>
		<description>&lt;i&gt;OOXML also defines two additional types, “lpstr” (7.4.2.8) and “bstr” (7.4.2.4)&lt;/i&gt;&lt;br/&gt;&lt;br/&gt;I never read the OOXML documentation, but the &lt;i&gt;name&lt;/i&gt; of these types gave me a bad feeling. Aren&#039;t LPSTR and BSTR two Windows API string types? (BSTR being the COM UTF-16 counted string type and LPSTR being the C 8-bit &quot;ANSI&quot; zero-terminated string type.)&lt;br/&gt;&lt;br/&gt;It wouldn&#039;t surprise me if the source of the issues with these two string types is due to them attempting to serialize their native representation. In particular, BSTR can have embedded NULLs. Either way, it sounds like an implementation detail &quot;leaking&quot; into the specification.</description>
		<content:encoded><![CDATA[<p><i>OOXML also defines two additional types, “lpstr” (7.4.2.8) and “bstr” (7.4.2.4)</i></p>
<p>I never read the OOXML documentation, but the <i>name</i> of these types gave me a bad feeling. Aren&#8217;t LPSTR and BSTR two Windows API string types? (BSTR being the COM UTF-16 counted string type and LPSTR being the C 8-bit &#8220;ANSI&#8221; zero-terminated string type.)</p>
<p>It wouldn&#8217;t surprise me if the source of the issues with these two string types is due to them attempting to serialize their native representation. In particular, BSTR can have embedded NULLs. Either way, it sounds like an implementation detail &#8220;leaking&#8221; into the specification.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rob</title>
		<link>http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1733</link>
		<dc:creator>Rob</dc:creator>
		<pubDate>Mon, 24 Mar 2008 21:50:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1733</guid>
		<description>@steve,&lt;br/&gt;&lt;br/&gt;I raised a similar issue in the US NB review of OOXML last summer.  We did register one comment, US-0162, that pointed out that the &quot;bstr&quot; type lacked a mechanism to &quot;escape the escape&quot;, i.e., encode a literal value of _x0008_.  This was addressed by Ecma Response 118 at the BRM.  But it left untouched the other types with the same issue, like ST_Xstring and lpstr.  &lt;br/&gt;&lt;br/&gt;I would have reported ST_Xstring as a problem during the initial review, but we ran out of time.</description>
		<content:encoded><![CDATA[<p>@steve,</p>
<p>I raised a similar issue in the US NB review of OOXML last summer.  We did register one comment, US-0162, that pointed out that the &#8220;bstr&#8221; type lacked a mechanism to &#8220;escape the escape&#8221;, i.e., encode a literal value of _x0008_.  This was addressed by Ecma Response 118 at the BRM.  But it left untouched the other types with the same issue, like ST_Xstring and lpstr.  </p>
<p>I would have reported ST_Xstring as a problem during the initial review, but we ran out of time.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: steve_l</title>
		<link>http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1731</link>
		<dc:creator>steve_l</dc:creator>
		<pubDate>Mon, 24 Mar 2008 21:27:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1731</guid>
		<description>One problem with any escaping rule is how to handle double escapes, or when to unescape them. &lt;br/&gt;&lt;br/&gt;&lt;i&gt;Any XSL engine will still be able to handle escaped values in the text, just not unescape them&lt;/i&gt;. Sometimes that is good, sometimes it will be hopelessly bad. It depends entirely on the use.</description>
		<content:encoded><![CDATA[<p>One problem with any escaping rule is how to handle double escapes, or when to unescape them. </p>
<p><i>Any XSL engine will still be able to handle escaped values in the text, just not unescape them</i>. Sometimes that is good, sometimes it will be hopelessly bad. It depends entirely on the use.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anonymous</title>
		<link>http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1730</link>
		<dc:creator>Anonymous</dc:creator>
		<pubDate>Mon, 24 Mar 2008 21:12:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1730</guid>
		<description>4. define entities for OOXML like &amp;backspace; etc&lt;br/&gt;&lt;br/&gt;Then again, aren’t we beyond the stage of improving this POS?&lt;br/&gt;&lt;br/&gt;You’d think the combined might of Microsoft and ECMA would be able to produce something with fewer errors than normal specs, not the opposite…&lt;br/&gt;&lt;br/&gt;jd</description>
		<content:encoded><![CDATA[<p>4. define entities for OOXML like &amp;backspace; etc</p>
<p>Then again, aren’t we beyond the stage of improving this POS?</p>
<p>You’d think the combined might of Microsoft and ECMA would be able to produce something with fewer errors than normal specs, not the opposite…</p>
<p>jd</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anonymous</title>
		<link>http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1729</link>
		<dc:creator>Anonymous</dc:creator>
		<pubDate>Mon, 24 Mar 2008 20:58:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.robweir.com/blog/2008/03/ooxmls-out-of-control-characters.html#comment-1729</guid>
		<description>&lt;i&gt;By corrupting XML string values in the way that it does, DIS 29500 breaks the ability to have loosely coupled systems. Once the value space is polluted by these aberrant control characters, every application, every process that touches this data must be aware of their non-standard idiosyncrasies lest they crash or return incorrect answers.&lt;/i&gt;&lt;br/&gt;&lt;br/&gt;That&#039;s not a bug, Rob, it&#039;s a feature.</description>
		<content:encoded><![CDATA[<p><i>By corrupting XML string values in the way that it does, DIS 29500 breaks the ability to have loosely coupled systems. Once the value space is polluted by these aberrant control characters, every application, every process that touches this data must be aware of their non-standard idiosyncrasies lest they crash or return incorrect answers.</i></p>
<p>That&#8217;s not a bug, Rob, it&#8217;s a feature.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

