{"id":178,"date":"2008-06-26T07:50:00","date_gmt":"2008-06-26T11:50:00","guid":{"rendered":"http:\/\/2d823b65bb.nxcli.io\/2008\/06\/beautiful-word-clouds.html"},"modified":"2016-10-03T18:34:20","modified_gmt":"2016-10-03T22:34:20","slug":"beautiful-word-clouds","status":"publish","type":"post","link":"https:\/\/www.robweir.com\/blog\/2008\/06\/beautiful-word-clouds.html","title":{"rendered":"Beautiful Word Clouds"},"content":{"rendered":"<p>We&#8217;ve all seen <a href=\"http:\/\/en.wikipedia.org\/wiki\/Tag_clouds\">tag clouds<\/a> by now, the visualization technique that shows the importance (however defined, but typically by prevalence) of a word by assigning a proportionately sized font.<\/p>\n<p>But now comes along a tool that treats these clouds as art. <a href=\"http:\/\/wordle.net\/\">Wordle&#8217;s<\/a> &#8220;Beautiful Word Clouds&#8221; is quite addictive, allowing you to enter the raw text and then play around with layout algorithms, fonts and coloring schemes to produce some very nice looking clouds. The author \u2014 <a href=\"http:\/\/blog.wordle.net\/\">Jonathan Feinberg <\/a>\u2014 works here at IBM, a fact I did not discover until I had already wasted hours playing with the tool. So maybe I can count this as work now?<\/p>\n<p>Here are a few examples of word clouds formed by analyzing three different texts. Can you guess the identity of the three texts?<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/2d823b65bb.nxcli.io\/blog\/images\/moby_cloud.png\" alt=\"\" \/><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/2d823b65bb.nxcli.io\/blog\/images\/sonnets_cloud.png\" alt=\"\" \/><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/2d823b65bb.nxcli.io\/blog\/images\/rob_cloud.png\" alt=\"\" \/><\/p>\n<p>Some of my wish-list items are:<\/p>\n<ul>\n<li>Apply a <a class=\"zem_slink\" title=\"Stemming\" href=\"http:\/\/en.wikipedia.org\/wiki\/Stemming\" rel=\"wikipedia\">stemming<\/a> algorithm to conflate words with the same root. 
So in the last example, &#8220;standard&#8221; and &#8220;standards&#8221; are counted separately, when they are probably best counted as the same word.<\/li>\n<li>Auto-generate an image map associated with the cloud<\/li>\n<li>Export to PNG (even if just written temporarily to the server, I can download it from there)<\/li>\n<li>I&#8217;d love to read a paper on how the layout algorithm works<\/li>\n<li>What would happen if you combined Kohonen <a class=\"zem_slink\" title=\"Self-organizing map\" href=\"http:\/\/en.wikipedia.org\/wiki\/Self-organizing_map\" rel=\"wikipedia\">self-organizing maps<\/a> with word clouds? Arrange the words so their proximity in the cloud is correlated with co-occurrence in the text.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>We&#8217;ve all seen tag clouds by now, the visualization technique that shows the importance (however defined, but typically by prevalence) of a word by assigning a proportionately sized font. But now comes along a tool that treats these clouds as art. 
Wordle&#8217;s &#8220;Beautiful Word Clouds&#8221; is quite addictive, allowing you to enter the raw text [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_genesis_hide_title":false,"_genesis_hide_breadcrumbs":false,"_genesis_hide_singular_image":false,"_genesis_hide_footer_widgets":false,"_genesis_custom_body_class":"","_genesis_custom_post_class":"","_genesis_layout":"","footnotes":""},"categories":[155,198],"tags":[159,160,156,157,158],"class_list":{"0":"post-178","1":"post","2":"type-post","3":"status-publish","4":"format-standard","6":"category-language","7":"category-popular","8":"tag-moby-dick","9":"tag-shakespeare","10":"tag-tag-clouds","11":"tag-word-clouds","12":"tag-wordle","13":"entry"},"_links":{"self":[{"href":"https:\/\/www.robweir.com\/blog\/wp-json\/wp\/v2\/posts\/178","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.robweir.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.robweir.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.robweir.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.robweir.com\/blog\/wp-json\/wp\/v2\/comments?post=178"}],"version-history":[{"count":4,"href":"https:\/\/www.robweir.com\/blog\/wp-json\/wp\/v2\/posts\/178\/revisions"}],"predecessor-version":[{"id":2507,"href":"https:\/\/www.robweir.com\/blog\/wp-json\/wp\/v2\/posts\/178\/revisions\/2507"}],"wp:attachment":[{"href":"https:\/\/www.robweir.com\/blog\/wp-json\/wp\/v2\/media?parent=178"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.robweir.com\/blog\/wp-json\/wp\/v2\/categories?post=178"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.robweir.com\/blog\/wp-json\/wp\/v2\/tags?post=178"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}