
An Antic Disposition


Archives for May 2011

Gwennel Doc: A Small and Fast ODF Text Editor

2011/05/26 By Rob 9 Comments

 

Today I look at Gwennel Doc, an ODF-based text editor for Microsoft Windows.  An interesting attribute of Gwennel is its small size and speed.  It can load and display the 792-page ODF 1.2, Part 1 specification in around 2 seconds, using an executable that is around 1/4 the size of that document.  Something interesting is going on here that needs investigation.  I contacted the author of Gwennel Doc, Marc Kerbiquet, who consented to the following email interview.  Enjoy!

Could you tell us a little bit about yourself, where you live and what you do for work?  Are you a professional programmer?  Or a hobbyist?

I live in France, I work as a professional programmer to pay the bills and I write programs as a hobbyist. Gwennel Doc is a hobby program.

What got you interested in writing a text editor?  How did you pick the name “Gwennel”?

I first wrote a folding text editor for programmers (Code Browser), then I wrote an ODF viewer (Woodrat Reader), so an ODF editor was a natural continuation of this :-)

The initial goal was to make a folding/outlining editor like Code Browser with rich text. But it would have required a hybrid format to handle folding directives.

“Gwennel” means “swallow” in the Breton language, a small and fast bird. Breton is a language spoken in Brittany, a region in the north-west of France.

You call your tool a “WYSIWYM”  (What You See is What you Mean) editor.  How is this different than other editors, and how does Gwennel Doc support this style of editing?

It is different from all the lightweight rich editors that edit RTF documents or equivalent formats because it supports styles. Styles allow you to separate presentation from content: you can tag a word as “menu item” or “keyword” instead of “Bold” or “Color-Red” and change later how it should be displayed.

Word and OpenOffice allow WYSIWYM but they promote the WYSIWYG (What You See is What You Get) paradigm.
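The separation of presentation and content described above can be shown in a few lines. This is only an illustrative sketch, not how Gwennel Doc stores documents: the style names come from the answer, while the properties, the `document` structure and the `render` helper are assumptions made up for the example.

```python
# Hypothetical style definitions: the names come from the interview,
# the properties are invented for illustration.
styles = {
    "menu item": {"bold": True},
    "keyword": {"color": "red"},
}

# Content refers to styles by name, never to formatting directly.
document = [
    ("Open the ", None),
    ("File", "menu item"),
    (" menu and look for ", None),
    ("ODF", "keyword"),
]

def render(document, styles):
    """Resolve each text run to (text, formatting properties)."""
    return [(text, styles.get(style, {})) for text, style in document]

before = render(document, styles)

# Changing the presentation later touches the style, not the content.
styles["keyword"] = {"italic": True}
after = render(document, styles)
```

With direct formatting ("Bold", "Color-Red"), the same change would mean finding and editing every affected run in the text.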

As your website describes, your intent was to make a text editor, not a full word processor.  How do you define the boundary between these two?  What features did you decide to omit?

The goal of a word processor is to produce a printed document. The paper is an important aspect:

  • the page format
  • the header and footer
  • how paragraphs and tables must be split when the end of the page is reached
  • footnotes
  • the table of contents
  • the index

Gwennel Doc is more of a note-taking program intended for on-screen reading, so it does not have to deal with all these features.

The Print command is not implemented yet, but it will be very basic.

What made you choose ODF as a document format?

  • I already worked with ODF before (Woodrat Reader)
  • I didn’t want to create a new format.
  • As far as I know, there is no other open standard designed for editing and supporting styles:
    • RTF: no support for styles
    • HTML + CSS: not intended for editing
    • OOXML: just a political standard, too complicated anyway
  • Interoperability, even if limited:
    • it can partially read documents from other word processors (unsupported elements are just ignored)
    • OpenOffice can be used for all missing features (print, export as PDF, …)
    • Gwennel Doc documents can be read without Gwennel Doc

How hard was it to support ODF in Gwennel? What was the hardest part?

Gwennel was designed from the start to work with ODF, so the application model, apart from the table styles, fits very well with ODF. Easy to load, easy to save.

A difficult part was understanding the ODF specification, but I had already done that when writing Woodrat Reader.

The hardest part was finding a way to implement table styles in Gwennel, as ODF has no support for table styles. I finally found a solution that stays compatible with ODF and keeps a minimum of interoperability.  Unfortunately, table styles are lost when a Gwennel document is edited with OpenOffice.
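To picture what "easy to load" involves: an ODF text document is a ZIP package whose `content.xml` holds the text. The sketch below uses Python's standard library, not the zlib and AsmXML code that Gwennel itself uses. The sample package is a stripped-down stand-in built in memory (a real ODT also carries a mimetype entry, a manifest and styles), and `make_sample_package` and `load_paragraphs` are hypothetical helper names.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"

# Minimal stand-in for content.xml; not a complete, valid ODT.
SAMPLE_CONTENT = f"""<?xml version="1.0" encoding="UTF-8"?>
<office:document-content xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">
  <office:body>
    <office:text>
      <text:h text:style-name="Heading_20_1" text:outline-level="1">Title</text:h>
      <text:p text:style-name="Text_20_body">First paragraph.</text:p>
      <text:p text:style-name="Text_20_body">Second paragraph.</text:p>
    </office:text>
  </office:body>
</office:document-content>
"""

def make_sample_package():
    """Build a tiny ZIP package in memory containing only content.xml."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("content.xml", SAMPLE_CONTENT)
    return buf.getvalue()

def load_paragraphs(package_bytes):
    """Unzip, parse the XML, and build a list of (style name, text)."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        root = ET.fromstring(zf.read("content.xml"))
    model = []
    for elem in root.iter():
        if elem.tag in (f"{{{TEXT_NS}}}p", f"{{{TEXT_NS}}}h"):
            style = elem.get(f"{{{TEXT_NS}}}style-name")
            model.append((style, "".join(elem.itertext())))
    return model

paragraphs = load_paragraphs(make_sample_package())
```

Each paragraph comes back with the name of its style rather than with direct formatting, which is the part of ODF that maps naturally onto a style-based editor.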

One thing that strikes the user is how small and fast Gwennel is.  It is less than 150KB in size and requires no install.   Compared to other word processors, this is amazingly fast.  How did you accomplish this? Can you give some details on your approach, such as what programming language you used, what ZIP and XML libraries you used, etc.?  What is the secret to making a small, fast editor?

I care a lot about speed. Almost everything should be instantaneous on a computer that can execute billions of instructions per second.

Now, here is the secret :-)

First, the executable is 270K, not 150K.

I cheated: I used UPX to compress the executable.

For curious people, here are the details of what’s inside:

  • 50K – the zip library (zlib)
  • 20K – the XML parser (AsmXML)
  • 40K – core library (memory management, strings, lists, GUI layer)
  • 70K – the rich edit component
  • 24K – the (partial) ODT schema
  • 64K – the main application code and resources (text, menu, icons)
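As a quick arithmetic check, the figures in this list can be added up; the dictionary keys are just labels taken from the list.

```python
# Component sizes in kilobytes, as listed in the interview.
components_kb = {
    "zip library (zlib)": 50,
    "XML parser (AsmXML)": 20,
    "core library": 40,
    "rich edit component": 70,
    "ODT schema (partial)": 24,
    "application code and resources": 64,
}

total_kb = sum(components_kb.values())
print(total_kb)  # 268
```

The parts come to 268K, consistent with the roughly 270K quoted for the uncompressed executable.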

The operating system (Windows XP and later) provides all the remaining stuff (GDI for font and image rendering, GDI+ for image manipulation, …).

There is even unused code that I could remove to save a few kilobytes.

On the other hand, I plan to add a lot of pictures to better show the role of properties in styles, which should increase the size of the executable by 100K.

The program is written entirely in assembly (except for the zlib library).

It would take too long to explain why, but I didn’t choose assembly for speed (it may seem crazy, as any programmer would say that the only reason to use assembly is speed). Woodrat Reader is a bit faster than Gwennel Doc at loading a document, and it is written in C++.

The most visible benefit of assembly is the size of the application, not the speed.

I use common optimization techniques:

  • choosing the right algorithm and data model in the critical parts (e.g. a hash table instead of a simple list)
  • optimizing access to memory
  • caching data (to save computation)
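The first point, a hash table instead of a simple list, can be shown with a small lookup. This is a generic sketch, not Gwennel's code: the style names and the `find_in_list` helper are invented for the example, and Python's `dict` stands in for the hash table.

```python
def find_in_list(pairs, name):
    """Linear scan: returns (value, number of comparisons made)."""
    comparisons = 0
    for key, value in pairs:
        comparisons += 1
        if key == name:
            return value, comparisons
    return None, comparisons

# 10,000 hypothetical named entries stored both ways.
pairs = [(f"style{i}", i) for i in range(10_000)]
table = dict(pairs)  # hash table keyed by name

value, comparisons = find_in_list(pairs, "style9999")
print(value, comparisons)   # 9999 10000
print(table["style9999"])   # 9999
```

Both lookups return the same value, but the list scan walks all 10,000 entries to reach the last one, while the hash table finds it in roughly constant time on average.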

I don’t think that Gwennel is fast; it’s the other word processors that are slow.

Some reasons are common to software bloat found in most software:

  • long history of development
  • marketing considerations (more time is spent developing new features than optimizing)

Other reasons are more specific to word processing as word processors support a lot more features than Gwennel, for instance:

  • font kerning makes the computation of the layout more complicated,
  • computing the paging in realtime requires a lot of CPU (Gwennel uses just one infinite-length page)

Gwennel is written for modern machines with a modern OS. The font and image rendering is done entirely by the operating system, so it can take advantage of hardware acceleration. On the other hand, it is limited to the capabilities of the system library, and some features cannot be implemented (e.g. no control over the spacing between characters and no outline effect).

Loading is fast but it could be faster. The time to load the file, unzip it, parse the XML and build the model is almost immediate even with big documents (1000 pages); most of the time is spent laying out the text by asking Windows for the width and height of each word. Windows is very good at this, but it could be optimized: for instance, it shouldn’t be necessary to make a system call for every occurrence of the word “the” in 12pt Times New Roman, because the result will always be the same.
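The optimization suggested here amounts to caching measurements by word, font and size. The sketch below shows the idea only. The interview does not say Gwennel does this, `measure_word` is a made-up stand-in for the real system call, and its fake metric has nothing to do with actual font widths.

```python
from functools import lru_cache

calls = 0  # counts how often the "expensive" measurement runs

def measure_word(word, font, size_pt):
    """Stand-in for an expensive system call; the metric is fake."""
    global calls
    calls += 1
    return len(word) * size_pt * 0.5

@lru_cache(maxsize=None)
def cached_width(word, font, size_pt):
    """Measure each distinct (word, font, size) combination only once."""
    return measure_word(word, font, size_pt)

text = "the cat saw the dog chase the bird".split()
widths = [cached_width(w, "Times New Roman", 12) for w in text]

print(len(text), calls)  # 8 6
```

Eight words are laid out but only six measurements are made, because the three occurrences of "the" share one cached result. In a long document the saving would depend on how often words repeat in the same font and size.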

Do you have any future plans for Gwennel?

The future plan for Gwennel Doc is to make it a ‘finished’ application:

  • a Print command,
  • a Find command,
  • and minor goodies one can expect such as opening recent documents.

There is no plan to support more elements of ODF, but compliance still needs to be improved (online ODF validators are not very happy with documents created by Gwennel).


Filed Under: ODF

PJ, Goodbye and Good Luck

2011/05/10 By Rob 7 Comments

There was a time when daggers were drawn on Linux and its demise was plotted in dark detail.  At that hour stepped out a shieldmaiden with a blog, and that blog was Groklaw.   Eight years later, we hear the news that Groklaw will cease new postings after May 16th.  My sadness in hearing this news is more than equaled by my gratitude to PJ and her community of researchers and commentators, for their enormous effort and unparalleled achievement over these years.   The world is a better place because of PJ.  Who can hope to say better?

As a retrospective of a different kind, I’ve taken the titles from every Groklaw article since its start and created a “word cloud” from them, using Wordle.  This shows, at a glance, the issues that have dominated the attention of Groklaw over the years.


Filed Under: Open Source

Ten Things You Didn’t Know About ODF 1.2

2011/05/05 By Rob 6 Comments

Some little known facts, all of them true, but only some of them amusing, and even then only just so, about ODF 1.2, recently approved as a Committee Specification by the OASIS ODF TC:

  1. In producing OASIS ODF 1.2, we had 184 Technical Committee meetings, not including the numerous subcommittee meetings.
  2. During the development of ODF 1.2, the active TC membership grew by 78%.
  3. The ODF TC, during the ODF 1.2 work, had 76 members, from 17 countries, representing 23 companies or organizations, as well as 17 individual members.  The sun never sets on the ODF TC.
  4. ODF TC members received 14,655 emails from the TC’s email list while working on ODF 1.2, including 474 notes with a post-script (PS), 113 with a post-post-script (PPS) and 13 with a post-post-post-script (PPPS), suggesting a new phrase for derangement:  “going postscript”.
  5. ODF 1.2 has been out for public review for a total of 210 days.
  6. The ODF TC resolved 1,822 public comments while working on ODF 1.2.  We read every one of them.
  7. ODF 1.2 says “shall” 628 times, but says “please” only 14 times, making it one of the most discourteous specifications around.
  8. ODF 1.2 has 72 external normative references and 16 external non-normative references.
  9. If you printed out all of ODF 1.2 and laid the pages end-to-end, it would be approximately 20% taller than the Eiffel Tower.  You would also probably be arrested.
  10. ODF 1.2’s OpenFormula knows how many imperial pints will fill a cubic light year.  But please, drink only in moderation.

Filed Under: ODF


Copyright © 2006-2023 Rob Weir · Site Policies

 
