Today I look at Gwennel Doc, an ODF-based text editor for Microsoft Windows. An interesting attribute of Gwennel is its small size and its speed. It can load and display the 792-page ODF 1.2, Part 1 specification in around 2 seconds, using an executable that is around 1/4 the size of that document. Something interesting is going on here that needs investigation. I contacted the author of Gwennel Doc, Marc Kerbiquet, who consented to the following email interview. Enjoy!
Could you tell us a little bit about yourself, where you live and what you do for work? Are you a professional programmer? Or a hobbyist?
I live in France, I work as a professional programmer to pay the bills and I write programs as a hobbyist. Gwennel Doc is a hobby program.
What got you interested in writing a text editor? How did you pick the name “Gwennel”?
I first wrote a folding text editor for programmers (Code Browser), then I wrote an ODF viewer (Woodrat Reader), so an ODF editor was a natural continuation of this :-)
The initial goal was to make a folding/outlining editor like Code Browser with rich text. But it would have required a hybrid format to handle folding directives.
“Gwennel” means “swallow” in the Breton language, a small and fast bird. Breton is a language spoken in Brittany, a region in the north-west of France.
You call your tool a “WYSIWYM” (What You See is What you Mean) editor. How is this different than other editors, and how does Gwennel Doc support this style of editing?
It differs from the lightweight rich-text editors that work with RTF or equivalent formats because it supports styles. Styles let you separate presentation from content: you can tag a word as “menu item” or “keyword” instead of “Bold” or “Color-Red”, and decide later how it should be displayed.
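The separation described above can be sketched in a few lines of Python. This is only an illustration, not Gwennel Doc's data model: the `content` spans, the `styles` table and the `render` function are hypothetical names, and the HTML-like output is just a convenient way to make the presentation visible.

```python
# Hypothetical illustration: the content carries semantic style names only.
content = [
    ("Open the ", None),
    ("File", "menu item"),
    (" menu and type ", None),
    ("save", "keyword"),
    (".", None),
]

# Presentation lives in one table and can be changed later.
styles = {
    "menu item": {"bold": True},
    "keyword": {"color": "red"},
}


def render(content, styles):
    """Render the spans to an HTML-like string using the style table."""
    out = []
    for text, style_name in content:
        props = styles.get(style_name, {})  # unstyled spans get no properties
        if props.get("bold"):
            text = f"<b>{text}</b>"
        if "color" in props:
            text = f'<span style="color:{props["color"]}">{text}</span>'
        out.append(text)
    return "".join(out)


print(render(content, styles))
```

Changing `styles["keyword"]` to `{"bold": True}` changes how every keyword is displayed without touching `content`, which is the point of tagging text by meaning rather than by appearance.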
Word and OpenOffice allow WYSIWYM but they promote the WYSIWYG (What You See is What You Get) paradigm.
As your website describes, your intent was to make a text editor, not a full word processor. How do you define the boundary between these two? What features did you decide to omit?
The goal of a word processor is to produce a printed document. The paper is an important aspect:
- the page format
- the header and footer
- how paragraphs and tables must be split when the end of the page is reached
- footnotes
- the table of contents
- the index
Gwennel Doc is more of a note-taking program intended for on-screen reading, so it does not have to deal with all these features.
The Print command is not implemented yet, but it will be very basic.
What made you choose ODF as a document format?
- I had already worked with ODF (Woodrat Reader).
- I didn’t want to create a new format.
- As far as I know, there is no other open standard designed for editing that supports styles:
  - RTF: no support for styles
  - HTML + CSS: not intended for editing
  - OOXML: just a political standard, too complicated anyway
- Interoperability, even if limited:
  - it can partially read documents from other word processors (unsupported elements are just ignored)
  - OpenOffice can be used for all the missing features (print, export as PDF, …)
  - Gwennel Doc documents can be read without Gwennel Doc
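As a rough illustration of that last point, an ODF text document is a ZIP package whose body lives in a `content.xml` entry, so generic tools can pull the text out. The sketch below uses only the Python standard library; the function name `extract_paragraphs` is made up for this example, and it ignores styles, tables and everything else in the package.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the ODF text elements (text:p, text:h).
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def extract_paragraphs(odt):
    """Return the plain text of each heading and paragraph, in document order.

    `odt` is a file path or a binary file-like object holding the package.
    """
    with zipfile.ZipFile(odt) as package:
        root = ET.fromstring(package.read("content.xml"))
    wanted = {f"{{{TEXT_NS}}}h", f"{{{TEXT_NS}}}p"}
    # itertext() flattens nested spans, so inline formatting is dropped.
    return ["".join(el.itertext()) for el in root.iter() if el.tag in wanted]
```

Calling `extract_paragraphs("notes.odt")` on a document (the file name here is only a placeholder) returns a list of strings, one per heading or paragraph.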
How hard was it to support ODF in Gwennel? What was the hardest part?
Gwennel was designed from the start to work with ODF, so the application model, apart from the table styles, fits very well with ODF. Easy to load, easy to save.
A difficult part was understanding the ODF specification, but I had already done that when writing Woodrat Reader.
The hardest part was finding a way to implement table styles in Gwennel, as ODF has no support for table styles. I finally found a solution that stays compatible with ODF and keeps a minimum of interoperability. Unfortunately, table styles are lost when a Gwennel document is edited with OpenOffice.
One thing that strikes the user is how small and fast Gwennel is. It is less than 150KB in size and requires no install. Compared to other word processors, this is amazingly fast. How did you accomplish this? Can you give some details on your approach, such as what programming language you used, what ZIP and XML libraries you used, etc.? What is the secret to making a small, fast editor?
I care a lot about speed. Almost everything should be instantaneous on a computer that can execute billions of instructions per second.
Now, here is the secret :-)
First, the executable is 270K, not 150K.
I cheated: I used UPX to compress the executable.
For curious people, here is the detail of what’s inside:
- 50K – the zip library (zlib)
- 20K – the XML parser (AsmXML)
- 40K – core library (memory management, strings, lists, GUI layer)
- 70K – the rich edit component
- 24K – the (partial) ODT schema
- 64K – the main application code and resources (text, menu, icons)
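As a quick sanity check, the components listed above can be added up. The dictionary keys are just labels for this example; the numbers are the ones given in the list.

```python
# Sizes in kilobytes, as listed in the breakdown above.
components_kb = {
    "zip library (zlib)": 50,
    "XML parser (AsmXML)": 20,
    "core library": 40,
    "rich edit component": 70,
    "partial ODT schema": 24,
    "main application code and resources": 64,
}

total_kb = sum(components_kb.values())
print(total_kb)  # 268
```

The total of 268K is in line with the figure of about 270K quoted for the uncompressed executable.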
The operating system (Windows XP and better) provides all the remaining stuff (GDI for font and image rendering, GDI+ for image manipulation, …).
There is even unused code that I could remove to save a few kilobytes.
On the other hand, I plan to add a lot of pictures to better show the role of properties in styles; that should increase the size of the executable by 100K.
The program is written entirely in assembly (except for the zlib library).
It would take too long to explain why, but I didn’t choose assembly for speed (that may seem crazy, as any programmer would say that speed is the only reason to use assembly). Woodrat Reader is written in C++ and is a bit faster than Gwennel Doc at loading a document.
The most visible benefit of assembly is the size of the application, not the speed.
I use common optimization techniques:
- choosing the right algorithm and data model in the critical parts (e.g. a hash table instead of a simple list)
- optimizing access to memory
- caching data (to save computation)
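To make the first item concrete, here is a small Python sketch of the "hash table instead of a simple list" idea. It is not taken from Gwennel Doc, which is written in assembly; the style names and the helpers `find_in_list` and `build_index` are hypothetical.

```python
def find_in_list(pairs, name):
    """Linear scan: the cost grows with the number of entries."""
    for key, value in pairs:
        if key == name:
            return value
    return None


def build_index(pairs):
    """Build a hash table once; later lookups take constant time on average."""
    return dict(pairs)


# Hypothetical style table with many entries.
style_pairs = [(f"style-{i}", {"size_pt": 8 + i % 10}) for i in range(10_000)]
style_index = build_index(style_pairs)

# Both lookups return the same value; only the cost per lookup differs.
assert find_in_list(style_pairs, "style-9999") == style_index["style-9999"]
```

The list version walks up to 10,000 entries for every lookup, while the dictionary pays a one-time construction cost and then answers each lookup directly. In code that looks up a style for every word of a long document, that difference adds up.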
I don’t think that Gwennel is fast; it’s the other word processors that are slow.
Some reasons are common to software bloat found in most software:
- long history of development
- marketing considerations (spend more time to develop new features rather than optimizing)
Other reasons are more specific to word processing as word processors support a lot more features than Gwennel, for instance:
- font kerning makes the computation of the layout more complicated,
- computing the pagination in real time requires a lot of CPU (Gwennel uses just one infinite-length page)
Gwennel is written for modern machines with a modern OS. The font and image rendering is done entirely by the operating system, so it can take advantage of hardware acceleration. On the other hand, it is limited to the capabilities of the system libraries, and some features cannot be implemented (e.g. no control over the spacing between characters, and no outline effect).
Loading is fast but it could be faster. The time to load the file, unzip it, parse the XML and build the model is almost immediate even with big documents (1000 pages); most of the time is spent laying out the text by asking Windows for the width and height of each word. Windows is very good at this but it could be optimized: for instance, it shouldn’t be necessary to make a system call for every occurrence of the word “the” in 12pt Times New Roman, because the result will always be the same.
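The optimization suggested here amounts to caching measurements by font, size and word. The Python sketch below shows the idea under stated assumptions: `WidthCache` and `fake_measure` are invented for this example, and `fake_measure` is only a stand-in for the real system call, not how Windows computes text widths.

```python
class WidthCache:
    """Remember the measured width of each (font, size, word) combination."""

    def __init__(self, measure):
        self._measure = measure  # the expensive call, e.g. a system call
        self._cache = {}
        self.system_calls = 0    # how many times the expensive call was made

    def width(self, font, size_pt, word):
        key = (font, size_pt, word)
        if key not in self._cache:
            self.system_calls += 1
            self._cache[key] = self._measure(font, size_pt, word)
        return self._cache[key]


def fake_measure(font, size_pt, word):
    # Stand-in for the operating system's text measurement (an assumption).
    return len(word) * size_pt * 0.5


cache = WidthCache(fake_measure)
words = "the cat and the dog saw the bird".split()
widths = [cache.width("Times New Roman", 12, w) for w in words]
print(len(words), cache.system_calls)  # 8 6
```

Eight words are laid out but only six measurements are made, because “the” is measured once and reused. In a long document, where common words repeat thousands of times, the proportion of avoided calls would be much higher.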
Do you have any future plans for Gwennel?
The future plan for Gwennel Doc is to make it a ‘finished’ application:
- a Print command,
- a Find command,
- and minor goodies one can expect such as opening recent documents.
There is no plan to support more elements of ODF, but compliance still needs to be improved (online ODF validators are not very happy with documents created by Gwennel).