Take pride in your eBook formatting (Part III)
This is the third installment of a series of articles. To read the previous one, please click here
The Road to Right
After spending much of my last installment telling you how you should not create an eBook, I will no longer hold you back with explanations of Wrong. Instead, we will look ahead, down the road of Right. Let’s start with a quick overview of the process I am proposing, just so you get a general idea of what you’re getting yourself into. Depending on your level of expertise, this may or may not seem intimidating at first, but let me assure you that there is no magic involved and every task can be performed by virtually anyone familiar with a computer. Remember, the key lies, as so often, in getting the right tools for the job and putting them to work for you.
The majority of ebook formats in use today are nothing more than a packaged collection of HTML files. Yes, the same kind of files used to create and display web pages. Surprised? You shouldn’t be. It actually makes a lot of sense. HTML has been created to allow information display on a wide variety of display devices regardless of their capabilities. Whether your computer monitor has a high or a low resolution, whether you are running your browser fullscreen or in a small window, on an old or a new computer, basic HTML pages will always be able to display properly in all these environments.
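To make the "packaged collection of HTML files" idea concrete, here is a minimal Python sketch. It relies on the fact that EPUB, one of the common eBook formats, is a ZIP archive. The helper names and the file names inside the toy archive are made up for illustration, and the toy archive is deliberately not a valid EPUB; it only mimics the general layout.

```python
import io
import zipfile


def html_members(archive):
    """Return the names of the HTML/XHTML files inside a ZIP-based eBook."""
    with zipfile.ZipFile(archive) as zf:
        return [name for name in zf.namelist()
                if name.lower().endswith((".html", ".xhtml", ".htm"))]


def make_toy_package():
    """Build a tiny in-memory ZIP that mimics the idea (NOT a valid EPUB)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        # Hypothetical contents, chosen only for this illustration.
        zf.writestr("mimetype", "application/epub+zip")
        zf.writestr("chapter1.xhtml",
                    "<html><body><p>Once upon a time...</p></body></html>")
        zf.writestr("chapter2.xhtml",
                    "<html><body><p>The end.</p></body></html>")
        zf.writestr("cover.jpg", b"")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Lists the HTML-like members of the toy package.
    print(html_members(make_toy_package()))
```

Pointing html_members at the path of a real, DRM-free EPUB file should list its chapter files in the same way.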
Since we don’t know what device or software readers will use to display our eBooks, it only makes sense to utilize a format that was designed for that very purpose, doesn’t it? A format that has free text reflow capabilities and can easily embed images and other media. You might recall how I told you that you can actually embed video in your eBooks if you want to, and now you know why.
HTML is a format perfectly suitable for the needs of the eBook community and all it really lacks is digital rights management, or copy protection to put it in plain old English. To accommodate that, some of the eBook formats are encrypted internally, but that is really none of our concern at this point. Let other people worry about that. We just want to package our book in a digital format that can be used by eReaders for the time being.
Among recording musicians we have a saying that is very suitable for our cause: Garbage in, garbage out! It means that when the source you are recording is garbage, your end result will inevitably be garbage also. There is just no way to make a bad source signal good. The synthesized vocals of current-generation pop stars are living proof of that.
Since we know that our end result is going to be an HTML file, the best way to avoid garbage along the way is to choose a source format that is as close to the output format as possible. So, if the output is HTML, why not make the source format HTML also? HTML is a very simple markup language that is so basic and, more importantly, so widely documented today that anyone can pick up the basics in under 30 minutes. In fact, many of you may already be familiar with the general basics of markup languages from styling your message board posts or maybe even creating your own web pages and blog posts.
To put it very bluntly: if we create an HTML file as the source for our eBooks, the end result will be every bit as reliable as the HTML file we initially created. Makes sense, doesn’t it? And that is really all there is to it. That is the secret to creating professional-looking eBooks. You take the contents of your book, prepare them as a first-rate HTML file, and run that file through packaging software that builds the final eBook for you. Yes, it really is THAT simple!
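To give a feel for what a clean HTML source file can look like, here is a small Python sketch that wraps plain paragraphs in basic markup. It is an illustration under my own assumptions, not the workflow this series goes on to describe: the function name build_html and the exact page skeleton are hypothetical.

```python
from html import escape


def build_html(title, paragraphs):
    """Wrap plain-text paragraphs in a minimal, clean HTML page."""
    # Escape characters such as & and < so they cannot break the markup.
    body = "\n".join(f"    <p>{escape(text)}</p>" for text in paragraphs)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "  <head>\n"
        '    <meta charset="utf-8">\n'
        f"    <title>{escape(title)}</title>\n"
        "  </head>\n"
        "  <body>\n"
        f"    <h1>{escape(title)}</h1>\n"
        f"{body}\n"
        "  </body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    print(build_html("My Book", ["Tom & Jerry went out.", "The end."]))
```

Each paragraph ends up in exactly one p element and nothing else sneaks into the file, which is the kind of predictability a packaging tool benefits from.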
I would be remiss, however, to leave things at that. I promised to show you exactly how to do it, and I will. To make sure you are not getting stressed out at this point, let me repeat our key mantra once again.
The right tools are critical for an easy workflow.
Get the right tools for the job and you’ll be hitting home runs in no time. You will be a much happier human being and you will have much more time on your hands to enjoy other things in life. With that in mind, let me run you through some of the basic tools we will use in the next installments; tools that will help us achieve the perfect eBook formatting we so desire.
I don’t know about you, but I’m a Mac-head. I decided long ago that my time is too precious to waste on computers and operating systems that don’t work properly and turn into utter time sinks. As a result, I am an Apple Mac user, plain and simple. As I said: get the right tool for the job.
While I highly recommend that you use a Mac also (you will see your productivity go through the roof for one thing, I promise), this does not mean that you really need one. Everything we do on the following pages can be done on a Windows computer also, so do not worry.
At this point, let us assume that you have completed the manuscript for your book and have it entirely committed to a single word processor document. Needless to say, you will need a basic word processor to open, read and massage the file, but once again, I assume that as a writer, you do have that.
What you will also need, ideally, is a piece of software called a programming editor. I personally use TextMate (http://macromates.com), but there are numerous other editors available on the internet that will serve the purpose just fine, some of them paid software, others free. jEdit (http://www.jedit.org), for example, is a free programming editor that is available for Windows, Mac and Linux and will definitely do the job nicely.
In addition, we will be using Calibre (http://calibre-ebook.com) for the final creation of the most common eBook formats. Calibre is a free software package that runs on Windows, Mac and Linux.
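Calibre also ships with a command-line converter called ebook-convert, so the packaging step can be scripted. The Python sketch below simply assembles and runs that command. The file names are placeholders, the function names are my own, and it assumes Calibre is installed with ebook-convert available on your PATH.

```python
import shutil
import subprocess


def conversion_command(source_html, output_file):
    """Build the ebook-convert call: input file first, output file second."""
    return ["ebook-convert", source_html, output_file]


def convert(source_html, output_file):
    """Run the conversion, failing early if Calibre's tool is not available."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError(
            "ebook-convert not found; is Calibre installed and on PATH?")
    subprocess.run(conversion_command(source_html, output_file), check=True)


if __name__ == "__main__":
    # Placeholder file names; ebook-convert picks the output format
    # from the extension of the output file.
    convert("book.html", "book.epub")
```

Doing the same conversion through Calibre's graphical interface gives the same kind of result; the script is only a convenience if you convert often.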
In the next installment we will take a closer look at some of the features of HTML that we will need to whip our eBooks into shape and how they impact how we will create our eBook source file.
Also, don’t forget to check out my book Zen of eBook Formatting that is filled with tips, techniques and valuable information about the eBook formatting process.
For a while, I was concerned that by not finishing in Word and doing my finished file in HTML, I was cutting my own throat. Turns out, it’s the right thing to do.
I started out in summer 09 hand tweaking in html after stripping out extraneous files from the Word document — going by Joshua Tallent’s eBook. Very time consuming but, yes, I did end up with a very clean file that uploaded beautifully to Amazon dtp, thank you very much.
Looking forward to reading the rest of these blog posts and seeing how this story and process unfolds.
Your advice about avoiding word processors will make sense for most users, but if used correctly and rigorously, products like Word are very well designed to prepare the content for an eBook.
A word processor can be an excellent tool, but like any tool, the key is to use it properly — and improper use will inevitably lead to problems.
Your Word example in Part II made me cringe: anyone who turns off visibility of spaces, tabs, returns, etc. is NOT using Word effectively. The example did illustrate the issue, but missed the opportunity to make the point that proper use of styles is the correct method to manage formatting in Word. Unfortunately, Word has always been sold as being “easy to use” and 3rd party training usually avoids the useful stuff within it.
When all formatting for a book is managed with styles (and not with lazy direct formatting), it is fairly easy to convert a manuscript to HTML — or any other format you might need. Moreover, new versions of Word can export very clean HTML directly.
Having said that, I realize most users don’t have a clue about styles, and typically use only the most minimal set of Word’s features. My advice to authors who already have Word (or another similar WP product) would be to take the time to learn how to use the tool more effectively to prepare their work — and THEN use the very good tips here to massage it into the form needed for whatever eBook publisher is selected.
Eric, you are absolutely correct, but from my experience, even users who are familiar with the proper usage of word processors will run into problems because hardly anyone has the discipline necessary to stick with the proper styles etc. at all times. It is just too easy to introduce problem areas in a document without even noticing it.
In addition, I have not seen Word do any decent HTML export. It is not even remotely close to what a clean HTML file should/could look like, though it has become better than it used to be.
I always find it so much easier and faster to simply take a word processor file and turn it into a clean HTML file the way I describe it. At least for novel-style books, there is nothing that can beat that. It is a process I do every day for the many client projects I handle, and it allows me to turn even the most messed-up Word document into a clean eBook that behaves exactly as it should.
That has been my experience too, but I just wanted to point out that a word processor is the best tool for *preparing* the content — and if used properly, will probably be the best format to maintain the work. Word provides a number of “under-the-hood” tools to catch and correct inconsistencies: the blue formatting markers; list styles with paragraph and font formatting; outline view, etc.
Like you, I apply clean HTML via coding rather than use the Save As options. Word’s VBA is seldom suitable for casual users, but is ideal for re-purposing a document. I have a collection of code to convert documents to clean HTML in much the same way you describe — and all run directly from Word. (Some of your examples use regular expressions in an editor, but note that this capability is also available in Word via the “wildcard” options in Find and Replace.)
Novel-like documents are pretty straightforward because there are usually only a few styles to deal with. However, most of my work has been very complex, with tables, formulae, indexes — and usually multiple languages. Getting these kinds of documents to useful HTML is significantly more challenging!
Hey Guido, great series. I’m only through part III but I intend on reading the entire thing. I’m brand new to the ebook world and am looking forward to epublishing my first work of fiction after years of technical writing in the corporate world. As an “expert” in the use of Word for corporate documentation, it was quite an eye-opener to me to learn that it is not the ideal tool for ebook submission. Thanks for the tip – it does make perfect sense when you think about it. I wish there were more resources like this available to those who are new to ebooks (like me). Thanks and keep up the good work.
First off, I’ve been looking for something like this for a long time now. It seems like the only ‘solutions’ one can find on the internet are people looking to make a quick buck for work that may not even be very good.
I do have one question related to your enthusiasm for the Mac – have you spent much time working with Linux on such a project? I have found Linux systems and software to be much less feature-bogged than comparable Windows programs. If so, I’d love to hear whatever particular insights you may have on that topic.
No, I do not have much experience with Linux. I’m running an Ubuntu file server here and I am not overly fond of it. It works, but I cringe every time I have to log in and do something on it because all processes seem to be overly tedious and counter-intuitive.
I’m an editor, not a book designer, but a client asked me to convert even after I referred him elsewhere. Boy, was I sorry. I guess I went backwards. After I changed styles and set headings in Word (with the show formatting tool turned on to make sure the whole book was consistent), I let Calibre convert the RTF file. Neither section breaks nor page breaks made the ebook pages show on the next page. Then I used Sigil and tried about 5 ways of making the copyright page show separate from the title page, the chapters start on new pages, and the chapter titles (all h1) centered. I even labeled the files as Preface, Foreword, chapter…, etc. Every time I went back to Calibre, it removed whatever I had done in Sigil. Something does not compute. I can barely recognize HTML code, and my biggest triumph was making the book title bigger after Calibre downsized it when converting the original RTF. I finally stopped letting Calibre do anything but read the book I made in Sigil. I got one with pages that turned where they should, but the centered titles looked slightly too far right, and I’m guessing somehow the .2 paragraph indent for the rest of the text affected those even though I removed the indent in Word before I saved as RTF, converted via Calibre, and then opened in Sigil. I hope that and a blank page between two separate chapters are the only problems.
Well, what can I say? You should have followed my tutorial; then you would not have any of these problems. 🙂
I’m old, this is scary. Thank you for your guidance.
Hi, nice tutorial, however there is one flaw as I see it. The HTML language is quite good for typesetting text (which is just fine for most writers). However, what would be best for math typesetting? Using pictures for formulas is not an answer for me. Let’s say I have at least 5 formulas per page (but usually more); using pictures is slow, not to mention creating and managing the pictures.
Personally I use pdflatex, which is based on TeX. In fact, the best typesetting for all non-standard characters (math, sheet music, etc.) is TeX for me. It generates PDF or other formats well. I wonder what’s your opinion on TeX.
Wow, thanks for this. It makes so many things more clear, especially about Word. I don’t actually use a word processor – I write in bbedit – but I was forced into a class in college where we learned about styles in Word, and they always seemed like the most tedious and time consuming way of getting anything done, but when I think of them being more like CSS, they make so much more sense.
I use the styles for Word and know how to make it clean. The problem for me is that each version of Word is different on different computers, and that makes it difficult to know how to do what I know I need to do because the versions are just different enough to confuse and frustrate me. Now I bought an iMac and bought Word for Mac, and it looks totally different – so much that I am still writing my book on Windows 8, but that upgrade on Windows 8 has made it more difficult to use the newer version of Word that I bought for the new computer. So now I’m reading this trying to figure out what I should do. I also am constantly frustrated with Word. I save the book settings for the file, but I can’t seem to get those settings to default so they work with future files. It just goes on and on. I know what needs to be done, but I technically just don’t know how to do it.
Isn’t using Microsoft software on a Mac kind of defeating the purpose of using a Mac in the first place? I’ve been using Apple Pages for many years and it has served me well universally. It costs a fraction of Word, is ten times faster and leaner, and comes without the weird glitches that Word constantly exhibits.
I just wanted to let everyone know that I have just published a book called “Zen of eBook Formatting” that is now available. It covers the aspects from this tutorial in a lot more detail and also adds a whole lot of additional info, details and advanced techniques to the mix. Here is a link with a bit more info, including a look at the Table of Contents of the book.
http://guidohenkel.com/2014/05/zen-of-ebook-formatting-is-now-available/
I just wanted to let you know that a revised, second edition of my book “Zen of eBook Formatting” is now available on Amazon. Unfortunately Amazon makes it a bit tricky for people who own the original to get the new one, but if you send them an email, it is my understanding that they will let you replace the version in your library with the updated one.
If you haven’t purchased the book yet, make sure to do so now. The new version has been adapted to current developments and expands on various subjects to clarify and to accommodate new developments in eBook devices.
Click here to grab the book on Amazon!