4. Ebook Formats

Alan Villiers
The main feature of ebook formats is that they are reflowable: they can cleanly change their size and shape according to how and with what they're being viewed, so they're suitable for reading documents on any device, whether phone, tablet or computer.

Like any specialised jargon, the format names may sound daunting, but they aren't really. Formats vary from free open standards available to everyone, to proprietary versions used by vendors hoping they'll force customers to stay with them. Here's a guide to the bestiary.

.txt The first, simplest plain text format, a free open standard containing only ASCII or Unicode characters.
.html The language most websites are written in. It's a text markup language, e.g., to define "Chapter 5" as being a second-level heading, it would be written as <h2>Chapter 5</h2>. You can read an html file as plain text: it's a free open standard.
.epub The open standard format for electronic books, built on html. An epub file is like a small compressed (zipped) website, so like web pages it's reflowable and can contain images, style files, etc.
.mobi Derived from a format called Mobipocket; with proprietary changes, it is the format Amazon uses for its Kindle ebooks and readers.
.doc, .docx Microsoft Word's proprietary formats. Word is the editing software used by most authors: doc is widely used for documents in older versions of Word, and docx is the more recent format. Files in doc format are reflowable and easily converted into epub. Some channel software will accept both, others only one or the other.
.rtf Rich Text Format, mainly used for Word documents. It is easily edited, reflowable, and acceptable to some channels.
.odt The format for LibreOffice, which emulates Microsoft Word but is free and available for all platforms. LibreOffice files can be easily edited and written out to html, doc, docx, rtf and other formats. Hence, like Word, it can be used by authors to generate whatever they need.
.pdf The widely available Portable Document Format from Adobe, originally proprietary and now published as an open standard. It is very common for documents designed to reproduce fixed-layout pages, so it is more often used for print books than for ebooks.
.ibooks Apple's version of epub with some proprietary variations, making it (of course) incompatible with the epub open standard.
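
To make the "small zipped website" idea concrete, here is a minimal Python sketch. It builds a tiny epub-like archive in memory and lists what is inside. The file names (OEBPS/chapter5.xhtml, OEBPS/style.css) and the placeholder container.xml are illustrative assumptions, and the result is not a complete, valid epub.

```python
import io
import zipfile
from typing import List


def build_sample_epub() -> bytes:
    """Build a tiny epub-like zip archive in memory (illustrative only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The epub container expects an uncompressed 'mimetype' entry first.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        # Placeholder content: a real container.xml points at the package file.
        archive.writestr("META-INF/container.xml", "<container/>",
                         compress_type=zipfile.ZIP_DEFLATED)
        # Hypothetical content files: one html page and one style file.
        archive.writestr("OEBPS/chapter5.xhtml",
                         "<html><body><h2>Chapter 5</h2></body></html>",
                         compress_type=zipfile.ZIP_DEFLATED)
        archive.writestr("OEBPS/style.css", "h2 { font-size: 1.4em; }",
                         compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def list_epub_contents(epub_bytes: bytes) -> List[str]:
    """Return the names of the files stored inside an epub archive."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        return archive.namelist()


if __name__ == "__main__":
    for name in list_epub_contents(build_sample_epub()):
        print(name)
```

The same list_epub_contents function should work on the bytes of any real epub file, since it only relies on the file being a zip archive.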

Which Input Format?

The output format for almost all ebooks is .epub (except for Amazon's .mobi and Apple's .ibooks). But the acceptable input files for conversion to ebook vary widely. Some channels also accept .txt, but correct formatting is difficult. Here's a summary.

Channel         Input file formats accepted
KDP             html doc docx epub rtf mobi *pdf
iBooks          doc epub
Nook            html doc docx epub
Kobo            doc docx mobi odt
Pronoun         docx epub
Booktango       doc docx epub rtf
Smashwords      doc *epub rtf
Draft2Digital   doc docx epub
Lulu            doc docx rtf odt
eBookIT         doc docx epub rtf
BookBaby        doc docx *pdf
IngramSpark     epub

* accepted with limitations
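
If you want to query the summary rather than scan it by eye, the table can be loaded into a small lookup. This is only a sketch: the function names are made up for illustration, and it assumes that a leading "*" means "accepted with limitations", as the comments row suggests.

```python
# Rows copied from the summary table; a leading "*" marks a format that is
# assumed to be accepted only with limitations.
TABLE = """
KDP html doc docx epub rtf mobi *pdf
iBooks doc epub
Nook html doc docx epub
Kobo doc docx mobi odt
Pronoun docx epub
Booktango doc docx epub rtf
Smashwords doc *epub rtf
Draft2Digital doc docx epub
Lulu doc docx rtf odt
eBookIT doc docx epub rtf
BookBaby doc docx *pdf
IngramSpark epub
"""


def parse_table(text):
    """Map each channel to {format: has_limitations}."""
    channels = {}
    for line in text.strip().splitlines():
        channel, *formats = line.split()
        channels[channel] = {f.lstrip("*"): f.startswith("*") for f in formats}
    return channels


def channels_accepting(fmt, channels):
    """List the channels that accept the given input format, in table order."""
    return [name for name, formats in channels.items() if fmt in formats]


if __name__ == "__main__":
    channels = parse_table(TABLE)
    print(channels_accepting("odt", channels))
    print(channels["KDP"]["pdf"])  # True means "with limitations"
```

Channel requirements change over time, so treat the rows as a snapshot of the table above rather than a current list.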

Writing EPUB Files

For most authors, Word or LibreOffice can generate all the input file formats necessary. The exception is epub, and to create epub files directly you need an epub editor/converter. Here are some possibilities:
  • Sigil is a well-respected program for creating epubs from html or txt files. It is free and works on all platforms.
  • Calibre is free software for reading epub and mobi files on all platforms. It can also generate epubs, with some limitations.
  • Jutoh is an ebook editor, converter and creator for Mac, Windows and Linux. Costs about US$40.
  • Scrivener is a word processor and overall project management tool. It works on Windows and Mac systems and can write files in most ebook formats. Costs about US$40.
  • PressBooks is online book writing software that creates files in most ebook formats. Costs US$20-$100.

After creating any epub file you should always test its validity with the IDPF epub validator, and check how it looks in Calibre or an ereading device.
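
Before running the full validator, a quick structural check can catch the most basic packaging mistakes. The Python sketch below is not a substitute for the validator: it only tests three container rules (the file is a zip archive, an uncompressed "mimetype" entry comes first with the right content, and META-INF/container.xml exists). The function name quick_epub_check is made up for illustration.

```python
import io
import zipfile


def quick_epub_check(epub_bytes):
    """Return a list of basic packaging problems (an empty list means none found)."""
    try:
        archive = zipfile.ZipFile(io.BytesIO(epub_bytes))
    except zipfile.BadZipFile:
        return ["not a zip archive"]

    problems = []
    with archive:
        entries = archive.infolist()
        if not entries or entries[0].filename != "mimetype":
            problems.append("'mimetype' is not the first entry")
        else:
            if entries[0].compress_type != zipfile.ZIP_STORED:
                problems.append("'mimetype' must be stored uncompressed")
            if archive.read("mimetype") != b"application/epub+zip":
                problems.append("'mimetype' has the wrong content")
        if "META-INF/container.xml" not in archive.namelist():
            problems.append("META-INF/container.xml is missing")
    return problems


if __name__ == "__main__":
    import sys

    # Usage: python quick_check.py mybook.epub
    with open(sys.argv[1], "rb") as handle:
        found = quick_epub_check(handle.read())
    print("\n".join(found) if found else "basic packaging looks fine")
```

A file that passes this check can still fail full validation, so it is only a first filter.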

For Linux or LibreOffice Users

Amazon's KDP prefers to receive files in html, though it will accept doc and other formats. KDP provides reasonably good documentation on how to create a clean html file from a doc file, which is necessary because Word documents often accrete a lot of irrelevant formatting (making file-saving very slow for instance). So KDP recommends saving Word docs to 'Filtered' html.

But for people using LibreOffice to write doc files, the filtered option is not available. In that case a great program called Word to Clean HTML is available online: it's well worth the small donation cost to be able to generate clean html files for KDP.
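
To give a feel for what "cleaning" html involves, here is a rough Python sketch using only the standard library. It is not KDP's filter and not how Word to Clean HTML works. It assumes that the clutter consists of style, class and lang attributes plus span and font wrappers, which is a simplification, and it is meant for body fragments rather than whole exported files.

```python
from html import escape
from html.parser import HTMLParser

# Assumed clutter: these attribute and tag names are illustrative choices.
DROP_ATTRS = {"style", "class", "lang"}
UNWRAP_TAGS = {"span", "font"}


class HtmlCleaner(HTMLParser):
    """Re-emit html without the dropped attributes and wrapper tags.

    Comments and doctype declarations are discarded as a side effect.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def _attrs(self, attrs):
        # Keep only the attributes that are not on the drop list.
        return "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs
            if name not in DROP_ATTRS
        )

    def handle_starttag(self, tag, attrs):
        if tag not in UNWRAP_TAGS:
            self.parts.append(f"<{tag}{self._attrs(attrs)}>")

    def handle_startendtag(self, tag, attrs):
        if tag not in UNWRAP_TAGS:
            self.parts.append(f"<{tag}{self._attrs(attrs)}/>")

    def handle_endtag(self, tag):
        if tag not in UNWRAP_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Text arrives unescaped, so escape it again on the way out.
        self.parts.append(escape(data, quote=False))


def clean_html(markup):
    """Return the markup with assumed clutter removed and the text kept."""
    cleaner = HtmlCleaner()
    cleaner.feed(markup)
    cleaner.close()
    return "".join(cleaner.parts)


if __name__ == "__main__":
    messy = '<h2 class="Heading" style="margin:0">Chapter <span lang="en-AU">5</span></h2>'
    print(clean_html(messy))
```

Always check the cleaned file in a browser or ereader afterwards, because stripping every class and style also removes formatting you may have wanted to keep.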

Other helpful LibreOffice and Linux links are Notjohn's KDP guide plus Linux and LibreOffice tips, Smashwords formatting using LibreOffice, and a series for Linux users on ebook formatting.

Where to Now?

The next step is to look at the structure a manuscript needs to be suitable for epubbing. Go to 5. Structure.