The Ebook Doctor — Part Three – Anatomy of an Ebook

When you know what an Ebook and E-reader really are, and how they work together, you can learn how to create files that will work properly for your customer/reader.

This post is split into two Sections. The first covers the principles and some useful tools for examining Ebooks. In the second Section we will take a detailed look inside the Ebook file.

In later posts we will get to the decision point on the format to use, the problems commonly found in Ebooks and how to avoid them, and finally a step-by-step guide on how to get a good result.

Inside the Ebook?

Looking at the Anatomy of an Ebook will give you a feel for its overall structure. It will also show you why formatting your manuscript in very specific ways helps your file pass the import tests for Amazon and the other big book retailers, and makes it readable on a much wider range of devices.


Simply put, the Ebook is a ‘packaged’ website, and an E-reader is a hand-held web browser.

The Ebook contains a series of connected web pages which can be displayed using a browser. The pages are stored inside a “Package” or “Container” file: MOBI/AZW (Kindle) or EPUB being the most common.


Although there are many Ebook formats out there, in practical terms there are only three that you need to think about when deciding on the format for your Ebook. These are EPUB, MOBI and PDF.

If your book contains mainly narrative text, and has only a small amount of formatting, then EPUB and MOBI are the most important formats for you. If you are producing a non-fiction book which has a more complex structure, then a PDF file may be a better option for presenting the book.

MOBI is the core structure used by Amazon for their Kindle Ebook files. And, since Amazon is probably the biggest bookseller in the world, we all think of creating a MOBI/Kindle compatible file first. In reality, most professional designers will create an EPUB file first and then, when that file is ‘clean’, and passes all tests, it is converted into a MOBI file before uploading to Amazon. This makes the open EPUB structure the most important to look at in detail.

I will come back to the PDF format later, as a special case, because there are some difficulties when marketing PDFs through Amazon.


The EPUB package is really a ZIP file and if you change its extension “.epub” to “.zip”, the EPUB file becomes a true ZIP file which can be unZipped so that you can look inside and edit the files directly.
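If you are comfortable with a little scripting, you can also peek inside without renaming anything. The following is a minimal sketch in Python, using only the standard zipfile module. The function name list_epub_contents is my own for illustration and is not part of any Ebook tool.

```python
import zipfile


def list_epub_contents(epub_path):
    """Return (filename, is_compressed) pairs for every entry in an EPUB.

    An EPUB is a ZIP archive, so the standard zipfile module can read it
    directly, whatever the file extension happens to be.
    """
    with zipfile.ZipFile(epub_path) as archive:
        return [
            (info.filename, info.compress_type != zipfile.ZIP_STORED)
            for info in archive.infolist()
        ]


# Example use (the path is a placeholder for one of your own files):
# for name, compressed in list_epub_contents("my-book.epub"):
#     print(name, "compressed" if compressed else "stored")
```

In a typical EPUB you would expect to see a "mimetype" entry, a META-INF folder and the XHTML, CSS and image files that make up the book.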

If you work on a Mac, the same rename to “.zip” works and you can extract the file in the usual way.

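Unfortunately, you cannot just Zip the files again and change the extension back to .epub to re-create the EPUB file. The EPUB standard requires the small “mimetype” file to be the first item in the package and to be stored without compression, which an ordinary Zip of the folder does not guarantee, so you need some special ‘Tools’ to make the new file properly.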
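To show what such a tool has to do, here is a minimal sketch in Python. It assumes the packaging rule from the EPUB container specification: the "mimetype" file goes into the archive first and is stored uncompressed, and everything else may be compressed normally. The function name build_epub and the folder layout are illustrative assumptions, and this is not how any particular utility is implemented.

```python
import os
import zipfile


def build_epub(source_dir, epub_path):
    """Zip an unpacked EPUB folder with 'mimetype' first and uncompressed.

    epub_path should point outside source_dir, otherwise the walk below
    would try to add the output file to itself.
    """
    mimetype_path = os.path.join(source_dir, "mimetype")
    with zipfile.ZipFile(epub_path, "w") as archive:
        # The mimetype entry is written first and stored without compression.
        archive.write(mimetype_path, "mimetype", compress_type=zipfile.ZIP_STORED)
        for folder, _subdirs, files in os.walk(source_dir):
            for name in sorted(files):
                full_path = os.path.join(folder, name)
                # ZIP entries always use forward slashes, even on Windows.
                arcname = os.path.relpath(full_path, source_dir).replace(os.sep, "/")
                if arcname == "mimetype":
                    continue  # already written above
                archive.write(full_path, arcname, compress_type=zipfile.ZIP_DEFLATED)


# Example use (both paths are placeholders):
# build_epub("my-book-unpacked", "my-book.epub")
```

A file built this way should still be run through a validator before you upload it anywhere, because the sketch only takes care of the packaging and not of the content.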

If you need to extract and re-compress EPUB files, one great utility which does this is called eCanCrusher. It is free to download and use, and is a simple application designed to convert an EPUB folder into a compressed .epub file or vice versa. It needs no installation. To convert, you just drag and drop an EPUB folder or an .epub file onto the eCanCrusher application icon.

There are versions for both Windows and Mac at this link:


An Amazon MOBI file is a more complicated beast because there may be multiple formats inside the same ‘Package’. Amazon’s compilers will add the original source file, usually an EPUB, to the database. If you have worked with both EPUB and MOBI files you may have noticed that a Kindle book can be quite large compared to the EPUB file of the same book.

Cracking open a Kindle file is more complex than working with an EPUB, but if you need to do so you can use KindleUnpack (formerly MobiUnpack) to unpack and inspect the contents of DRM-free Kindle Books or MOBI files. You can then modify the content as needed and rebuild the original with KindleGen.

The program is Open Source and you can get it here…

And you can get more info here…

…but beware, disassembling and re-assembling a Kindle file may not get you back to an acceptable file. Also, if files contain DRM (Digital Rights Management) you will not be able to open them in the same way. You can find out more about the MOBI structure here…

As an aside – surveys of sales of files with and without DRM show very clearly that files WITHOUT DRM sell much better. The complexity of managing files with DRM is off-putting and many customers avoid them for this reason.

We will come back to the details of editing inside an Ebook package later but at this point it is best to just note that it is really better to get the structure and formatting set up correctly inside your Word Processor so you don’t have to crack open the final package later.

Testing the converted files is another very important step. If we go through a couple of cycles of “Convert file – Test – Correct original – Convert again – Test again” until we get the right result, there should be no need to crack open the Ebook file. More on this later too.

Converting from Word Processor to Ebook

The purpose of the Ebook format is to display the book content so that it looks like pages in a traditional printed book. When we create the book content on a Word Processor, the file that is saved is not always in a format that can be displayed in a web browser (this depends on your Word Processor), so it needs to be converted into the languages used to create web pages: HTML / XML / XHTML.

When you convert a Word Processor file into an Ebook the conversion translates the text into individual XHTML files for each chapter, or section, and puts together a list of those files and a set of instructions to the E-reader to tell it the order in which to display the pages.
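In an EPUB, that list of files and the display order live in a small XML file, the "package" or OPF file: its manifest lists every file and its spine gives the reading order. The sketch below reads both using only the Python standard library. It assumes a standard EPUB layout, where META-INF/container.xml points to the OPF file, and the function names are my own for illustration.

```python
import posixpath
import xml.etree.ElementTree as ET
import zipfile

# XML namespaces defined by the EPUB container and package specifications.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def find_opf_path(archive):
    """Read META-INF/container.xml to find where the package (OPF) file is."""
    container = ET.fromstring(archive.read("META-INF/container.xml"))
    rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    return rootfile.get("full-path")


def reading_order(epub_path):
    """Return the content files in the order the E-reader will display them."""
    with zipfile.ZipFile(epub_path) as archive:
        opf_path = find_opf_path(archive)
        package = ET.fromstring(archive.read(opf_path))
    # The manifest maps an id to a file; the spine lists ids in reading order.
    manifest = {
        item.get("id"): item.get("href")
        for item in package.findall("opf:manifest/opf:item", OPF_NS)
    }
    opf_dir = posixpath.dirname(opf_path)
    return [
        posixpath.join(opf_dir, manifest[ref.get("idref")])
        for ref in package.findall("opf:spine/opf:itemref", OPF_NS)
    ]


# Example use (the path is a placeholder):
# for chapter in reading_order("my-book.epub"):
#     print(chapter)
```

Note that the order comes from the spine, not from the file names, which is why chapters can display in the wrong order even when the files look correctly numbered.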

We also need to tell the E-reader the formatting to use when the text is displayed. Websites use ‘Cascading Style Sheets’ (CSS) to describe the way the type should look on the screen: the font, the position of text on the screen, whether it is bold or italic, the spacing before and after paragraphs, etc. CSS also controls the position of graphic elements like photos and illustrations. All the formatting that you impose on your book text has to be translated to a CSS file if it is to look the way you want it to, so you need to take a lot of care to make sure that the way you format your manuscript inside your Word Processor is easy to translate accurately to CSS.
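To make the idea of "translating" formatting concrete, here is a small Python sketch that turns a table of Word Processor style names into CSS rules. The style names, the CSS values and the function build_stylesheet are all hypothetical, chosen only for illustration. A real converter does far more, but the principle is the same: a named style maps cleanly to one rule, while ad hoc manual formatting has nothing to map from.

```python
# Hypothetical mapping from Word Processor style names to a CSS selector
# and its declarations. None of these values come from a real converter.
STYLE_MAP = {
    "Heading 1": ("h1", {"font-size": "1.6em", "font-weight": "bold", "text-align": "center"}),
    "Body Text": ("p", {"text-indent": "1.5em", "margin": "0"}),
    "Block Quote": ("blockquote", {"font-style": "italic", "margin": "1em 2em"}),
}


def build_stylesheet(style_map):
    """Turn the style table into the text of a CSS file."""
    rules = []
    for _style_name, (selector, declarations) in style_map.items():
        body = "\n".join(f"  {prop}: {value};" for prop, value in declarations.items())
        rules.append(f"{selector} {{\n{body}\n}}")
    return "\n\n".join(rules)


# Example use:
# print(build_stylesheet(STYLE_MAP))
```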

The important point here is not that you have to learn how to write HTML or to create a CSS file, but to know that if you format your original text in specific ways the conversion process will work seamlessly and your final Ebook will look like you want it to, and will pass the acceptance tests of the Ebook retailers.


The EPUB standards have been around for a long time and most of the devices in the market are designed to use the EPUB Version 2 standard, first published in 2007. However, a new standard, EPUB Version 3, has been available since 2011. V3 is quite sophisticated and can use fonts in a different way from the earlier versions, but most Ebooks are still produced using Version 2 because of the vast number of older E-readers still in the market that only support V2 and do not support all the functions built into V3.
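If you want to know which version a particular EPUB file declares, the package (OPF) file inside it carries a version attribute, normally "2.0" or "3.0". The following Python sketch reads it using only the standard library. It assumes a standard EPUB layout, and the function name epub_version is my own for illustration.

```python
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by META-INF/container.xml in every EPUB.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}


def epub_version(epub_path):
    """Return the version string declared in the EPUB's package file."""
    with zipfile.ZipFile(epub_path) as archive:
        # container.xml tells us where the package (OPF) file lives.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        package = ET.fromstring(archive.read(rootfile.get("full-path")))
    # The root <package> element carries the declared version.
    return package.get("version")


# Example use (the path is a placeholder):
# print(epub_version("my-book.epub"))
```

The declared version only tells you what the file claims to be. Whether a given E-reader displays it properly still has to be tested on the device.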

In the next instalment we take a more detailed look inside the Ebook.

©DavidCronin 2015
