How to Format E-Books for the Kindle/Amazon Using HTML – Introduction

[Image: This is where I was sitting as I typed this blog]

Okay, I know this is slightly off topic compared to my usual travel-related stuff.  However, a co-author and I spent almost a year researching and writing a travel guide, with the first release being for Kindle, followed by a hard copy version.  Well, writing the book wasn’t so hard.  It took time, in terms of researching facts, but even that was a lot of fun as we did several trips for the purpose of that research.  The hard (and most annoying) bit was actually converting an almost 300-page Word file into something that would work on a Kindle, and any other device running the Kindle app.

I downloaded and even purchased books on formatting Word documents for the Kindle.  I read through all the relevant help files and discussion forums on Amazon.  I then went through my Word file and applied all the advice that I had gleaned, and then hit the upload button.  Guess what?  A lot of things didn’t work.  Despite setting styles, and meticulously ensuring everything in my Word file was attributed to one of those styles, when I previewed the e-book, the formatting was off.  I created a table of contents with clickable links as recommended in Amazon’s free e-book guide, and of course, the links weren’t clickable in my e-book.

I was getting a bit frustrated now.  After ten months of writing what I considered to be an outstanding guide, it was now going to look less than professional as an e-book.  Well, I wasn’t going to give up.  From all the reading I had done, I understood that Kindle’s format was based on HTML (HyperText Markup Language – the language used to create all of your beloved web pages and blogs).  Therefore, the key was in getting Word to convert to as clean a version of HTML as possible.  I opened up the HTML files produced by Word (both the standard HTML file and the stripped-down version).  Even the stripped-down version contained a lot of unnecessary junk.  One option was to go through the stripped-down version and clean out the junk by hand.  However, I didn’t even like the way Word was trying to force-format everything.  Web pages and e-books are meant to be fluid, with minimal formatting, thus allowing the reader to have more flexibility with page/window size, increasing font sizes etc.

The solution… I decided it would be easier to just copy my Word file into a plain text editor and then manually add in the HTML code to create the formatting (remember what I said above: e-books and web pages should have minimal formatting in any event… especially e-books).  So in the end, that is what I did.  It took longer than simply pressing the save as button in Word, but the end result was a perfectly formatted e-book, with full working links, table of contents, and clickable/zoomable maps.

HTML – The Basics

The good thing is that HTML is a very simple code.  You don’t need any special programming skills to write HTML.  Especially for e-books, you just need to remember a few key commands.  First off, all HTML commands are contained within the less than/greater than characters (angle brackets), for example a command to make text bold… <b>.  The other thing you need to remember is that when you start a command, such as bold, you also need to close off that command, which is done like this </b>.  The forward slash indicates the end of that command.

The other thing to remember is the rule about “nesting”.  That is, when you have several commands, for example, bold and then italics, you need to close them off in the right order, with first command closed last.  For example “<b><i>Italic and Bold Text</i></b>”.
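
To make the nesting rule concrete, here is a small sketch (the text between the tags is only a placeholder, and the lines starting with <!-- are HTML comments that the reader never sees).  The first line closes the tags in the right order; the second closes them in the wrong order and is the pattern to avoid.

```html
<!-- Correct: b was opened first, so it is closed last -->
<b><i>Italic and Bold Text</i></b>

<!-- Incorrect: b is closed before i, so the two commands overlap -->
<b><i>Italic and Bold Text</b></i>
```

A web browser will often guess what was meant in the second case, but it is safer not to rely on that.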

To write your code is very easy.  Just open up Notepad in Windows, or TextEdit on a Mac.  If you want a bit more convenience, I suggest downloading one of the more specialised plain text editors that support colour coding of HTML code, so it is easier to see (and easier to spot when you have forgotten to close off a command).  My favourite is Notepad++, available here: http://notepad-plus-plus.org/.

Starting Your HTML File for Kindle

To start your HTML file, open up your Word file, and then open up your plain text editor.  In your plain text editor, paste in the following:

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<HTML xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<HEAD>

<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8"/>
<TITLE>Insert Book Title Here</TITLE>

</HEAD>

<BODY>

Don’t worry too much about all of this code.  Most of it just ensures that Kindle knows what version of code you are subscribing to, and which character encoding you have adopted.  Make sure the quotation marks are plain straight quotes (") and not the curly ones that Word likes to insert.  You will not need to change anything here other than the text inside <TITLE>, but even that isn’t necessary as Kindle doesn’t seem to do anything with it.

The essential first step in starting a web page or Kindle document is to identify it as HTML code, hence the HTML tag in the third line.  After opening HTML, you then add the HEAD tag (fourth line).  HEAD is kind of like a header: it contains information like your title and is also a useful spot to add any more complicated coding (which we will talk about in Part Two of this guide).

After we have finished with HEAD, we close it off, and then open up BODY (last line above).  Everything after BODY is the main content of your e-book.  Any text that you write after BODY (other than text within <>) will appear as text in your book.

The Basic HTML Commands for your E-Book

Okay, so we now have a text editor all set up and ready to go.  So what are the main formatting commands?  I will start with the basics so that you don’t get over-whelmed.  Note, there are plenty more that you would use for web pages, but the following are the essentials for e-books:

  • <P> – This stands for Paragraph.  You will use this for each paragraph of your e-book, and don’t forget to close off the paragraph with </P> at the end.
  • <B> – As mentioned above, this will create bold text.
  • <I> – As mentioned, this will create italic text.
  • <BR> – This forces a line break.  It’s normally not necessary to use BR because P already creates a line break between paragraphs.  Note, you do not close a <BR> command.
  • <U> – This will create underlined text.
  • <H1> – This is the Level 1 Heading style (similar to heading styles as used in Word).  Note, as with Paragraph, it will add a line break at the end.
  • <H2> – Level 2 Heading style.
  • <H3> – Level 3 Heading style.
  • <H4> – Level 4 Heading style.  There are more levels, but you shouldn’t need them in your e-book.  Also, after about the fourth level, the text starts getting smaller than your default paragraph text.
  • <MBP:PAGEBREAK/> – This is a special command created by Amazon that forces a page break.  It is the same as pressing CTRL-Enter in Word.  As with <BR>, you do not close a page break command.

Okay, so those are the commands.  I guess you would like to know how to use them.  Well, here is an example:

<H1>Chapter One</H1>

<H2>1.1 My Life as a Sponge</H2>

<P>It took me many years to realise I was a <B><I>sponge</I></B>.  I guess I always kind of suspected, but it was really only after I was taken out of my wrapper and used to scrub someone’s filthy body in the shower that I knew for sure.</P>

<P>Despite what you may believe, life as a sponge actually isn’t that bad.  I spend most of my days just hanging around on the soap rack.  I’ve made a lot of good friends there.  Palmolive in particular has a really good sense of humour, although Head and Shoulders can be a bit twisted.</P>

<MBP:PAGEBREAK/>

The above should be self-explanatory.  You will notice that I have a bit of space between the headings and paragraphs.  This is more for my benefit than anything else.  HTML code readers treat any run of spaces and carriage returns as a single space, so put as much in as you want, to help you easily recognise different paragraphs and chapters.  Also, with the codes themselves, it didn’t matter in my file whether I used capitals or lower case (although strict XHTML expects lower case).  I prefer all capitals for code because it makes them stand out more from the normal text of a book.
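
As a quick illustration of the point about spacing, the two paragraphs in the sketch below should display identically, because each run of spaces and line breaks inside the text collapses to a single space.  The sentence is borrowed from the example above; the odd layout is deliberate.

```html
<!-- Spread over several lines with extra spaces -->
<P>Despite what you may believe,
   life as a sponge     actually
   isn’t that bad.</P>

<!-- The same paragraph on one line -->
<P>Despite what you may believe, life as a sponge actually isn’t that bad.</P>
```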

Slightly More Advanced Commands

Okay, this is where things get a bit more sexy.  There are two further commands that you will definitely want to use when creating e-books.

The first is the <IMG> command, which is used to display images.  As with <BR>, you do not close the <IMG> command.  However, you do need to put additional commands inside <IMG>, the most important being the file name of the image you wish to display.  Here is an example: <IMG SRC="myphoto.jpg">.

An important point here is that the image needs to be in the same directory as your HTML file.  You can create subdirectories and refer to those, but I suggest keeping it simple as there is less chance for error.  If you really do want a subdirectory (let’s call it “image”) then this is how you would do it: <IMG SRC="image/myphoto.jpg">.

You can also define the height and width (in pixels) of the photo as you want it displayed, although be careful with trying to force Kindle readers to adopt fixed sizes, as each Kindle displays at a different resolution.  I will get into this a bit more in my next guide.
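
For reference, a fixed size is set with the WIDTH and HEIGHT attributes inside the <IMG> command.  This is only a sketch: the pixel values below are placeholders rather than recommended sizes, and how a given Kindle treats them may differ from what your web browser shows.

```html
<!-- Display myphoto.jpg at 400 by 300 pixels (placeholder values) -->
<IMG SRC="myphoto.jpg" WIDTH="400" HEIGHT="300">
```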

The next command that you will be interested in is creating links and bookmarks.  This is useful for both within-document linking (eg for table of contents) as well as having a link to an external website (such as in your bibliography).

The command that creates both a link and bookmark is Anchor, denoted as <A> </A> (yes, you need to close off anchors).  As with images, you need to add additional commands within the <A> command to determine whether you want to create a link or a bookmark.  An example of a link is as follows: <A HREF="http://wordpress.com">Click on Me</A>.  HREF is short for hypertext reference, and is basically the address that you want the link to go to.  The words that appear after the command (ie “Click on Me”) are the text that the reader will actually see.  For links, it is actually good practice to just put in the name of the website so that the reader can see where they will be directed to if they click on the link.  Finally, after the clickable text, you will note that we have closed off the <A> Anchor command.  This is critical or else all of the remainder of your book potentially becomes one large clickable link (in blue underlined text).

To create a bookmark instead, find the section of your book that you want to act as a bookmark (for example, a chapter title), and then wrap the following command around it: <A NAME="CH1"><H1>Chapter One – Life as a Sponge</H1></A>.  You will notice the command to create a bookmark is simply to use “NAME”.  Within the quotation marks, you can use whatever name you want.  Just remember exactly how you spelled it (including capitals) for when you create a hyperlink to it.  As a result, I tend to keep my bookmark names pretty simple, eg CH1, CH2 etc, and for sections within chapters, simply CH1.1, CH2.3, etc.  The bookmark itself will be invisible to the reader.

To link to your bookmark, you need to specify that the link is to an internal location, and not an external file or webpage.  You do this by using the # (hash or pound) symbol.  So if we had a table of contents, the entry for chapter one would be as follows: <A HREF="#CH1">Chapter One – Life as a Sponge</A><BR>

I added the BR on the end because I wasn’t using paragraphs for each item in the table of contents. <P> would normally provide too much space between entries and for the table of contents, you really just want a single carriage return.
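
The sketch below shows the two halves side by side: the links as they would sit in a table of contents, and the matching bookmarks further down in the book.  The section entry (CH1.1) is added here purely for illustration, reusing the heading from the earlier example.  The thing to check is that the text after the # in each HREF is spelled exactly like the corresponding NAME.

```html
<!-- In the table of contents: links to internal bookmarks -->
<A HREF="#CH1">Chapter One – Life as a Sponge</A><BR>
<A HREF="#CH1.1">1.1 My Life as a Sponge</A><BR>

<!-- Further down, where the chapter starts: the bookmarks themselves -->
<A NAME="CH1"><H1>Chapter One – Life as a Sponge</H1></A>

<A NAME="CH1.1"><H2>1.1 My Life as a Sponge</H2></A>
```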

Ending Your File

Once you have gotten to the end of your HTML file, you need to close it off.  Remember what I said earlier, that you need to close off commands that you started?  Well, two commands that you started early on in your document are the HTML and the BODY command.  Therefore, close off like this:

</BODY>

</HTML>

And that is it!  Save your file with a “.html” extension.  Note that the Windows version of Notepad will add “.txt” to the end, so you will need to open up Windows Explorer and rename the file, replacing the .txt with .html.  You can then open up the file you just created by double clicking on it in Windows Explorer (or Finder on a Mac).  Your file will be opened up in your web browser.  This is a good way of checking for any formatting errors.  There are usually even options within your web browser that will show you a list of errors in your HTML file.  This is useful for quickly checking whether you have forgotten to close any commands, or forgotten to use or close off any quotation marks in any references (very important).
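
Putting the pieces from this guide together, a complete file small enough to save and open in a browser might look like the sketch below.  It reuses the opening lines and the sponge example from earlier; the straight quotation marks and the use of UTF-8 in both encoding declarations are assumptions of this sketch, so adjust them if your own template differs.

```html
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<HTML xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<HEAD>

<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8"/>
<TITLE>Insert Book Title Here</TITLE>

</HEAD>

<BODY>

<H1>Chapter One</H1>

<P>It took me many years to realise I was a <B><I>sponge</I></B>.</P>

<MBP:PAGEBREAK/>

</BODY>

</HTML>
```

In a web browser the <MBP:PAGEBREAK/> command will most likely just be ignored, since it is Amazon’s own addition, so don’t expect to see a page break until you preview the file as an e-book.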

Copying Your Word Book into Notepad

Now that we have gotten the basics of HTML out of the way, this is how I converted my Word file into HTML.  I know it takes a little bit of time, but the end result is perfect and efficient formatting as an e-book.

  1. I open up Notepad (or in my case, Notepad++).  I have a template already created that contains all the introductory text as per above.
  2. With my Word file open, I copy in chapters at a time, as plain text.
  3. All of the formatting within the Word file gets lost.  However, I don’t use that much formatting in any event.  Therefore, what I do is go through the plain text of each chapter, and add back in the formatting.  The most common is to just add in the Chapter headings, and <P> </P> for all paragraphs.  If I do have any separate bold or italic text (not that common), I add that in as well.
  4. Then at the end of each chapter, I add in the Kindle <MBP:PAGEBREAK/> command to force a page break.
  5. I then go back to each chapter heading, and create the bookmark for that chapter.  I do the same for any other sections that I want to bookmark.
  6. Then I go up to the top of the document, after the copyright page, and add in a table of contents (you need to actually call it “Table of Contents” to ensure Kindle recognises it).  I add in each of the <A HREF> commands, pointing to the relevant chapters.
  7. As I do this, I tend to save, and have a web browser open with the file loaded.  I regularly click on “reload” on the browser and check that my changes don’t have errors and that all of the links (such as in table of contents) work.
  8. Once completed, I then upload the HTML file onto Amazon.  If I have images, I keep them in the same directory as the HTML file (for convenience) and zip all files together before uploading.
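
To show where each piece from the steps above ends up, here is a rough skeleton of the BODY of a book file.  All of the text is placeholder, and using <H2> for the “Table of Contents” heading is an assumption of this sketch rather than a requirement.

```html
<BODY>

<!-- Title and copyright page (placeholder text) -->
<H1>Insert Book Title Here</H1>
<P>Copyright notice goes here.</P>

<MBP:PAGEBREAK/>

<!-- Step 6: table of contents, placed after the copyright page -->
<H2>Table of Contents</H2>
<A HREF="#CH1">Chapter One</A><BR>
<A HREF="#CH2">Chapter Two</A><BR>

<MBP:PAGEBREAK/>

<!-- Steps 3 and 5: heading wrapped in a bookmark, paragraphs in P -->
<A NAME="CH1"><H1>Chapter One</H1></A>
<P>First paragraph of chapter one.</P>
<P>Second paragraph of chapter one.</P>

<!-- Step 4: page break at the end of the chapter -->
<MBP:PAGEBREAK/>

<A NAME="CH2"><H1>Chapter Two</H1></A>
<P>First paragraph of chapter two.</P>

<MBP:PAGEBREAK/>

</BODY>
```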

For those not used to HTML, I hope this wasn’t too difficult to follow.  I find it easier if you actually open up Notepad and just play around with it.  Feel free to experiment, click save and view what you have done in a web browser.

In my next article on this topic, I will get into cascading style sheets (CSS), which is where you can do some really neat formatting for your e-book.  Think of this as being akin to creating paragraph and heading “styles” in Word (it’s pretty much the same thing, just that you are doing it in code instead of point and click).

Oh, and if you are curious about the guide book… you can find it on Amazon here: http://www.amazon.com/dp/B00P7XAZSC

Happy writing 🙂
