HTML Style Rules

Document Type

Use XHTML5.

XHTML5 (XML-serialized HTML5) is recommended for all content documents in the EPUB container.

<!DOCTYPE html>

Namespace

For XHTML documents in your EPUBs, the root html element must declare both the XHTML and EPUB namespaces with xmlns attributes.

<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">

XHTML Validity

Use valid XHTML in the context of EPUB3 specifications.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<head>
<title>Test</title>
</head>
<body>
<p>Test</p>
</body>
</html>

Necessary in XHTML

  • Use type attributes for style sheets and scripts (e.g., <link href="../css/wordsearch.css" rel="stylesheet" type="text/css" />).
  • Properly nest elements: <p><span class="bold">...</span></p>, not <p><span class="bold">...</p></span>.
  • Close void elements: <br />, not <br>.
  • Use end tags: <p>Text</p>, not <p>Text.

Self Closing Tags

Self-closing syntax is allowed only on void elements such as <img />; every other element needs an explicit end tag. For example, <p class="foobar" /> should be replaced with <p class="foobar"></p>.

Semantics

Use HTML according to its purpose, and always prefer semantic elements over those that are semantically ambiguous.

Ambiguous elements

<div>
<span>
<section> (although section defines a section in a document, this is still too ambiguous to use in our projects without semantic inflection)
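One way to give a <section> a clearer role is the epub:type attribute, which relies on the EPUB namespace declared on the root element. The fragment below is a minimal sketch, assuming that "semantic inflection" here refers to epub:type; the id and the heading text are illustrative only.

```html
<!-- Assumes xmlns:epub="http://www.idpf.org/2007/ops" is declared on <html> -->
<section epub:type="chapter" id="ch01">
  <h1>Chapter 1</h1>
  <p>First paragraph of the chapter.</p>
</section>
```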

Semantic elements

Semantic elements have meanings defined in the specification. They promote usability for humans and machines, and they are vital for accessibility. Some are familiar, like <p> for paragraphs and <table> for tables; others, defined more recently in HTML5, may be less familiar:

Pertinent HTML5 Examples

Element        Notes on Meaning
<article>      Defines an article
<aside>        Defines content aside from the main document content
<figure>       Specifies self-contained content, such as illustrations, diagrams, photos, and code listings
<figcaption>   Defines a caption for a <figure> element
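As a rough illustration of how these elements fit together, the fragment below pairs a <figure> with its <figcaption> and sets a sidebar in an <aside>. File names, ids, and text are hypothetical placeholders, not taken from a real project.

```html
<figure id="fig01">
  <img src="../images/fig01.png" alt="Bar chart of monthly sales, rising from January to June." />
  <figcaption>Figure 1. Monthly sales, January to June.</figcaption>
</figure>

<aside>
  <p>A short sidebar that relates to, but stands apart from, the main text.</p>
</aside>
```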

Semantic Evolution

The meanings of some older elements have been adjusted since HTML 4.01. Familiarize yourself with the current use of these elements and their intended meanings.

Element Notes on Use
<a>

In HTML 4.01, the <a> element could be either a hyperlink or an anchor. In HTML5, <a> denotes a hyperlink (or, without an href attribute, a placeholder for one), and the name attribute is not supported.

Use the id attribute rather than name for internal linking.
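A small sketch of id-based internal linking, with a made-up id (note1) and made-up note text:

```html
<!-- Not recommended: HTML 4.01-style named anchor -->
<a name="note1"></a>

<!-- Recommended: put the id on the target element and link to it -->
<p>See the first note.<a href="#note1">1</a></p>
<p id="note1">1. Text of the note.</p>
```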

<hr>

Once defined as a horizontal rule, the <hr> element is now defined in semantic terms, rather than presentational terms.

In HTML5, the <hr> element marks a thematic break, such as a scene change or a shift of topic or context.
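For instance, a scene change within a chapter might be marked up along these lines (the sentences are invented for illustration; any visual ornament for the break would be set in the style sheet):

```html
<p>The door closed behind her, and the house fell silent.</p>
<hr />
<p>Three weeks later, the first letter arrived.</p>
```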

<b>

According to the HTML5 specification, the <b> element should be used as a last resort when no other element is more appropriate.

We no longer use the <b> element.

<i>

In HTML 4.01, the <i> tag was used to render text in italics. However, this is not necessarily the case with HTML5. Now, <i> defines a part of text in an alternate voice or mood, but its use has fallen out of favor.

We no longer use the <i> element.
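This guide does not name the replacements for <b> and <i>, so the fragment below is only a sketch of common options. It assumes <em> and <strong> where the text carries emphasis or importance, and a styled <span> otherwise; the class name italic is hypothetical and mirrors the span class="bold" pattern used earlier in this guide.

```html
<!-- Stress emphasis and strong importance -->
<p>She had <em>never</em> seen it before.</p>
<p><strong>Warning:</strong> do not open the case.</p>

<!-- Purely typographic italics, with the class defined in the style sheet -->
<p>The ship <span class="italic">Endeavour</span> sailed in 1768.</p>
```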

HTML Element Reference

For a good summary reference on HTML elements and their meanings, see w3schools.com’s HTML Element Reference.

See also: W3C HTML 5 Semantic Elements

Multimedia Fallback

For multimedia, such as images, video, and audio, make sure to offer alternative access. For images, that means meaningful alternative text (alt); for video and audio, it means transcripts and captions, if available.

Providing alternative content is important for accessibility: a blind user has few cues to tell what an image is about without alt text, and other users may have no way of understanding what video or audio content is about either.

<!-- Recommended -->
<img src="pub_logo.png" alt="Publisher XYZ's logo." />

<!-- Not Recommended -->
<img src="pub_logo.png" />
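The same idea applies to video. The sketch below assumes a captions file in WebVTT format and uses hypothetical file paths; the paragraph inside <video> is shown only where the element itself is not supported. Note that XHTML requires a value on boolean attributes, hence controls="controls".

```html
<!-- Recommended -->
<video controls="controls">
  <source src="../video/interview.mp4" type="video/mp4" />
  <track kind="captions" src="../video/interview.en.vtt" srclang="en" label="English" />
  <p>Your reading system does not support video. A transcript follows this clip.</p>
</video>
```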

Separation of Concerns

Strictly keep structure (markup) and presentation (styling) apart. That is, make sure documents contain only HTML, and make sure the HTML is solely serving structural purposes. Keep everything presentational in style sheets.
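To make the rule concrete, here is a minimal before-and-after sketch. The class name part-title is an assumption chosen for illustration.

```html
<!-- Not Recommended: presentation in the markup -->
<p style="text-align: center; font-weight: bold;">Part One</p>

<!-- Recommended: a class in the markup, the styling in the style sheet -->
<!-- In the style sheet: .part-title { text-align: center; font-weight: bold; } -->
<p class="part-title">Part One</p>
```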

Entity References

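Use entity references where necessary, and use numeric references only: most named HTML entities (such as &nbsp;) are not defined in XML and will not parse in XHTML5. Examples: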

  • Use &#169;, not &copy;
  • Use &#38;, not &amp;
  • Use &#160;, not &nbsp;
  • Use &#8212;, not &mdash;
  • etc.

The following characters must always be converted to numeric entities:

  • & (&#38;)
  • > (&#62;)
  • < (&#60;)
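A short sketch of the three mandatory conversions in running text (the sentence itself is invented):

```html
<!-- Not well-formed: raw & and < in text -->
<p>Q&A: 3 < 5 and 5 > 3.</p>

<!-- Recommended -->
<p>Q&#38;A: 3 &#60; 5 and 5 &#62; 3.</p>
```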

During the clean and code process, you may find it helpful to replace hex code and decimal code HTML entities with the corresponding characters. Some commonly occurring entities in our EPUB source files include:

  • Em dash (—) - Find &#x2014; or &mdash; or &#151;
  • En dash (–) - Find &#x2013; or &ndash; or &#150;
  • Copyright symbol (©) - Find &#x00A9; or &#169;
  • Right single quote (’) - Find &#x2019; or &#8217;
  • Right double quote (”) - Find &#x201D; or &#8221;
  • Left single quote (‘) - Find &#x2018; or &#8216;
  • Left double quote (“) - Find &#x201C; or &#8220;

A helpful source for locating HTML entities and Unicode characters is &what.

See also: Character entity references in HTML on Wikipedia.