www.w3.org/TR/REC-html40/struct/text.html#h-9.3.4
For information about characters, please consult the section on the document character set. Many space characters are typographic elements used in some applications to produce particular visual spacing effects. This specification does not indicate the behavior, rendering or otherwise, of space characters other than those explicitly identified here as white space characters. For this reason, authors should use appropriate elements and styles, rather than space characters, to achieve visual formatting effects that involve white space.
For all HTML elements except PRE, sequences of white space separate "words" (we use the term "word" here to mean "sequences of non-white space characters"). When formatting text, user agents should identify these words and lay them out according to the conventions of the particular written language (script) and target medium.
This layout may involve putting space between words (called inter-word space), but conventions for inter-word space vary from script to script. In Japanese and Chinese, inter-word space is not typically rendered at all.
Note that a sequence of white spaces between words in the source document may result in an entirely different rendered inter-word spacing (except in the case of the PRE element). In particular, user agents should collapse input white space sequences when producing output inter-word space.
The PRE element is used for preformatted text, where white space is significant.
In order to avoid problems with SGML line break rules and inconsistencies among extant implementations, authors should not rely on user agents to render white space immediately after a start tag or immediately before an end tag.
The usual meanings of phrase elements are as follows:
EM: Indicates emphasis.
CITE: Contains a citation or a reference to other sources.
DFN: Indicates that this is the defining instance of the enclosed term.
SAMP: Designates sample output from programs, scripts, etc.
VAR: Indicates an instance of a variable or program argument.
The other phrase elements have particular significance in technical documents. These examples illustrate some of the phrase elements:
As <CITE>Harry S.
Please refer to the following reference number in future correspondence: <STRONG>1-234-55</STRONG>
The presentation of phrase elements depends on the user agent. Generally, visual user agents present EM text in italics and STRONG text in bold font. Speech synthesizer user agents may change the synthesis parameters, such as volume, pitch and rate, accordingly.
The ABBR and ACRONYM elements allow authors to clearly indicate occurrences of abbreviations and acronyms. Both Chinese and Japanese use analogous abbreviation mechanisms, wherein a long name is referred to subsequently with a subset of the Han characters from the original occurrence. Marking up these constructs provides useful information to user agents and tools such as spell checkers, speech synthesizers, translation systems and search-engine indexers.
The content of the ABBR and ACRONYM elements specifies the abbreviated expression itself, as it would normally appear in running text. The title attribute of these elements may be used to provide the full or expanded form of the expression.
For example, while "IRS" and "BBC" are typically pronounced letter by letter, "NATO" and "UNESCO" are pronounced phonetically. When necessary, authors should use style sheets to specify the pronunciation of an abbreviated form.
The cite attribute is intended to give information about the source from which the quotation was borrowed.
Attributes defined elsewhere
* id, class (document-wide identifiers)
* lang (language information), dir (text direction)
* title (element title)
* style (inline style information)
* onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, onkeyup (intrinsic events)
The BLOCKQUOTE and Q elements designate quoted text.
Nearly due west the broad swath of the marching Orcs tramped its ugly slot;
Visual user agents must ensure that the content of the Q element is rendered with delimiting quotation marks. Authors should not put quotation marks at the beginning and end of the content of a Q element.
User agents should render quotation marks in a language-sensitive manner (see the lang attribute). Many languages adopt different quotation styles for outer and inner (nested) quotations, which should be respected by user agents.
The following example illustrates nested quotations with the Q element.
We recommend that style sheet implementations provide a mechanism for inserting quotation marks before and after a quotation delimited by BLOCKQUOTE in a manner appropriate to the current language context and the degree of nesting of quotations.
However, as some authors have used BLOCKQUOTE merely as a mechanism to indent text, user agents should not insert quotation marks in the default style, in order to preserve the intention of those authors.
The usage of BLOCKQUOTE to indent text is deprecated in favor of style sheets.
The SUB and SUP elements should be used to mark up text in these cases.
The organization of information into paragraphs is not affected by how the paragraphs are presented: paragraphs that are double-justified contain the same thoughts as those that are left-justified.
The HTML markup for defining a paragraph is straightforward: the P element defines a paragraph.
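As a rough illustration of the rule for nested Q elements described earlier in this section, the following Python sketch alternates between outer and inner quotation marks according to nesting depth. The function name render_q, the QUOTE_STYLES table, the nested-list input model and the particular marks chosen for each language are all assumptions made for this example; the specification does not prescribe any of them.

```python
# Hypothetical table of (outer, inner) quotation mark pairs per language.
# The pairs are illustrative assumptions, not values from the specification.
QUOTE_STYLES = {
    "en": (("\u201c", "\u201d"), ("\u2018", "\u2019")),
    "fr": (("\u00ab", "\u00bb"), ("\u201c", "\u201d")),
}


def render_q(content, lang="en", depth=0):
    """Render the content of a Q element with delimiting quotation marks.

    `content` is a list whose items are either plain strings or nested
    lists; each nested list stands for a nested Q element.
    """
    styles = QUOTE_STYLES[lang]
    # Alternate between the outer and inner style as nesting deepens.
    open_mark, close_mark = styles[depth % len(styles)]
    parts = []
    for item in content:
        if isinstance(item, str):
            parts.append(item)
        else:
            parts.append(render_q(item, lang, depth + 1))
    return open_mark + "".join(parts) + close_mark


if __name__ == "__main__":
    quotation = ["I saw Lucy at lunch, she told me ", ["Mary wants ice cream."], " I will get some."]
    print(render_q(quotation, lang="en"))
```

Because the marks are added by the renderer, the input strings carry no quotation marks of their own, which mirrors the advice that authors should not place them inside a Q element.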
A number of issues, both stylistic and technical, must be addressed:
* Treatment of white space
* Line breaking and word wrapping
* Justification
* Hyphenation
* Written language conventions and text directionality
* Formatting of paragraphs with respect to surrounding content
We address these questions below.
The P element cannot contain block-level elements (including P itself).
For more information about SGML's specification of line breaks, please consult the notes on line breaks in the appendix.
For visual user agents, the clear attribute can be used to determine whether markup following the BR element flows around images and other objects floated to the left or right margin, or whether it starts after the bottom of such objects. Further details are given in the section on alignment and floating objects. Authors are advised to use style sheets to control text flow around floating images and other objects.
With respect to bidirectional formatting, the BR element should behave the same way the ISO10646 LINE SEPARATOR character behaves in the bidirectional algorithm.
Prohibiting a line break
Sometimes authors may want to prevent a line break from occurring between two words.
The plain hyphen should be interpreted by a user agent as just another character. The soft hyphen tells the user agent where a line break can occur.
Those browsers that interpret soft hyphens must observe the following semantics: If a line is broken at a soft hyphen, a hyphen character must be displayed at the end of the first line. If a line is not broken at a soft hyphen, the user agent must not display a hyphen character. For operations such as searching and sorting, the soft hyphen should always be ignored.
In HTML, the plain hyphen is represented by the "-" character (&#45; or &#x2D;). The soft hyphen is represented by the character entity reference &shy; (&#173; or &#xAD;).
The width attribute provides a hint to visual user agents about the desired width of the formatted block.
The user agent can use this information to select an appropriate font size or to indent the content appropriately.
Attributes defined elsewhere
* id, class (document-wide identifiers)
* lang (language information), dir (text direction)
* title (element title)
* style (inline style information)
* onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, onkeyup (intrinsic events)
The PRE element tells visual user agents that the enclosed text is "preformatted". When handling preformatted text, visual ...