Typographic refinements of thesis design in LaTeX

Long texts require a setting not unlike the way a marathon is run. Everything has to be comfortable—once you’ve found your rhythm, nothing must disturb it again. If you have text that is going to require long-distance reading, design it so the reader has a chance to settle in. — Spiekermann, E. & Ginger, E. M. 1993. Stop Stealing Sheep and Find Out How Type Works.

LaTeX is increasingly being embraced as one of the industry standards for professional-grade typesetting of documents of all kinds, not least scientific literature, and has the additional virtue of being open source and freely available. The default settings of LaTeX are of a very high standard, but manual intervention in some cases will raise the result to an even higher level and turn it into an expression of one’s personal character while still communicating the message effectively.

The concepts are applicable to all professional typesetting systems, although I mention LaTeX-specific packages or commands for implementing some of these refinements. Complete LaTeX source code is beyond the scope of this article. If you have trouble making a desired refinement, I recommend searching for the problem on the Internet. Also visit these two great online communities for learning and discussing the finer points of LaTeX and computer programming: tex.stackexchange.com and stackoverflow.com.

Body text

LaTeX’s default typeface, Computer Modern, is very much to my liking but it has one or two details that aren’t perfectly to taste (mentioned later). My ideal of a body-text typeface is one that has a relatively smaller difference in stroke thickness between the horizontal and vertical parts, giving full-bodied letters more comfortable on the eye for long-distance reading, yet retaining an athletic figure. It should also have a relatively small x-height, allowing the correspondingly long ascenders and descenders to express themselves freely. A typeface that satisfies these criteria is the one cast by William Caslon c. 1722, one of the oldest English typefaces. Integral to the beauty of the Caslon typeface are its earthy wedge-shaped serifs, and adding to it are exquisite yet unobtrusive features like the cropped apex of A and the long flourish of Q. Indeed, an enduring maxim of British printers was ‘when in doubt, use Caslon’, a testimony to its dual qualities of utility and visual appeal. Original specimen sheet.

Adobe Caslon Pro cropped apex capital letter A

The distinctive cropped apex of the letter A in Caslon.

Adobe Caslon Pro long tail capital letter Q

The letter Q. These text samples also show the wedge serifs and robust strokes of Caslon.

Of the various modern-day digital incarnations of Caslon, I use Adobe Caslon Pro by Carol Twombly, which came with my student edition of Adobe Creative Suite. Other typefaces I have considered include Garamond and Jenson. Garamond is very elegant but the strokes are somewhat too delicate. The venerable Jenson, being an even more ancient typeface than Caslon, has a subtle affinity to handwriting that I think would be better reserved for more poetic texts than my scientific dissertation. In addition, Minion Pro and Palatino are very popular with those seeking an alternative typeface to the LaTeX default, but their aristocratic strokes and serifs don’t quite exude the organic down-to-earthness of Caslon that appeals to me and synergizes with the objective style of scientific reporting. Flowchart for typeface selection.

Regardless of whether the default LaTeX font is used, some typographical refinements are not applied automatically or by default but are available for truly professional typesetting. They include microtypographical refinements like font expansion, letterspacing (as distinct from kerning) and margin kerning, all of which can be activated globally using the `microtype’ package, as well as the selective use of ligatures. LaTeX applies certain ligatures automatically; these have to be disabled manually where they fall across a morpheme boundary, otherwise the word may be misread. Other ligatures are off by default, for example the ae ligature, and should be enabled manually in certain words. List of legitimate ae and oe ligatures.

LaTeX wrong automatic cross-morpheme fl ligature in the word `briefly'

Automatic ligation should be disabled in certain words; in this example the fl ligature misleads the eye into reading `brie-fly’ instead of the correct `brief-ly’.
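The refinements above can be sketched roughly as follows. This is a minimal example that assumes pdfLaTeX with the default Computer Modern fonts, not the setup used for the thesis itself; which `microtype’ features are available depends on the TeX engine, and the sample words are only illustrations.

```latex
% Minimal sketch; assumes pdfLaTeX with the default Computer Modern fonts.
\documentclass{article}
% Request margin kerning (protrusion) and font expansion explicitly.
\usepackage[protrusion=true, expansion=true]{microtype}
\begin{document}

% Automatic fl ligature across the morpheme boundary (brief + ly):
briefly

% An italic correction \/ between f and l keeps the two letters separate:
brief\/ly

% The ae ligature is entered explicitly where a word calls for it:
Encyclop\ae dia

% Explicit letterspacing of a run of capitals (amount in thousandths of an em):
\textls[80]{LETTERSPACED CAPITALS}

\end{document}
```

Under other engines the ligature may need a different remedy; the `selnolig’ package, for instance, automates the suppression under LuaLaTeX.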

And the score of Beethoven’s Hammerklavier Sonata does not need bolder notes to mark fortissimos nor fractured notes to mark the broken chords. The score is abstract code and not raw gesture. — Bringhurst, R. 1996. The Elements of Typographic Style. (Widely regarded as the seminal work on the subject.)

Section headings

There are various ways of formatting heading text—italics, small caps, boldface, for example. The first two don’t particularly appeal to me as they are already being used in the body text and are insufficiently distinctive. Boldface can be pleasing and effective if used at tasteful sizes with tasteful amounts of spacing above and below, like the LaTeX default, but I wanted something more sophisticated.

Meet my favourite sans-serif typeface, News Gothic (Morris Fuller Benton, 1908). The slender strokes and tallish aspect give it an air of sophistication and unpretentiousness suited to the formal task of articulating the headings of a thesis, and differentiate it from the more popular sans-serifs like Arial and Helvetica (one of the relatively `dark, coarse and tightly closed’ typefaces described by Bringhurst as `cultural souvenirs of some of the bleakest days of the Industrial Revolution’). Although the letters of News Gothic are relatively narrow, they are comfortably letterspaced for optimal legibility. The weight is a little too light at normal text size but this becomes an asset at larger heading sizes where it stands out from the text by virtue of its size without creating overpowering and disruptive dark struts across the page. In my opinion, News Gothic and Caslon make an electrifying pair, the realism of 20th-century American newsprint juxtaposed with the baroque splendour of 18th-century European scholarly texts.

LaTeX News Gothic heading format

Section headings in News Gothic typeface, showing reduced distance between successive baselines and manual non-hyphenated line breaks within multiline heading, all capitals, and vertical alignment of heading and subheading.

I use all capitals to distinguish the headings more clearly and also to create a subtle horizontal bar effect that grounds the heading solidly and complements the fluctuating topography of the main text. It is also important to override LaTeX’s smart autohyphenation in multiline headings and manually break the lines. They should be broken at grammatically appropriate junctures and such that the heading as a whole has a compelling geometry, essentially meaning that no two or more lines within the heading should be approximately the same length as one another. Before getting caught up in these manipulations, see to it that you have already made the heading as concise as possible.

Finally, reduce the leading, or distance between successive baselines, within the heading so that it becomes a single unified object that has no chance of being obfuscated in the vertical flow of the body text. The amount by which the leading is reduced should obviously balance well with the letterspacing of the heading, and in my opinion should not in absolute measurements be too different from the leading of the prose. Customarily, the height of the interlinear channel, or horizontal empty space between two successive lines, should be greater than the word spacing, in order to minimize the occurrence of rivers. As is evident in the illustration above, however, I have adjusted the leading such that the interlinear channel height is somewhere between the letterspacing and word spacing rather than greater than both of them. It seems to work here because there are at most only two or three lines in the heading, and it actually feels more geometrically homogeneous and pleasing when small and large spaces within a flowing line are `averaged out’ with medium spacing between lines. Furthermore, the constant height of the capitals helps the eyes stay on the line despite the tight leading, where ascenders and descenders of lowercase letters would have caused the eyes to stray from one line to another.
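One possible way to set up such headings is sketched below using the `titlesec’ package. This is an assumption on my part rather than a record of the actual thesis code: the generic sans-serif family stands in for News Gothic, the font size and leading are illustrative values, and the section title is a made-up placeholder.

```latex
% Minimal sketch; the generic sans-serif family stands in for News Gothic.
\documentclass{article}
\usepackage{titlesec}
\titleformat{\section}
  {\sffamily\fontsize{14.4pt}{16pt}\selectfont\raggedright} % tight leading, ragged right
  {\thesection}    % heading number
  {1em}            % gap between number and title
  {\MakeUppercase} % final command receives the title text: all capitals
\begin{document}

% Optional argument: unbroken title for the table of contents and running heads.
% Mandatory argument: the same title with a manual line break.
\section[Effects of temperature on growth]{Effects of temperature\\ on growth}

Body text follows the heading.

\end{document}
```

The second argument of \fontsize is the baseline distance, so lowering it relative to the font size tightens the leading within a multiline heading.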

Bibliographic style

One of my favoured uses of (real, not fake) small caps is in author names, in both the body text and the bibliography. It gives an additional degree of recognition to the giants on whose shoulders I teeter. The small caps of Caslon are about as tall as the x-height, which I find appealing. Many typographers prefer higher small caps for more emphasis but I don’t like that because it disrupts the texture of the page, and when only lowercase small caps are used in an acronym one can’t really tell whether they are small caps or not; this is the one thing that bothers me about the otherwise near-perfect Computer Modern.

I use the small-cap author names with the ampersand for an even more elegant disposition. The use of the ampersand also reduces confusion when stringing references together in the text e.g. `A & B and X & Y’ versus `A and B and X and Y’. Even when using numbered citations as I do, it is occasionally necessary to spell out the authors on grammatical grounds.

bibliography reference citation style

My personal reference style with small caps, ampersand, colon, and thin space between multiple initials of a given author. Note the difference between the hyphen and the en dash.

Another tweak to the bibliography is the change of punctuation between the journal volume and page numbers from comma to colon. There are too many commas and/or periods already in most reference styles; the colon is not only a welcome relief but helps the eye track the volume:page component more readily.

One more refinement is the use of a thin space between the author initials. A regular space would make the capitals towering landmarks unto themselves, while leaving no space at all would result in a jungle. You might ask why `USA’ would be ok; that is because it refers to an entity that is widely known, and our minds have come to associate the overall shape of the acronym with that entity.

All of the above can be customized globally using the `biblatex’ package; there is no need for any manual formatting.
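As a rough illustration of such global customization, the sketch below redefines a handful of standard `biblatex’ hooks. It assumes the numeric style and the biber backend, and the bibliography entry is an invented placeholder, not a real reference. In the standard styles the punctuation before the pages follows whatever element precedes them, which may be the year rather than the volume, so a full volume:page layout may need further adjustment.

```latex
% Minimal sketch; run pdflatex, then biber, then pdflatex again.
\documentclass{article}
\begin{filecontents}{demo.bib}
@article{demo,
  author  = {Doe, Jane Q. and Roe, Richard},
  title   = {A placeholder title},
  journal = {Journal of Examples},
  volume  = {12},
  pages   = {34--56},
  year    = {2000},
}
\end{filecontents}
\usepackage[style=numeric, giveninits=true]{biblatex}
\addbibresource{demo.bib}

% Family names in small caps.
\renewcommand*{\mkbibnamefamily}[1]{\textsc{#1}}
% Ampersand before the last author instead of `and'.
\renewcommand*{\finalnamedelim}{\addspace\&\space}
% Non-breaking thin space between multiple initials of one author.
\renewcommand*{\bibinitdelim}{\addnbthinspace}
% Colon before the page range, and no `pp.' prefix for articles.
\renewcommand*{\bibpagespunct}{\addcolon}
\DeclareFieldFormat[article]{pages}{#1}

\begin{document}
As shown by \textcite{demo}, the entry is formatted without manual markup.
\printbibliography
\end{document}
```

Note that \mkbibnamefamily affects every name list, editors included, so a style that should treat authors differently would need a more targeted redefinition.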

Typeface in scientific plots

I tend to plot graphs in my data analysis software and export them as pdf vector graphics to embed in LaTeX (occasionally I use R’s `tikzDevice’ package to generate native LaTeX graphics code directly from R, but that has difficulties with syntactically complicated plots like 3D surfaces with shadows and transparency). The default typefaces of those plots, however, may not go well with your chosen typeface in LaTeX. The default in the R Language for Statistical Computing, for example, is Helvetica, which clashes with the News Gothic in my headings and doesn’t match that well with Caslon either. I use Computer Modern instead of Helvetica, as the former is used in my math environments in LaTeX. It could be argued that while the numerical axis labels in the plot would rightly be in Computer Modern, the axis titles being text should really be in Caslon, but there are two factors working against this proposition: (1) it seems very difficult to make the R pdf graphics device use multiple font families simultaneously (`tikzDevice’ can do this since it uses native LaTeX); and (2) Caslon in the bottom x-axis would blend in with the caption text, creating the impression that the figure proper starts from above the x-axis, which is visually discomforting.

R lattice wireframe 3d plot

Computer Modern font in wireframe plot in R. I modified the source code of the `wireframe’ function to shorten the tick marks, right-justify the tick labels (needed because of the minus signs) and prevent the tick labels from overlapping with one another and with the tick marks (as the original function does not provide direct arguments for adjusting these parameters); this applies even if you are using the default Helvetica.

R uses the `extrafont’ and `fontcm’ packages to enable font families not automatically available in R, and MATLAB and Octave have their own routines as well.

Numbers and mathematics

Typography’s principal function is communication, and the greatest threat to communication is not difference but sameness. — Robert Bringhurst

You would have seen in the earlier illustrations that Caslon has old-style numerals with ascenders and descenders for the appropriate typesetting of numbers in the body text. These are not so suitable for mathematical expressions. Caslon also has the `capital’ numerals (properly known as lining figures) that are ok for math, but it does not have all the other mathematical symbols my equations require. Therefore I typeset all math in Computer Modern. If you do this you should ensure that any inline mathematical expressions in the prose, even an isolated number like `0.57’ that refers to some parameter value used in a model, are typeset in math mode, using \(...\) for example. Sometimes the distinction is quite subtle; for example in the phrase `simulated for 1000 time steps’, I consider `1000’ to be text, not math, and use old-style numerals accordingly.

Old-style numerals (text figures) and lining figures

Different kinds of numerals for different kinds of numbers. The visual contrast helps the mind put each number in the appropriate context and makes reading more efficient.
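The distinction between the two kinds of number can be sketched as follows. The example assumes the default Computer Modern fonts and uses the standard \oldstylenums command for the text figure; the sentence itself is only an illustration.

```latex
% Minimal sketch; compiles with any LaTeX engine and the default fonts.
\documentclass{article}
\begin{document}

% A parameter value is mathematics: set it in inline math mode.
% A count in running prose is text: set it in old-style numerals.
The decay parameter was set to \(0.57\), and the model was
simulated for \oldstylenums{1000} time steps.

\end{document}
```

With an OpenType text font loaded through `fontspec’, the option Numbers=OldStyle makes old-style numerals the default in text, so that only the math-mode numbers need explicit markup.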

When one uses different typefaces for text and math, there is also the question of what typeface to use for any normal text embedded in equations. The convention is to use the typeface of the body text, but I run into an inconsistency here because I have opted to use the math typeface for text within figures, as explained above. It would therefore be more consistent to go against convention and use the math typeface for text within equations. There is no intrinsic problem with this—indeed I think it looks more harmonious—because Computer Modern is equally at home in text and math, as long as one calls `\textnormal’ within the math environment telling LaTeX to kern the characters as words rather than a string of mathematical symbols.

LaTeX math mode computer modern textrm textnormal

Using the math typeface for text (`consumers’) within math environments can produce a consistent look, as long as you tell LaTeX that it is a word rather than a string of variables. If it thinks they are variables it will space them too far apart. Notice that the equation number at far right is not in the math typeface.
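The spacing difference can be seen in a sketch like the one below, where the symbol N is a hypothetical variable chosen for illustration. Which typeface each command produces depends on the document's font setup: \textnormal takes the document's normal text font, whereas \mathrm takes the upright math font.

```latex
% Minimal sketch; compare the three subscripts in the output.
\documentclass{article}
\usepackage{amsmath} % makes text commands size-aware inside sub- and superscripts
\begin{document}
\begin{equation}
  % Bare letters are read as a product of variables c, o, n, ...
  N_{consumers}
  \quad
  % \textnormal sets the word as text in the normal text font.
  N_{\textnormal{consumers}}
  \quad
  % \mathrm sets the word upright in the math roman font.
  N_{\mathrm{consumers}}
\end{equation}
\end{document}
```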

There is actually a third kind of numeral, the so-called superior figure, with a thicker stroke suitable for footnote markers. By default LaTeX uses mathematical superscripting of normal figures for footnote markers, which is not ideal because the normal figures are too thin at small sizes and placed too high for body text. Unfortunately, most fonts do not have superior figures at all or have only a small subset. Adobe Caslon Pro is one of the few that provide the full set of superior figures from 0 to 9. If your font has superior figures, they can be turned on automatically for footnotes simply by loading the `xltxtra’ package.

Adobe Caslon Pro LaTeX xltxtra superior figure footnotemark

Correct typesetting of footnote marker using thick-stroked superior figure positioned lower than superscript position.
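A minimal sketch of this setup follows. It assumes compilation with XeLaTeX and that Adobe Caslon Pro is installed on the system; any OpenType font that provides superior figures could be substituted.

```latex
% Compile with XeLaTeX. Assumes Adobe Caslon Pro is installed on the system.
\documentclass{article}
\usepackage{fontspec}
\usepackage{xltxtra} % footnote markers use the font's superior figures where available
\setmainfont{Adobe Caslon Pro}
\begin{document}

Long texts require a comfortable setting.\footnote{The marker is drawn
from the font's own superior figures rather than a scaled-down numeral.}

\end{document}
```

The substitution itself is carried out by the `realscripts’ package, which `xltxtra’ loads and which can also be loaded on its own.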

That’s all, folks

By all means break the rules, and break them beautifully, deliberately and well. — Robert Bringhurst

Hopefully I have given you a flavour of my selective typographical tweaks, revealing some of my own idiosyncrasies in the process and motivating you to conceive your own work of typographic art. There are substantial aspects of macroscopic document structure and design that I have left out of the discussion, but the same paradigm applies—let consistency, logic and beauty be your guiding lights.

