If you are new to Unicode on the web, viewing the source of a Unicode page may be an odd experience.
For example, the source of the first hypothesis of
“On the Sizes and Distances of the Sun and Moon”
by Aristarchus looks like this:
Τὴν σελήνην παρὰ τοῦ ἡλίου τὸ φῶς λαμβάνειν.
The browser turns it into this:
αʹ. Τὴν σελήνην παρὰ τοῦ ἡλίου τὸ φῶς λαμβάνειν.
On this site, each Unicode character is represented by a numeric code reference, which consists of “&#”, the decimal Unicode value of the character, and a semicolon. For example, small alpha is assigned the value 945, so α becomes &#945;.
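The rule above is mechanical enough to sketch in a few lines of Python. This is only an illustration, not the tool used to build these pages: the helper name `to_numeric_references` is hypothetical, and the choice to leave plain ASCII characters untouched is an assumption.

```python
import html


def to_numeric_references(text: str) -> str:
    """Replace each non-ASCII character with a decimal reference such as &#945;.

    Hypothetical helper for illustration; ASCII characters are passed through.
    """
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


# Small alpha has the decimal Unicode value 945.
encoded = to_numeric_references("α")
print(encoded)                 # &#945;

# html.unescape does what the browser does: it turns the reference back into α.
print(html.unescape(encoded))  # α
```

Running the same function over a whole line of Greek text gives the kind of source the browser receives, and `html.unescape` recovers the readable text.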
Choosing the right charset to make browsers display Unicode fonts
Browsers should always be told what character set is being used on a web page. This is done via a meta tag in the head of the page, which for this page declares the charset windows-1252.
If you view the encoding of this page, it should be Western European (Windows) or Western (Windows-1252) if specified.
Windows-1252 is a limited character set that is sufficient for most plain text. With this character set, Unicode characters must be written as numeric code references, as described above. To include the Unicode characters themselves in an HTML document, a different character set must be used, and the browser must be “Unicode-enabled.” Much has been written about this subject elsewhere; in short, the charset value that is usually most relevant is UTF-8. Older browsers, however, could not display Unicode characters included this way. I used windows-1252 and numeric code references on these pages because, when they were first uploaded, non-Unicode-enabled browsers were more common than they are today. Fonts are specified with stylesheets because not all users have a default Unicode font set.
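The difference between the two charsets can be seen with Python's built-in codecs. This is a sketch under stated assumptions: `cp1252` is Python's name for windows-1252, and the `xmlcharrefreplace` error handler stands in for whatever process was actually used to prepare these pages.

```python
# The word "φῶς" (light) from the hypothesis quoted earlier, written with
# explicit escapes so the code points are unambiguous: U+03C6, U+1FF6, U+03C2.
text = "\u03c6\u1ff6\u03c2"

# These Greek characters are not in windows-1252, so a strict encode fails.
try:
    text.encode("cp1252")
    fits_in_cp1252 = True
except UnicodeEncodeError:
    fits_in_cp1252 = False

# Falling back to numeric code references keeps the page within windows-1252.
as_references = text.encode("cp1252", errors="xmlcharrefreplace")

# UTF-8 carries the characters themselves, using two or three bytes for each here.
as_utf8 = text.encode("utf-8")

print(fits_in_cp1252)   # False
print(as_references)    # b'&#966;&#8182;&#962;'
print(len(as_utf8))     # 7
```

Either byte sequence displays the same word in a Unicode-enabled browser, provided the charset declared in the head matches the encoding actually used.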
See the UTF-8 version of “On the Sizes and Distances of the Sun and Moon” for an example of a UTF-8-encoded page. The encoding, if specified, will be listed as UTF-8. If you view the page source, you will see the Unicode characters themselves, not numeric code references; and if you save the page using Notepad, the encoding will be UTF-8 instead of the default ANSI. See the surrogate pair calculator on the utilities page for more information. Feedback is welcome.
Finally, here are links to some font software if you decide to pursue fontmaking yourself. (But I warn you, it is more addicting than most illegal drugs.)
Many people's “starter” program is Softy by Dave Emmett. But you may quickly outgrow it.
The program I originally used to make the glyphs in Aristarcoj was TypeTool from FontLab. Its editing is more advanced than that of any other program in its price range.
To create the font file itself, I used Font Creator by High Logic. I now use it for the entire fontmaking process, as the editing has become more advanced in later versions. The only basic function that it lacks is hinting, which can be done with TypeTool if necessary.
Cross Font can be used to convert a .ttf file to a Mac binary which can be opened on a Macintosh. It is said to work well; but Unicode fonts require Mac OS X, which uses .ttf fonts in their native form, so this utility is not necessary for Unicode fonts.
You may already have some sort of font viewer program, but it is not likely that it will display extended Unicode ranges. The Unicode range viewer may thus come in handy.
One more must-have utility: the Microsoft Font properties extension.
Feedback is welcome about any of the information above.