THIS JUST IN
to B or not to b
capitalization and its discontents: why does my word processor upper-case Zoloft but not paxil?
FORTUNE
Tuesday, August 12, 2003
By Roger Parloff

like many people, I don't use capital letters when I type e-mail. but when I got a new computer a few months ago, it had Microsoft software that automatically capitalizes the first letters of some words. (I'm using it now.) early on, I noticed some oddities. I was writing an e-mail to a friend about the campaign-financing scandals of the Clinton administration, and I referred to a very peripheral figure named Pauline konchanalak, whose last name I inadvertently misspelled. on second reference, I happened, with equal inadvertence, to spell her name correctly. but this time the surname popped up as Kanchanalak! Microsoft knew to capitalize Kanchanalak and yet not konchanalak!

soon I was noticing other peculiarities. for instance, most over-the-counter drugs were capitalized, like Excedrin and Tylenol, but prescription drugs were much harder to predict. thus, Claritin is up, but celebrex is down. Zoloft is up, prozac and paxil down. does Microsoft ask Pfizer or merck to pay for brand-name recognition? was there an upper-case shakedown going on?

given names also held surprises. why Karen and Sharon, but not nancy or mary? Stephen, but not steven?

or consider these shockers: Muhammad, Mohamed, Buddha, and Confucius are up, but not allah, jesus, or moses!

it got worse. all these policies were unstable over time! capitalization practices changed even as I experimented with them. in fact, I've had to manually capitalize several words in this story because they've lost their capacity to do it themselves. had I worn them out? stephen and zoloft and microsoft itself no longer perform for me! even viagra is spent.

it occurred to me that many of the anomalies had something to do with word length. the longer the word, the more likely it was to be capitalized. four-letter words were always down, as far as I could tell. five-letter words, on the other hand, seemed to be right on the fulcrum. most were down—like jesus, moses, and allah—yet there were exceptions, like Xerox and Karen. maybe there was something special about the letters 'x' and 'k' that threw such words into a different category. I eagerly tested my new theory, but with bitterly disappointing results. kafka, kadar, kemal, xhosa, and hoxha. yet Kodak, Exxon, and Akaka. meanwhile, unaccountably exalted outliers sprang from my control group: Helen, Miami, Judah.

had microsoft considered the repercussions of meting out all these preferences and slights? leaving allah down while honoring Exxon—was that prudent?

I called microsoft and spoke with simon marks, the 30-year-old, London-born product manager for the microsoft office division. light streamed in, and order was restored, as marks opened my eyes to the structure of microsoftian capitalism.

marks was a gentleman too: even as he dashed my pathetically wrongheaded hypotheses, he bucked up my self-esteem. as I had so keenly picked up, he explained, the lengths of words were 'absolutely key.' and yet there was nothing determinative about the number of letters that any word contained.

'let's take a step back,' he suggested. capitalization was just a narrow aspect of the broader function performed by the spell-checking software, he explained. when microsoft's spell-check notices a word it doesn't recognize, it regards it as a possible mistyping. but it does not presume to automatically correct anything unless it feels very confident that it knows what was intended. so in most cases, spell-check merely alerts the reader to an array of possibilities, by underlining the putatively mistyped word in red. when I type moses, for instance, spell-check puts a red squiggly line beneath the uncapitalized prophet's name. (how could I have written a whole piece on this subject and failed to notice the red squiggly lines?) if I then pursue the matter further in the tools menu, I discover spell-check's ample grounds for hesitation; for all it knows, I may be trying to type mosses, moss, Moses, muses, moseys, modes, musses, muss, or mossy! only when the spell-checker's algorithms develop a much higher degree of certainty about what I am trying to say would it dare to 'auto-correct' me.
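
The decision rule marks describes can be sketched in a few lines of Python. This is only an illustrative toy, not Microsoft's actual algorithm: the tiny lexicon, the `check` and `within_one_edit` names, the one-edit neighborhood, and the rule that a word is auto-corrected only when its sole candidate differs in nothing but capitalization are all assumptions made for the example.

```python
# Toy lexicon (an assumption for illustration); the real one is said to
# hold about 200,000 words.
LEXICON = {"Moses", "moss", "mosses", "muses", "modes",
           "Zoloft", "Kanchanalak", "the"}


def within_one_edit(a: str, b: str) -> bool:
    """True if a and b differ by at most one insertion, deletion, or substitution."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    i = j = 0
    edited = False
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            i += 1
            j += 1
            continue
        if edited:
            return False  # a second difference: too far apart
        edited = True
        if len(a) == len(b):
            i += 1  # substitution: advance in both strings
        j += 1      # otherwise skip the extra letter in the longer string
    return True


def check(word: str):
    """Return (verdict, candidates); verdict is "ok", "autocorrect", or "flag"."""
    if word in LEXICON:
        return "ok", [word]
    lowered = word.lower()
    candidates = sorted(
        entry for entry in LEXICON
        if within_one_edit(lowered, entry.lower())
    )
    # Assumed confidence rule: act only when there is exactly one candidate
    # and it matches the typed word apart from capitalization.
    if len(candidates) == 1 and candidates[0].lower() == lowered:
        return "autocorrect", candidates
    return "flag", candidates  # the red squiggly line, plus suggestions
```

Under these assumptions, `check("zoloft")` auto-corrects to Zoloft, while `check("moses")` is merely flagged, because Moses competes with moss, mosses, muses, and modes. A misspelling such as `check("konchanalak")` is also only flagged, and a word missing from the lexicon, such as `check("paxil")`, is flagged with no suggestions at all.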

the instability I thought I had observed simply reflected my having inadvertently toggled off the auto-correct feature for certain words whenever, in the course of my research, I used the backspace key in a certain manner. (even marks wasn't sure why the auto-correct wasn't resuming for those words when I rebooted, as it was supposed to.)

as for the miraculous recognition of Kanchanalak, marks explained that microsoft is always updating the spell-check lexicon to keep up with words that are in common current usage. Kanchanalak had been in the news in about 2000, when the lexicon for my 2002 version of microsoft word was being compiled. the surname might not be recognized by earlier lexicons—or even later ones. though the lexicon is continually revised, it is not continually expanded. rather, it is maintained at a ceiling of about 200,000 words. if it becomes too inclusive—recognizing obscure words rather than interpreting them as likely mistypings—it becomes less useful for the majority of users.

similarly, the seemingly willy-nilly capitalization of drug brand names was determined by the popularity of those brands at the time my lexicon was being compiled, together with the usual issues posed by the resemblance of the brand name to other possibly intended words. microsoft certainly doesn't ask companies to pay for capitalization, marks noted, taking no offense.

and so it was that in the space of about ten minutes, marks righted my orthographically toppling world. overarching, benevolent algorithms brought harmony and meaning to it all.

except, of course, the part about how to toggle the auto-correct back on for stephen and zoloft and microsoft. but marks said he'd have a tech guy get back to me on that.

From the Sep. 1, 2003 Issue

 



© Copyright 2003 Time Inc. All rights reserved. Reproduction in whole or in part without permission is prohibited.
