Newsgroups: misc.misc
From: hoey@ai.etl.army.mil (Dan Hoey)
Date: 15 May 90 15:48:30 GMT
Subject: Re: Roman Number Arcana

In a March, 1990 issue of Cecil Adams's column, The Straight Dope, he
fields the query:

  This is important!  What are the Roman numerals for 1990?
  Possible solutions: 1) MXM, 2) MCMXC, or the cumbersome
  3) MDCCCCLXXXX.  Help!
    --Anonymous, Chicago, Ill.

He missed his chance:  The standard abbreviation for Illinois, IL,
is a Roman numeral!  Other Roman numeral states are ID and MD.  (And
is Nebraska NE or NB?  The u*x quiz program thinks either is fine.)
Top-level domains that are Roman numerals include CL, MIL, and MX.
(What are CL and MX?  Nation codes?  Weapons systems?)  But NIC's list
doesn't include ID, so there are probably other nation codes not
listed as well.

Cecil's response begins:

  By God, this *is* urgent.  Even now sweaty movie moguls are
  undoubtedly wondering:  What the hell are we going to do about the
  date at the end of the credits?  Well much as I'd like to cash in
  selling Roman-numeral consulting services to Hollywood, this time
  you guys are on your own.  There is not now nor has there ever
  been any universally accepted method of styling Roman numerals.
  For that matter, it's only been in the last few hundred years that
  there's been any general agreement on what symbols stand for which
  quantities.

He continues, dropping little gems of information.  To quote and
paraphrase:

  Many authorities think it's only coincidence that the number
  M happened to look like the letter M (ditto for C).  It's
  unlikely they stood for the Latin *mille* or *centum*.

  As often as not, the Romans indicated 1,000 not with M but with
  the lazy-8 infinity symbol or else something along the lines of
  (I)--that is, a vertical stroke framed by exaggerated parentheses.

  The so-called subtractive principle, i.e. that IV=5-1=4, was
  used only sporadically by the ancient Romans and their medieval
  successors and never in a systematic way.

  Old documents and inscriptions include LXL, 90; XXCIII, 83;
  LXXIIX, 78; and even IIIIX, 6.  A popular German arithmetic
  textbook published in 1524 gives 99 as XCIX.

He concludes:

  So where does this leave us?  Well, if we are truly desperate for
  moral guidance, we may turn to the world of computers.  Cecil
  happens to have a desktop publishing program known as Xerox
  Ventura Publisher, an amazing bit of software thought to have been
  used originally to torture heretics during the Inquisition.  Among
  other things it will convert numbers up to 9,999 into Roman
  numerals for use as page numbers.  Punching in 1990, we come up
  with MCMXC, an unsurprising and somehow comforting result.  But if
  we then try 1999, we get MIM.  Why MIM for 1999 and not MXM for
  1990?  Lord knows.  Worse, if we enter 9,999 we get what appears
  to be IZ.  I have scoured my reference books in vain for any
  indication that Z was ever used for 10,000, which moves me to
  write the whole thing off as the product of malicious computer
  geekery, an impression that actually trying to *use* Ventura will
  certainly confirm.

  No doubt all this numerological uncertainty is distressing.  But
  look on the bright side:  It also gives us a strange and terrible
  freedom.  You can use any damn notation for 1990 you want to, and
  no one will be able to say you're wrong.  It may not give you the
  same rush as dancing on the Berlin Wall, but in post-Reagan
  America you make do with what you get.

In response to an earlier version of this message, Alan Bawden
reminded me that Common Lisp specifies a Roman numeral printer option.
In the implementations I know of, each decimal digit is converted
separately to the Roman numeral for its value (e.g., MCMXC for 1990).
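
For the curious, the Common Lisp option is presumably the ~@R format
directive, as in (format nil "~@R" 1990).  The digit-by-digit scheme
itself can be sketched in Python as follows.  This is only an
illustration, not any particular Lisp's implementation; the names
digit_numeral and to_roman and the 1..3999 range limit are assumptions
made for the sketch.

```python
def digit_numeral(digit, one, five, ten):
    """Roman numeral for one decimal digit, given that place's symbols."""
    if digit == 9:
        return one + ten            # e.g. IX, XC, CM
    if digit == 4:
        return one + five           # e.g. IV, XL, CD
    if digit >= 5:
        return five + one * (digit - 5)
    return one * digit


def to_roman(n):
    """Convert n digit by digit (assumed range 1..3999 for this sketch)."""
    if not 1 <= n <= 3999:
        raise ValueError("this sketch only handles 1..3999")
    thousands, rest = divmod(n, 1000)
    hundreds, rest = divmod(rest, 100)
    tens, ones = divmod(rest, 10)
    return ("M" * thousands
            + digit_numeral(hundreds, "C", "D", "M")
            + digit_numeral(tens, "X", "L", "C")
            + digit_numeral(ones, "I", "V", "X"))


print(to_roman(1990))   # MCMXC
print(to_roman(1999))   # MCMXCIX
```

Because each digit is handled on its own, a scheme like this can never
produce MXM or MIM: subtraction never reaches across decimal places.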

Now that we know there are no standards, it is clearly beneficial to
use MXM and MIM, since they are shorter and snazzier than the
conservatively crippled MCMXC and MCMXCIX, and there is no ambiguity.
But more shortening is possible if we risk ambiguity with more general
subtraction.  My question is whether 1989 should have been represented
as MIXM or MXIM.

I at first thought that XIM should be preferred over IXM for 989,
since the former cannot be parenthesized wrong.  However, this does
not extend to MXIM vs MIXM, so I have come to discard
misparenthesizability as a criterion.
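
The parenthesization claim can be checked mechanically.  The sketch
below tries every way of fully parenthesizing a numeral string and
collects the values that result.  The combining rule is an assumption
made for illustration: a group worth less than the group to its right
is subtracted from it, otherwise the two are added.  The names
combine and readings are likewise hypothetical.

```python
VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def combine(left, right):
    # Assumed rule: a smaller group written before a larger one is subtracted.
    return right - left if left < right else left + right


def readings(numeral):
    """Set of values obtained over all full parenthesizations of numeral."""
    if len(numeral) == 1:
        return {VALUES[numeral]}
    results = set()
    # Split at every position; each side is parenthesized recursively.
    for split in range(1, len(numeral)):
        for left in readings(numeral[:split]):
            for right in readings(numeral[split:]):
                results.add(combine(left, right))
    return results


for numeral in ("XIM", "IXM", "MXIM", "MIXM"):
    print(numeral, sorted(readings(numeral)))
```

Under that rule XIM has the single reading 989, while IXM can also be
read as 991.  With the leading M attached, MXIM admits 1989, 2009, and
2011, and MIXM admits 1989, 1991, 2009, and 2011, so neither four-letter
form is safe from misparenthesization.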

If the parsing algorithm is ``process numerals from right to left,
subtracting if the numeral is less than the currently accumulated
value, otherwise adding'', either MXIM or MIXM will work.  If instead
the test is ``...  subtracting if the numeral is less than the
previous numeral'', then MIXM works and MXIM doesn't.  The
friendliness principle therefore dictates MIXM and the first
algorithm.
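
The two tests are easy to compare side by side.  The sketch below is a
literal Python reading of the two descriptions above; the function
names parse_by_total and parse_by_previous are made up for the
illustration, and "previous numeral" is taken to mean the numeral
immediately to the right.

```python
VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def parse_by_total(numeral):
    """First test: subtract if the numeral is less than the running total."""
    total = 0
    for symbol in reversed(numeral):
        value = VALUES[symbol]
        if value < total:
            total -= value
        else:
            total += value
    return total


def parse_by_previous(numeral):
    """Second test: subtract if the numeral is less than the one to its right."""
    total = 0
    previous = 0                    # nothing lies to the right of the last numeral
    for symbol in reversed(numeral):
        value = VALUES[symbol]
        if value < previous:
            total -= value
        else:
            total += value
        previous = value
    return total


for numeral in ("MXIM", "MIXM"):
    print(numeral, parse_by_total(numeral), parse_by_previous(numeral))
# MXIM 1989 2009
# MIXM 1989 1989
```

Under the second test the X in MXIM is compared with the I to its
right, found to be larger, and added, which gives 2009.  One caveat on
the first test: read literally, it mishandles runs of equal numerals,
since parse_by_total("III") comes out as 1 (the third I is less than
the accumulated 2), so a usable version would need some extra
condition for repeated symbols.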

Dan