HP3000-L Archives

August 2001, Week 3

HP3000-L@RAVEN.UTC.EDU

From: Wirt Atmar
Date: Wed, 15 Aug 2001 15:56:08 EDT

Larry asks:

> Does anyone have a code translator from Unicode to ASCII to be used in an
>  automated conversion process?

For the lower-register codes (dec 0 to 127), ASCII and Unicode are identical,
with the exception that Unicode, as commonly stored, is a 16-bit
representation, while ASCII is a 7-bit code normally carried in 8-bit bytes.
To convert one to the other, all you need to do is strip off or add eight zero
bits (the high-order byte) to whichever code value you're starting with.
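
As a minimal sketch of that zero-padding in Python 3: the function names here
are made up for illustration, and big-endian byte order is an assumption (a
little-endian file would put the zero byte second).

```python
def ascii_to_utf16be(data: bytes) -> bytes:
    """Widen 7-bit ASCII to 16-bit big-endian units by adding a zero high byte."""
    out = bytearray()
    for b in data:
        if b > 0x7F:
            raise ValueError(f"byte 0x{b:02X} is not 7-bit ASCII")
        out += bytes((0x00, b))  # high byte is always zero in this range
    return bytes(out)


def utf16be_to_ascii(data: bytes) -> bytes:
    """Narrow 16-bit big-endian units to ASCII by dropping the zero high byte."""
    if len(data) % 2:
        raise ValueError("16-bit input must have an even number of bytes")
    out = bytearray()
    for i in range(0, len(data), 2):
        hi, lo = data[i], data[i + 1]
        if hi != 0x00 or lo > 0x7F:
            raise ValueError(f"unit 0x{hi:02X}{lo:02X} is outside the ASCII range")
        out.append(lo)
    return bytes(out)
```

Both functions refuse anything outside dec 0 to 127 rather than guessing, so
the upper-register cases have to be handled separately.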

For the upper-register codes (dec 128 to 255), the same basic rules apply,
with the caveat that Unicode uses the ISO Latin-1 (ISO 8859-1) symbol set
there, which is the same one that HTML uses. What that means is that it's not
exactly equivalent to the Windows extended Latin-1 code set (code page 1252).
In ISO Latin-1, there are 32 entries (dec 128 to 159) that hold no printable
characters in HTML/Unicode/ISO Latin-1 but most of which are filled in in
Windows.
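
To see which of those 32 positions Windows fills in, and where the same
characters live in Unicode, something like the following Python 3 sketch
should do. It assumes Python's built-in "cp1252" codec is an acceptable
stand-in for the Windows code set; the function name is made up for
illustration.

```python
def windows_specials() -> dict:
    """Map each byte in 0x80..0x9F that cp1252 assigns to its Unicode character."""
    table = {}
    for byte in range(0x80, 0xA0):
        try:
            table[byte] = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            # Python's cp1252 codec leaves a few of these bytes unassigned.
            continue
    return table


if __name__ == "__main__":
    for byte, char in sorted(windows_specials().items()):
        print(f"Windows 0x{byte:02X} -> Unicode U+{ord(char):04X}")
```

Every character in the resulting table sits above code point 255 in Unicode,
which is why a simple zero-padding of the byte does not work for this range.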

The Windows characters appear elsewhere in the Unicode set, but they're not
where Bill put them. Most likely this won't be a problem, but that's the only
translation that's actually necessary. Everything else is just a byte
re-registration problem.
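
Put together, a translator from Windows text to 16-bit Unicode might look
roughly like this Python 3 sketch. The function name is hypothetical,
big-endian output is an assumption, and Python's "cp1252" codec is again
assumed to match the Windows code set.

```python
import struct


def cp1252_to_utf16be(data: bytes) -> bytes:
    """Convert Windows code page 1252 bytes to 16-bit big-endian Unicode units."""
    out = bytearray()
    for byte in data:
        if 0x80 <= byte <= 0x9F:
            # The only real translation: look up where Unicode keeps the
            # Windows character. Raises UnicodeDecodeError for the few bytes
            # that the codec leaves unassigned.
            code = ord(bytes([byte]).decode("cp1252"))
        else:
            # Everything else keeps its value and just gains a zero high byte.
            code = byte
        out += struct.pack(">H", code)
    return bytes(out)
```

Going the other way is the mirror image: units below 256 outside dec 128 to
159 lose their high byte, and the Windows characters are looked up in reverse.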

Wirt Atmar
