Thanks for the help. I've managed to get the characters in the 7XXX range to show up, but not other ones. The code I'm using is:
<html>
<head>
<content="text/html; charset=UTF-8">
</head>
<body>
The quick brown fox &#x8D14;
</body>
</html>
With this, I get a character 贔 that looks like three shellfish, one on top of two others, but they are each one stroke short.
Generally, though, I just get a dot. Any ideas on what I'm doing wrong?
Also, would it be possible to get the text dump with the Korean as well? I think the easiest format would be XXXX.Y [tab] [field type] [data] [tab], etc., with one character per line, each line ending in a carriage return. Right now, the character number repeats on a separate line for each type of data.
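To make the layout I have in mind concrete, here is a rough Python sketch. It assumes the current dump has one line per field, in the form character number, tab, field type, tab, data; that is only my guess at the existing format. The function name `merge_records`, the field names and the sample values are made up purely for illustration.

```python
def merge_records(lines):
    """Collapse repeated 'ID<TAB>field<TAB>data' lines into one line per ID."""
    merged = {}  # dicts keep insertion order, so IDs stay in first-seen order
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        # Split on the first two tabs only, so the data itself may contain tabs.
        char_id, field_type, data = line.split("\t", 2)
        merged.setdefault(char_id, []).extend([field_type, data])
    return ["\t".join([char_id] + fields) for char_id, fields in merged.items()]


# Hypothetical sample input: the IDs, field names and values are placeholders.
sample = [
    "7A00.0\tfieldA\tvalue1",
    "7A00.0\tfieldB\tvalue2",
    "8D14.0\tfieldA\tvalue3",
]

for row in merge_records(sample):
    print(row)
```

Each output line then carries the character number once, followed by alternating field type and data columns.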
Sorry if I'm asking too much, but this is an incredible project you have, and it would be a shame (and a waste of time) to re-do it.
Best regards
Benjamin Barrett, Graduate Student
Department of Linguistics
University of Washington
students.washington.edu/bjb5
gogaku@ix.netcom.com