sunwukong wrote:I've had some success porting the Unihan DB to Excel. So far I'm able to show the 4 digit U+nnnn characters but am having trouble with the 5 digit characters.
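On MS Windows, Office stores text as UTF-16, so a 5-digit (U+nnnnn) character has to be inserted as a surrogate pair, i.e. two 16-bit code units (recent versions of Office support them). The two functions below build that pair from the hex codepoint: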
'*----------------------------------------------------------*
'* Name : vbShiftRight *
'*----------------------------------------------------------*
'* Purpose : Shift 32-bit integer value right 'n' bits. *
'*----------------------------------------------------------*
'* Parameters : Value Required. Value to shift. *
'* : Count Required. Number of bit positions to *
'* : shift value. *
'*----------------------------------------------------------*
'* Description: This function is equivalent to the 'C' *
'* : language construct '>>'. *
'*----------------------------------------------------------*
Public Function vbShiftRight(ByVal Value As Long, _
                             ByVal Count As Integer) As Long
    Dim i As Integer
    vbShiftRight = Value
    For i = 1 To Count
        ' Integer division by 2 shifts one bit; valid for non-negative values.
        vbShiftRight = vbShiftRight \ 2
    Next
End Function
'*----------------------------------------------------------*
'* Name : WriteSurrogate *
'*----------------------------------------------------------*
'* Purpose : Returns the UTF-16 surrogate pair for a *
'* : CJK Extension B (U+2nnnn) codepoint. *
'*----------------------------------------------------------*
'* Parameters : Codepoint Required. 5-digit hex string to *
'* : be converted, e.g. "20000". *
'*----------------------------------------------------------*
'* Description: Based on the C++ conversion algorithm. *
'*----------------------------------------------------------*
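Function WriteSurrogate(Codepoint As String) As String
    Dim Code As Long
    Dim LeadSur As Long
    Dim TrailSur As Long
    Code = Val("&H" & Codepoint)
    ' Lead (high) surrogate: upper bits; &HD7C0 = &HD800 - (&H10000 \ &H400).
    LeadSur = vbShiftRight(Code, 10) + &HD7C0&
    ' Trail (low) surrogate: lower 10 bits. The trailing & keeps literals Long.
    TrailSur = &HDC00& Or (Code And &H3FF&)
    WriteSurrogate = ChrW(LeadSur) & ChrW(TrailSur)
End Function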
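To check the arithmetic without depending on fonts, something like the following might help. It is only a sketch: SurrogateHex and DemoSurrogateHex are hypothetical names of mine, not part of any library, and the range check is my own assumption about what input you want to allow. It returns the two code units as hex text so you can compare them against a known pair in the Immediate window.

```vba
Option Explicit

' Hypothetical helper (an illustration, not a library function): returns
' the two UTF-16 code units of a supplementary codepoint as hex text.
Public Function SurrogateHex(ByVal Codepoint As String) As String
    Dim Code As Long
    Dim Lead As Long
    Dim Trail As Long

    Code = Val("&H" & Codepoint)              ' "20000" -> 131072

    ' Assumption: only U+10000..U+10FFFF needs a surrogate pair.
    If Code < &H10000 Or Code > &H10FFFF Then
        Err.Raise 5, "SurrogateHex", "Codepoint outside U+10000..U+10FFFF"
    End If

    Lead = (Code \ &H400&) + &HD7C0&          ' same as (Code >> 10) + 0xD7C0
    Trail = &HDC00& Or (Code And &H3FF&)      ' low 10 bits
    SurrogateHex = Hex$(Lead) & " " & Hex$(Trail)
End Function

Public Sub DemoSurrogateHex()
    Debug.Print SurrogateHex("20000")         ' prints D840 DC00
    Debug.Print SurrogateHex("2A6D6")         ' prints D869 DED6
End Sub
```

Once the numbers look right, a call such as Range("A1").Value = WriteSurrogate("20000") should put the character into a cell, provided the cell font actually contains Extension B glyphs.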
I did not test the code, so check it for typos.
Good luck,
Thomas