Gammon Forum

Documentation for IBM extended characters

It is now over 60 days since the last post. This thread is closed.


Posted by Anakhronizein   (10 posts)
Date Thu 19 Jan 2017 02:34 PM (UTC)

Amended on Thu 19 Jan 2017 04:22 PM (UTC) by Anakhronizein

Message
Hi,

I was wondering whether there is documentation on the IBM extended characters (cp437, https://en.wikipedia.org/wiki/Code_page_437), and what I need to send to MUSHclient from my server (escape sequences?) so that it displays these characters.

That is, I want to implement cp437 and I am unsure of the best way to get MUSHclient to recognize the data I send. I assume there is some standard for this, since a few MU*s already implement Unicode or cp437, or parts of them.


Thanks in advance!

EDIT: it occurred to me it might be enough just to send the characters themselves. Is this the usual method?

Posted by Fiendish   USA  (1,641 posts)   Global Moderator
Date Reply #1 on Thu 19 Jan 2017 07:50 PM (UTC)
Message
Convert to UTF8.

https://github.com/fiendish/aardwolfclientpackage

Posted by Anakhronizein   (10 posts)
Date Reply #2 on Fri 20 Jan 2017 02:55 PM (UTC)
Message
Fiendish said:

Convert to UTF8.


I am using PennMUSH as the MU* base. When I try to display all 255 characters, a few do not work, namely:
2, 7, 9, 10, 13, 27.

How would one go about converting to UTF-8 in the case of PennMUSH? I am using safe_chr(<cp437 int>) as my function output, so I assume the change to UTF-8 would go in whatever processes the buff character.

Posted by Nick Gammon   Australia  (21,322 posts)   Forum Administrator
Date Reply #3 on Fri 20 Jan 2017 09:42 PM (UTC)
Message
Characters less than 32 are control characters. For example 10 is a linefeed and 13 is a carriage return.

I have a feeling that someone recently got CP437 to work using an appropriate font in the output window. I might be wrong, and can't find the post right now.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Nick Gammon   Australia  (21,322 posts)   Forum Administrator
Date Reply #4 on Fri 20 Jan 2017 09:44 PM (UTC)
Message
Here it is:

http://www.gammon.com.au/forum/bbshowpost.php?id=13797

Different font, same general idea.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Anakhronizein   (10 posts)
Date Reply #5 on Sat 21 Jan 2017 08:23 PM (UTC)
Message
Nick Gammon said:

Characters less than 32 are control characters. For example 10 is a linefeed and 13 is a carriage return.

I have a feeling that someone recently got CP437 to work using an appropriate font in the output window. I might be wrong, and can't find the post right now.


10 being a linefeed makes sense, as the output goes to a new line after it is used. The rest show up blank. I am using Terminal right now, which has full support for cp437; indeed, these are all the characters I can already see:
http://i.imgur.com/tCcNkIJ.png

As you can probably tell, 9 is a tab, 10 is linefeed. The other problem-characters simply do not appear. I think the problem comes before the client (and thus the typeface), probably how PennMUSH interprets the buffer character of the integers in question. I will check M*U*S*H to see if anyone has a solution.

Posted by Nick Gammon   Australia  (21,322 posts)   Forum Administrator
Date Reply #6 on Sat 21 Jan 2017 08:25 PM (UTC)
Message
They aren't really problem characters. You shouldn't be trying to "see" control characters. If you do attempt it, the server may discard them.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Anakhronizein   (10 posts)
Date Reply #7 on Sat 21 Jan 2017 08:53 PM (UTC)
Message
Nick Gammon said:

They aren't really problem characters. You shouldn't be trying to "see" control characters. If you do attempt it, the server may discard them.


Sorry, maybe I wasn't clear with my goal. My goal is to see the entirety of cp437. For example, 3 is a heart character. I want to see the heart character, not actually what PennMUSH is currently trying to interpret the third character as (presumably a control character).

Posted by Fiendish   USA  (1,641 posts)   Global Moderator
Date Reply #8 on Sat 21 Jan 2017 09:30 PM (UTC)
Message
If you declare a specific, decidedly non-standard codepage, then everyone who connects must set up MUSHclient with that codepage in mind and modify MUSHclient on their end. If you control the server, convert to UTF-8, which MUSHclient already supports.

https://github.com/fiendish/aardwolfclientpackage

Posted by Nick Gammon   Australia  (21,322 posts)   Forum Administrator
Date Reply #9 on Sun 22 Jan 2017 03:02 AM (UTC)
Message
Anakhronizein said:

Sorry, maybe I wasn't clear with my goal. My goal is to see the entirety of cp437. For example, 3 is a heart character. I want to see the heart character, not actually what PennMUSH is currently trying to interpret the third character as (presumably a control character).


I don't see how your goal will be accomplished. For example, character 10 is a linefeed. You at least need that to get through as a linefeed or you will never have any new lines.

Therefore the inverted "O" symbol from the page you referenced ( ◙ ) will never be able to be shown (thus your goal "My goal is to see the entirety of cp437." will fail).

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Nick Gammon   Australia  (21,322 posts)   Forum Administrator
Date Reply #10 on Sun 22 Jan 2017 03:04 AM (UTC)
Message
From the Wikipedia page you mentioned:

Quote:

Although the ROM provides a graphic for all 256 different possible 8-bit codes, some APIs will not print some code points, in particular the range 1-31 and the code at 127. Instead they will interpret them as control characters. For instance many methods of outputting text on the original IBM PC would interpret the codes for BEL, BS, CR and LF. Many printers were also unable to print these characters.
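
Python's built-in cp437 codec happens to behave like the APIs described in that quote, which makes the effect easy to see. The sketch below is only an illustration in Python (the helper name `decoded_category` is made up for this example, and none of this is PennMUSH or MUSHclient code): it decodes each byte and checks whether the result is a control character.

```python
import unicodedata

def decoded_category(byte_value: int) -> str:
    """Return the Unicode general category of one cp437 byte after decoding."""
    char = bytes([byte_value]).decode("cp437")
    return unicodedata.category(char)

# Bytes 1-31 and 127 decode to control characters (category "Cc"),
# so the heart at code 3 never appears through this codec.
control_like = [b for b in range(1, 256) if decoded_category(b) == "Cc"]
print(control_like == list(range(1, 32)) + [127])

# Bytes from 128 upward do decode to graphics, e.g. 0xDB is a full block.
print(bytes([0xDB]).decode("cp437"))
```

So a stock codec alone will not produce the graphics for codes 1-31 and 127; those need an explicit mapping.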

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Nick Gammon   Australia  (21,322 posts)   Forum Administrator
Date Reply #11 on Sun 22 Jan 2017 03:07 AM (UTC)
Message
MUSHclient itself (in the input processing state machine) either ignores \r (carriage return) or uses it to clear the current line.

Similarly the \a character (bell/alert) is used to play a bell sound.

Tab (\t), ESC (0x1B) and IAC (0xFF) also have special treatment.

Thus those characters won't make it to the output rendering system.
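
As a rough illustration, the codes reported as missing earlier in the thread (2, 7, 9, 10, 13, 27) can be checked against the bytes named here. This is a sketch in Python, not MUSHclient's actual state machine: the table `SPECIAL_BYTES` and the helper `renderable` are hypothetical names, the table only covers the bytes mentioned in this thread (plus linefeed from the earlier reply), and the short descriptions for ESC and IAC are general telnet/ANSI background rather than something stated in the post.

```python
# Bytes the thread says receive special handling before rendering.
# LF (10) is included because it is needed for line breaks.
SPECIAL_BYTES = {
    0x07: "BEL (plays a bell sound)",
    0x09: "TAB",
    0x0A: "LF (new line)",
    0x0D: "CR (ignored or clears the current line)",
    0x1B: "ESC (starts ANSI escape sequences)",
    0xFF: "IAC (telnet command prefix)",
}

def renderable(byte_value: int) -> bool:
    """True if the byte is not in the special-handling table above."""
    return byte_value not in SPECIAL_BYTES

missing = [2, 7, 9, 10, 13, 27]  # codes reported as not working
unexplained = [b for b in missing if renderable(b)]
print(unexplained)  # [2]
```

Five of the six codes line up with the client-side handling described above; code 2 is not covered by anything in this post, so it is presumably dropped somewhere else.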

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Anakhronizein   (10 posts)
Date Reply #12 on Sun 22 Jan 2017 04:24 AM (UTC)
Message
Nick Gammon said:

I don't see how your goal will be accomplished. For example, character 10 is a linefeed. You at least need that to get through as a linefeed or you will never have any new lines.


Perhaps I am still being unclear. I am not asking "how can I replace the control sequences with cp437 characters", but instead I am asking "what can I do to show the cp437 characters I am missing so I can hack this into my local functions for PennMUSH".

But now I realize this is not something MUSHclient handles, it is to do with PennMUSH. Thanks for your help in any case.

Posted by Fiendish   USA  (1,641 posts)   Global Moderator
Date Reply #13 on Sun 22 Jan 2017 12:20 PM (UTC)

Amended on Sun 22 Jan 2017 12:27 PM (UTC) by Fiendish

Message
Anakhronizein said:

How would one go about converting to UTF-8


Well, I suppose you could use libiconv. http://pubs.opengroup.org/onlinepubs/9699919799/functions/iconv.html

Or, you basically* find&replace according to https://en.wikipedia.org/w/index.php?title=Code_page_437&oldid=565442465#Characters

* - slightly more complicated than that, since you're often replacing 1 byte with 2 or 3 bytes (accented letters take 2 bytes in UTF-8; box-drawing characters and most symbols take 3).

On Windows you could also call MultiByteToWideChar to convert from CP437 to UTF-16, and then WideCharToMultiByte to convert that to UTF-8.
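
A minimal sketch of the find-and-replace idea, written in Python for brevity rather than in PennMUSH's C. The names `LOW_GLYPHS`, `GLYPH_FOR` and `cp437_glyph_to_utf8` are invented for this example. It assumes the caller explicitly asks for the graphic of a given code (as a function like the one in the original question would), so it must not be applied to a whole output stream, where 10, 13 and friends still need to act as control characters. Codes 1-31 and 127 are looked up in a hand-written table taken from the cp437 character chart; everything else goes through Python's built-in cp437 codec.

```python
# Graphic glyphs that the original IBM ROM shows for codes 1-31, in order.
LOW_GLYPHS = "☺☻♥♦♣♠•◘○◙♂♀♪♫☼►◄↕‼¶§▬↨↑↓→←∟↔▲▼"
GLYPH_FOR = {i + 1: ch for i, ch in enumerate(LOW_GLYPHS)}
GLYPH_FOR[127] = "⌂"

def cp437_glyph_to_utf8(code: int) -> bytes:
    """Encode the *graphic* for one cp437 code (1-255) as UTF-8."""
    if code in GLYPH_FOR:
        char = GLYPH_FOR[code]
    else:
        char = bytes([code]).decode("cp437")
    return char.encode("utf-8")

print(cp437_glyph_to_utf8(3))          # b'\xe2\x99\xa5' (the heart)
print(len(cp437_glyph_to_utf8(0x82)))  # 2 (e with acute accent)
print(len(cp437_glyph_to_utf8(0xC4)))  # 3 (horizontal box-drawing line)
```

The output length varies per character, which is why an in-place byte substitution in a fixed buffer is not enough.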

https://github.com/fiendish/aardwolfclientpackage

The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).

To show them in your local time you can join the forum, and then set the 'time correction' field in your profile to the number of hours difference between your location and UTC time.



Information and images on this site are licensed under the Creative Commons Attribution 3.0 Australia License unless stated otherwise.
