Vreleksá Forum Index Vreleksá
The Alurhsa Word for Constructed: Creativity in both scripts and languages
 

Problem with signature change

 
Tolkien_Freak



Joined: 26 Jul 2007
Posts: 1231
Location: in front of my computer. always.

Posted: Thu Sep 10, 2009 1:52 am    Post subject: Problem with signature change

I wish to change my signature to this: (my probably awful translation of a quote from Galileo)

「私たちに理性と知性を与えた神は、その神と同じのは、その使いを捨てるつもりがあると信じざるを得なさそうではない。」
 -ガリレオ・ガリレイ

Somehow the forum considers this above the 255 character limit, even though I count ~60.

Ittai nani suru no yo?
eldin raigmore
Admin


Joined: 03 May 2007
Posts: 1621
Location: SouthEast Michigan

Posted: Thu Sep 10, 2009 11:20 pm

The 255 "character" limit is actually a 255 "byte" or "octet" limit.
If you look at Unicode you'll see that any character-set with more than 256 characters in it takes two bytes per character. (Unless there are more than 65536 characters in the characterset, in which case, I think, Unicode just whimpers and rolls over and sucks its thumb.) That's going to be true of any system based on Chinese logograms, for instance: such as kanji.
Also, anytime you switch from one system to another, you use two bytes (or one? or three?) as a "shift" character.
So, for all I know, then, every time you shift from hiragana to kanji, or from kanji to hiragana, should count as two bytes as well.
Write the entire thing in hiragana or katakana, leaving out the kanji. You should be able to get 125 characters in.
Or, write it all in kanji, (say Mandarin instead of Japanese) and you should be able to get 60 characters in. (Or see how many you can get in.)
_________________
"We're the healthiest horse in the glue factory" - Erskine Bowles, Co-Chairman of the deficit reduction commission
kyonides



Joined: 28 Aug 2008
Posts: 301

Posted: Fri Sep 11, 2009 1:05 am

Well, yes: kanji, hiragana, katakana and so on are all treated as multibyte characters (there is no "single byte per symbol" form available for them).
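
To make the difference between characters and bytes concrete, here is a minimal Python sketch. The thread never says which encoding the forum stores signatures in, so the four encodings tried below are assumptions, and `byte_len` is a hypothetical helper name used only for illustration.

```python
# Compare the character count of a string with its encoded byte count.
# Assumption: the forum's real storage encoding is unknown, so a few
# common candidates for Japanese text are tried side by side.

def byte_len(text: str, encoding: str) -> int:
    """Return how many bytes `text` occupies in the given encoding."""
    return len(text.encode(encoding))


sample = "私たち"  # one kanji followed by two hiragana

for enc in ("utf-8", "utf-16-le", "shift_jis", "euc_jp"):
    print(f"{enc:10} {len(sample)} characters -> {byte_len(sample, enc)} bytes")
```

For this sample every encoding tried needs more than one byte per character: three in UTF-8 and two in each of the others. A limit that is enforced in bytes is therefore reached well before the same number of kana or kanji.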
_________________
Seos nivo adgene Kizne tikelke

The Internet might be either your best friend or your worst enemy. It just depends on whether or not she has a bad hair day.
Tolkien_Freak



Joined: 26 Jul 2007
Posts: 1231
Location: in front of my computer. always.

Posted: Fri Sep 11, 2009 1:33 am

Ah, didn't know that. I'll try it a different way then.
Tolkien_Freak



Joined: 26 Jul 2007
Posts: 1231
Location: in front of my computer. always.

Posted: Fri Sep 11, 2009 1:36 am

Well, all kana didn't work either, and I don't know enough Chinese to put it in that, so oh well.

Anyone want it for a translation challenge? The English text is thus:
'I do not feel obliged to believe that the same God who has endowed us with sense, reason and intellect has intended us to forgo their use.'
Aeetlrcreejl



Joined: 08 Jun 2007
Posts: 839
Location: Over yonder

Posted: Fri Sep 11, 2009 11:20 pm

Mí èdrù mí senít díwà déonà abaket, budet, int cŏ tés èxat.
_________________
Iwocwá ĵọṭãsák.
/iwotSwa_H d`Z`Ot`~asa_Hk/
[iocwa_H d`Z`Ot`_h~a_Hk]
StrangeMagic
Admin


Joined: 18 Apr 2007
Posts: 640

Posted: Sat Sep 12, 2009 3:35 pm

Tolkien_Freak, I have changed the character limit on signatures; hopefully it works now. =D
Tolkien_Freak



Joined: 26 Jul 2007
Posts: 1231
Location: in front of my computer. always.

Posted: Sat Sep 12, 2009 4:19 pm

Woo, thank you! Works now.
Baldash



Joined: 19 May 2009
Posts: 86
Location: Sweden

Posted: Thu Sep 17, 2009 12:00 pm

eldin raigmore wrote:
If you look at Unicode you'll see that any character-set with more than 256 characters in it takes two bytes per character. (Unless there are more than 65536 characters in the characterset, in which case, I think, Unicode just whimpers and rolls over and sucks its thumb.) That's going to be true of any system based on Chinese logograms, for instance: such as kanji.
Also, anytime you switch from one system to another, you use two bytes (or one? or three?) as a "shift" character.
So, for all I know, then, every time you shift from hiragana to kanji, or from kanji to hiragana, should count as two bytes as well.
Write the entire thing in hiragana or katakana, leaving out the kanji. You should be able to get 125 characters in.
Or, write it all in kanji, (say Mandarin instead of Japanese) and you should be able to get 60 characters in. (Or see how many you can get in.)

That's not how UTF-8 works. UTF-8 uses a variable number of bytes per character, and it has only 128 single-byte characters, the same ones as in ASCII. The highest bit of the first byte isn't used for an additional 128 characters (as in ISO-8859-1), but to indicate that the character takes more than one byte. The number of bytes is indicated in unary in the first byte. A two-byte character has the shape 110xxxxx 10xxxxxx, a three-byte character is 1110xxxx 10xxxxxx 10xxxxxx, and a four-byte character is 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.

The system could be expanded, but four bytes is the limit of the current standard. That gives 2^21 = 2097152 theoretically possible characters, because the four-byte form has 21 payload bits and every shorter form can be padded out to it: 0aaaaaaa, 1100000a 10aaaaaa, 11100000 1000000a 10aaaaaa, and 11110000 10000000 1000000a 10aaaaaa all carry the same value. (As far as I know the standard only allows the shortest form of each, and Unicode itself stops at U+10FFFF.)

There are no multiple character sets, just a single one, and there are no "shift" characters that jump between character sets. So you can switch from hiragana to kanji as often as you want without affecting the size. But any non-ASCII character will still be at least two bytes long; kana and the common kanji take three bytes each.

I said "UTF-8" because I don't know whether what you described is true of some other encoding; I haven't heard of one. I'm not sure, but I think UTF-16 works in a similar way to UTF-8, except that it works with 16-bit blocks instead of 8-bit blocks.
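
The bit patterns described in this post can be checked with a short Python sketch. This is only an illustration under stated assumptions: `encode_utf8` is a hypothetical hand-written helper, not part of any library, and the loop compares its result with Python's built-in encoder for four sample characters.

```python
def encode_utf8(code_point: int) -> bytes:
    """Encode one Unicode code point by hand using the UTF-8 bit patterns."""
    if code_point < 0 or code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if code_point < 0x80:       # 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:      # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:    # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


# One sample each for the one-, two-, three- and four-byte forms.
for ch in "aé私𠮷":
    encoded = encode_utf8(ord(ch))
    assert encoded == ch.encode("utf-8")  # agrees with Python's own encoder
    bits = " ".join(f"{b:08b}" for b in encoded)
    print(f"U+{ord(ch):04X} {len(encoded)} byte(s): {bits}")
```

The helper always picks the shortest form that fits the code point, which is why a kanji such as 私 comes out as three bytes no matter which characters surround it.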
eldin raigmore
Admin


Joined: 03 May 2007
Posts: 1621
Location: SouthEast Michigan

Posted: Fri Sep 18, 2009 8:39 pm

Baldash wrote:
That's not how UTF-8 works. UTF-8 uses a variable number of bytes per character, and it has only 128 single-byte characters, the same ones as in ASCII. The highest bit of the first byte isn't used for an additional 128 characters (as in ISO-8859-1), but to indicate that the character takes more than one byte. The number of bytes is indicated in unary in the first byte. A two-byte character has the shape 110xxxxx 10xxxxxx, a three-byte character is 1110xxxx 10xxxxxx 10xxxxxx, and a four-byte character is 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.

The system could be expanded, but four bytes is the limit of the current standard. That gives 2^21 = 2097152 theoretically possible characters, because the four-byte form has 21 payload bits and every shorter form can be padded out to it: 0aaaaaaa, 1100000a 10aaaaaa, 11100000 1000000a 10aaaaaa, and 11110000 10000000 1000000a 10aaaaaa all carry the same value. (As far as I know the standard only allows the shortest form of each, and Unicode itself stops at U+10FFFF.)

There are no multiple character sets, just a single one, and there are no "shift" characters that jump between character sets. So you can switch from hiragana to kanji as often as you want without affecting the size. But any non-ASCII character will still be at least two bytes long; kana and the common kanji take three bytes each.

I said "UTF-8" because I don't know whether what you described is true of some other encoding; I haven't heard of one. I'm not sure, but I think UTF-16 works in a similar way to UTF-8, except that it works with 16-bit blocks instead of 8-bit blocks.
Thanks.
_________________
"We're the healthiest horse in the glue factory" - Erskine Bowles, Co-Chairman of the deficit reduction commission