2 yabba May 27, 2008 09:16

It should work if you make sure that all your tables are set to UTF-8 or whatever you are using. For what it's worth, I've been blogging using Japanese characters for quite a while.
ALTER TABLE table_name CHARACTER SET utf8;
That only changes the table's default character set; existing fields keep their own character set and collation, so be careful.
It might also be good to set the database's default encoding to UTF-8 as well. Something like the following should do that.
ALTER DATABASE db_name CHARACTER SET utf8;
Already set to utf-8 (or is it utf8 ... or UTF8?) in the database, compliments of Afwas' well-documented method to make that happen. This is an installation that has no past so it was easy to make it be purely utf8 yah?
Anyway the problem exists in this utf8'd installation so I figure there must be something b2evolution does to a post with non-English characters to get it to store properly.
hmmm... I can just follow the comment path I guess because when ¥åßßå leaves a comment on my web it stores and plays back as those characters. This would be an old-fashioned "not utf-8" database by the way. Except when I had to restore stuff due to a mistake on my part. I then got junk where his name belongs so I went in and manually altered the database to have the right stuff in it.
hmmm... I can also fake a messages.po file and have "¥åßßå" be the translation for something and see if I can store that. I noticed the issue with the French translation of Summary Demo because it was the earliest entry in the file and therefore the earliest for me to see. I did not dig deep to find out whether ALL non-English characters get bonked.
Must be something it is doing to make it happen!
#: ..\..\..\sanity.php:1
#: ..\..\..\displaced.php:7
msgid "¥åßßå"
msgstr "Réplace Usér and Réboot"
:D
¥
Nah man msgid is the English bit. msgstr is the fancy character bit. Like this:
#: ..\..\..\in\sanity.php:1
#: ..\..\..\displaced\oxygen.php:7
msgid "All your base are belong to us"
msgstr "Usér ¥åßßå friéd sérvér"
BTW YES I will use that in my test file. Drop it into the top of the French translation, then hack off about 9000 lines from the end of it. When I first ran my little plugin I thought I was actually going to pull a ¥åßßå and take down my server given how long it was taking to read a line and figure out what to do with it ;)
I think this is going to be a dead end for me.
I can post all sorts of characters no matter what my locale is set to, and I can see them stored in the database as the actual characters, but I can't store them as they are via a fairly straightforward query.
I tried echoing the values before and after "DB->escape" and they do not show properly that way.
When the actual output file is stored in a format that will open up in a viewing window (like .php for _global.php) I get ? for non-English characters, but when I make it be a file that automagically gets downloaded the characters are properly visible. Creating the file happens before trying to fill in the database, but hey I was looking for clues and found that I cannot store the file in the format it needs to be used in ... but I haven't made .php files be forced to a download, which will probably fix that problem.
I looked into (but haven't played with) how I can take advantage of b2_htmltrans in conf/_formatting - it seems to be all about displaying what the database has in it. Same with the convert_chars and convert_charset functions - useful for displaying but doesn't help me actually store stuff like à or é or whatever.
Obviously b2evolution can store these characters in the database, but I sure can't figure out how it gets done :'(
Can you run this query and check if the blog name is correct?
mysql_query("UPDATE evo_blogs SET blog_name='йёщяъ' WHERE blog_ID=1");
You may want to save the file in UTF-8 first.
Wow those are really cool characters! English is boring :(
Okay I'll give it a shot, but I don't understand "save the file in UTF-8 first". What file? I will backup the database first, then I figured on doing a copy/paste from the forum to a file with enough stuff to connect to the database. I guess the part I don't understand is how I would save a file with a charset ?
Wow those are really cool characters! English is boring
There are some more funny letters, but I'll keep them for other queries ;)
You need to save the file (the one with the code to connect to the db, where you are going to paste this query) in UTF-8 or you'll see ??????. Open it in a text editor, choose Save As and pick UTF-8 as the encoding.
I have no idea how to do it on Macs :( , but I'm pretty sure your editor can do it.
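For anyone wondering where the question marks come from, here is a rough sketch. It is in Python rather than PHP only because it runs without a database, and the file name quick_hack_demo.txt is made up for the example. It assumes the "ANSI" save behaves like a single-byte charset such as Latin-1, which has no room for Cyrillic letters.

```python
import os
import tempfile

text = "йёщяъ"

# A single-byte "ANSI"-style encoding such as Latin-1 has no slot for
# Cyrillic, so each letter degrades to "?" when forced into it.
as_latin1 = text.encode("latin-1", errors="replace")

# UTF-8 can represent every letter; each of these takes two bytes.
as_utf8 = text.encode("utf-8")

# Saving "as UTF-8" just means writing those bytes to disk.
# quick_hack_demo.txt is a hypothetical file name.
path = os.path.join(tempfile.mkdtemp(), "quick_hack_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(text)
with open(path, "rb") as f:
    on_disk = f.read()

print(as_latin1)
print(len(as_utf8), on_disk == as_utf8)
```

So the editor's encoding choice decides which bytes land in the file, and once a letter has been written as "?" there is nothing left to recover it from.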
Actually I just found out the hard way that good old notepad was the friendliest for this task. Uploading and testing ... NOW.
Okay that was interesting. So okay notepad saved the file as UTF-8 after telling me I would lose my stuff if I saved it ANSI style. I then opened the file with html kit (my preferred editor) and saw your characters as pretty much junk. Specifically, I see йёщÑÑŠ in that editor, but re-opening in notepad shows me йёщяъ so I feel confident the characters are stored properly in my quick_hack.php file.
Running that file results in the following on my monitor: 
There is, of course, nothing other than that to view when I view-source because quick_hack.php was not designed to actually output anything.
Now when I view the database I get the following in that field: йёщÑÑŠ which, of course, is what is showing up when I actually look at the blog.
For the record: the collation on that particular field in that particular table is utf8_unicode_ci, as it is for all other fields that have anything in that column and all tables in that database. This is what I get after having followed the instructions that Afwas rounded up and documented concerning how to make a new b2evolution blog be friendly with UTF-8.
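The junk in that field looks like what happens when UTF-8 bytes get read as a single-byte charset somewhere between the script and the table. A rough sketch of that, again in Python so it runs without a database. Latin-1 is an assumption here (it was MySQL's usual default connection charset at the time); the thread does not say what this connection was actually using.

```python
original = "йёщяъ"

# What a UTF-8 saved script sends: the UTF-8 bytes of the string.
utf8_bytes = original.encode("utf-8")

# Assumption: the receiving side treats the bytes as Latin-1, so every
# byte is taken to be a separate one-byte character.
misread = utf8_bytes.decode("latin-1")

# Nothing is lost, though: reversing the misreading recovers the text.
recovered = misread.encode("latin-1").decode("utf-8")

print(len(original), len(misread))
print(recovered == original)
```

Five letters turn into ten characters, which is why the stored value looks about twice as long as it should. If this is what is going on, the column collation can be perfectly fine and the value will still go in mangled.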
I forgot one more thing...
Try adding this stuff right after select_db:
mysql_query("SET NAMES 'utf8'");
mysql_query("SET collation_connection='utf8_general_ci'");
mysql_query("SET collation_server='utf8_general_ci'");
mysql_query("SET character_set_client='utf8'");
mysql_query("SET character_set_connection='utf8'");
mysql_query("SET character_set_results='utf8'");
mysql_query("SET character_set_server='utf8'");
If that doesn't help, try pasting the query into phpMyAdmin.
HOORAY!
Hey if you get rid of the fourth character it probably stops being a word but it also looks sort of like it could be "noob" ;)
Can you explain what these 7 lines do? Also, for my purposes here, would I need to do these SET commands once in the plugin or each time I am about to perform another INSERT or UPDATE command?
But yeah HOORAY because now I know of a method that actually lets me store non-English characters in a database without resorting to posting it via the back office.
Great!
These lines do the same thing as $db_config['connection_charset'] = 'utf8';
I think you only have to do it once right after you select a db.
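And you can probably delete most of these lines; try leaving only the first one (SET NAMES), since it already sets character_set_client, character_set_connection and character_set_results.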
Cool. I'll report back any and all successes I have with this. At the moment my mission has changed to "mowing the lawn". Funny how something that gets no water still manages to grow enough to become an issue...
It might be worth changing your table to UTF-8 ..... dunno if you need to fake a locale when storing to make it all happen because English works for me ;)
¥