Minor (?) Display Problem
Hartmuto
9th of April 2007 (Mon), 19:23
Hi,
I've managed to upgrade 1.5 to 2.0 to 2.02 in a second attempt. Everything seems to work fine now. Thanks to Pekka for this great software and this forum for the invaluable support. I have a small display issue left to fix.
There is a strange character between the label and the form field on the list page of an exhibition. It happens with all fields, and IE7 and Firefox each show it as a different stray symbol. I've checked other galleries but have not seen it anywhere else.
Could someone help please?
Hartmuto
12th of April 2007 (Thu), 19:05
Hmmh, no responses.
Does nobody else have this problem, or is it too minor? Thanks.
jeronimo
13th of April 2007 (Fri), 04:41
You can check the language files to see if there is anything strange there.
The names (display, category and listing type) are read from one of the language files. There might be something wrong there.
Hartmuto
16th of April 2007 (Mon), 09:27
Thanks for the tip, jeronimo. I've checked the language file but can't see anything wrong there. The translations appear to be correct. Also, the problem seems to be not with the translated word but with the space (or whatever it is) between the word and the input field. I also checked the database, and the collation of every table is "latin1_swedish_ci". Could this be a problem?
Pekka
16th of April 2007 (Mon), 13:32
I recall seeing this problem during development, but I can't remember what it was... Maybe re-uploading the original language files as ASCII would help?
If you edit language files, use EE editor or some other simple text editor so that the editor does not add special control characters.
Database collation does not matter, those texts are not coming from database.
Hartmuto
16th of April 2007 (Mon), 23:35
Thanks Pekka. I did not touch the language files, but your suggestion got me to look at the character sets. Bingo, here was the problem. My browsers (IE7 and Firefox) both had the character set defined as UTF-8. I changed the setting to 'Auto select' and the correct character set (Western European ISO) got loaded.
I wonder, however, if there is still a problem somewhere in the code. Without 'Auto Select' my webpage loads with the UTF-8 character set, but when I load for example Pekka's gallery it loads with the iso-8859-1 character set.
NB: I have installed plain vanilla and have not modified any stylesheet or templates.
Hartmuto
16th of April 2007 (Mon), 23:45
It was perhaps premature to say the problem has been fixed. With the above change to 'Auto Select' in the browser, when I changed the language from English to German, I had the problem again, i.e. the character set showed UTF-8. When I changed back to English, the character set remained UTF-8. Can someone perhaps point me to where else I could look? Many thanks.
Hartmuto
17th of April 2007 (Tue), 21:22
I solved it this time.
The problem was that Apache had a default directive switched on to use UTF-8. I commented that out so that the charset from the meta tag in the HTML header is used, and it works.
Thanks, everyone, for your help. ;)
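The fix above fits how browsers pick an encoding: a charset named in the HTTP Content-Type header takes precedence over the meta tag in the page, so a server-wide default can override what the templates declare. Below is a minimal Python sketch of that precedence. The function name and the sample header values are illustrative assumptions, and the Apache directive involved was presumably AddDefaultCharset, which the post does not name.

```python
from email.message import Message
from typing import Optional


def effective_charset(content_type_header: Optional[str],
                      meta_charset: Optional[str]) -> Optional[str]:
    """Return the charset a browser would likely use (BOMs and sniffing ignored).

    Hypothetical helper: the charset in the HTTP header wins over the one
    declared in the page's meta tag.
    """
    if content_type_header:
        msg = Message()
        msg["Content-Type"] = content_type_header
        header_charset = msg.get_content_charset()  # lower-cased, or None
        if header_charset:
            return header_charset
    return meta_charset.lower() if meta_charset else None


# Server default switched on: the header overrides the template's meta tag.
print(effective_charset("text/html; charset=UTF-8", "iso-8859-1"))
# Server default commented out: the meta tag decides.
print(effective_charset("text/html", "iso-8859-1"))
```

With the header charset present the first call yields utf-8; without it the second call falls back to the meta tag's iso-8859-1.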
MMCM
23rd of April 2007 (Mon), 12:24
After a long pause without enabling my test EE 2.x gallery for real use, I'm currently upgrading my EE 1.5RC4 gallery to EE 2.xx again (with a lot of modifications to upgrade.php, to keep most of the site structure intact - I'll post those modifications later), and this "bug" occurred on my site too.
I'm using UTF-8 throughout the whole site: the DB is UTF-8, all language files are converted to UTF-8, and the DB connections are converted to UTF-8 too. This is the only safe way to handle multilanguage characters, because German and Polish accented characters can't be shown together in ISO-8859-1 or ISO-8859-2. UTF-8 seems the only way to go.
I tracked down the problem to the following files:
.../templates/pages/ee_2_default/list/default_XHTML_content.php
.../templates/pages/ee_2_default_UDM/list/default_XHTML_content.php
and many more...
Those files contain a raw 0xa0 character, which does not display correctly when the page is served as UTF-8.
In ISO-8859-1 this character can be used instead of &nbsp; (the non-breaking space), but it doesn't work with UTF-8. After replacing it with the HTML &nbsp; entity using the following command, the problem was fixed.
find . -name "*.php" -exec sed -i 's/\xa0/\&nbsp;/g' {} \;
I would recommend doing this for the next update, because this solution should work with every character set.
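For anyone without find and sed at hand, the same byte-level replacement can be sketched in Python. This is a minimal sketch, not part of EE: the function names are made up for illustration, and it assumes the template files are still single-byte ISO-8859-1. In a file that is already UTF-8, 0xa0 also occurs as the second byte of multi-byte characters, so a blind replacement would corrupt it.

```python
from pathlib import Path


def replace_nbsp_bytes(data: bytes) -> bytes:
    """Replace every raw 0xA0 byte with the ASCII entity &nbsp;.

    Assumes ISO-8859-1 input, where 0xA0 always means a non-breaking space.
    """
    return data.replace(b"\xa0", b"&nbsp;")


def fix_templates(root: str) -> int:
    """Rewrite all *.php files below root; return how many files changed."""
    changed = 0
    for path in Path(root).rglob("*.php"):
        original = path.read_bytes()
        fixed = replace_nbsp_bytes(original)
        if fixed != original:
            path.write_bytes(fixed)
            changed += 1
    return changed
```

Calling fix_templates("templates") on a backup copy first is the safer order, since the files are rewritten in place.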
best regards
Martin
P.S. still waiting for a subversion server with trac for EE 2.x :-(
Have a look at https://dev.openwrt.org/ for a good example.
Hartmuto
23rd of April 2007 (Mon), 17:43
I thought it must be something like what you discovered, but I did not know where to start looking or what to look for. Is there an easy way to do a global search & replace on all the files? I agree it would be good if this could be changed in a next release as it would make the code more compatible with character sets.
UweB
26th of April 2007 (Thu), 03:20
This strange character also appears when I enable my Russian language files in EE2. I've been trying to find out why it shows up only with UTF-8 language files. I also tried different combinations of saving the language files, the language setting of the browser and so on...
...it's great that you found a solution so that exotic language files can be used in EE2... :-)
MMCM
26th of April 2007 (Thu), 04:16
Uwe, did you ever look at your EE database tables with phpMyAdmin? Are those Russian characters stored correctly in the database?
EE 2 ALWAYS connects to the DB using the iso-8859-1 character set, or whatever default character set is configured for MySQL (there is no "SET NAMES" SQL query anywhere in the EE code), independent of the character set configured in the language file. But the character set in the language file is correctly passed to the browser. So in the case of the Russian language, the browser displays and returns user input as UTF-8, while EE stores and retrieves data from MySQL using the iso-8859-1 character set. This works as long as EE is the only program accessing the EE database AND you don't change the character set of a language file or copy text from one language to another. In those cases, the DB contents are not converted to the new character set and are displayed incorrectly.
In my (still running) EE 1.5RC, I use a lot of SQL scripts (running directly on the server) to copy text between languages, so I only have to enter text in one language, copy it all to another language and modify just the parts that need to change.
And that won't work with UTF-8 data stored as iso-8859-1 ;-)
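The mismatch described above can be reproduced without a database. Here is a minimal Python sketch, assuming Python's latin-1 codec as a stand-in for MySQL's latin1 connection charset (the two differ in the 0x80-0x9F range) and using a made-up sample string. The function names are illustrative only.

```python
def store_via_latin1_connection(browser_text: str) -> str:
    """Return what a latin1 column holds after UTF-8 bytes are written to it.

    The browser sends UTF-8 bytes, but the connection labels them latin1,
    so every byte is stored as a separate latin1 character.
    """
    return browser_text.encode("utf-8").decode("latin-1")


def read_via_latin1_connection(stored: str) -> str:
    """Return what the browser shows when the column is read back.

    The same bytes come back unchanged and the browser decodes them as UTF-8.
    """
    return stored.encode("latin-1").decode("utf-8")


sample = "Привет"  # hypothetical user input from a Russian page
stored = store_via_latin1_connection(sample)

# EE alone round-trips the text, because the two mislabelings cancel out.
assert read_via_latin1_connection(stored) == sample
# A tool that trusts the latin1 label sees twelve unrelated characters.
assert stored != sample and len(stored) == 12
```

This is why the text looks fine inside EE but unreadable in phpMyAdmin or in SQL scripts that work on the tables directly.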
To handle character sets correctly, EE should use the character set specified in the active language file to connect to the database, and the character set of the database should be UTF-8, because all other character sets can be converted to it.
The optimum solution is only using the UTF-8 character set throughout EE, because no conversions will occur, whatever you do.
To achieve that, you have to convert all your language files to UTF-8 (e.g. on Linux, for language files currently using iso-8859-1, with "iconv -f iso-8859-1 -t utf-8 oldfile >newfile") and add a mysql_query("SET NAMES 'UTF8'"); statement to the basecode/SCRIPT_connect.php file so that EE connects to the database using UTF-8.
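Where iconv is not available, the file conversion step can be approximated in Python. This is a minimal sketch with an illustrative function name. It assumes the input really is iso-8859-1, and running it on a file that is already UTF-8 would double-encode it.

```python
from pathlib import Path


def convert_latin1_to_utf8(old_path: str, new_path: str) -> None:
    """Rough equivalent of: iconv -f iso-8859-1 -t utf-8 oldfile >newfile"""
    raw = Path(old_path).read_bytes()
    text = raw.decode("iso-8859-1")  # every byte maps to exactly one character
    Path(new_path).write_bytes(text.encode("utf-8"))
```

Writing to a new file, as the iconv example does, keeps the original available for comparison.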
If all data is stored correctly in your database (i.e. all your languages are using iso-8859-1), you can simply convert the tables and fields to UTF-8 using phpMyAdmin or MySQL statements, but if some data is already stored incorrectly, it gets tricky.
You have to export those tables with phpMyAdmin, mysqldump or some other tool, then use iconv to undo the wrong conversion, e.g. 'iconv -f utf-8 -t iso-8859-1 sqldump.sql >new.sql'. After that, you have to change the character set in the CREATE TABLE statements to the new (UTF-8) character set. On Linux you can do this with sed, e.g. "sed -i 's/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8/g' new.sql".
Take a look at the file first to see what is currently used.
After that, drop those tables from the database and import the changed file using mysql or phpMyAdmin.
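The dump repair can be sketched in Python as well. This is a minimal sketch with an illustrative function name. It mirrors the iconv and sed steps above and assumes the dump contains only characters from the ISO-8859-1 range; anything else makes the encode step fail, which is a useful warning that the data is not uniformly double-encoded.

```python
def repair_dump(dump: bytes) -> bytes:
    """Undo one level of wrong latin1 -> UTF-8 conversion in an SQL dump.

    Mirrors 'iconv -f utf-8 -t iso-8859-1' followed by replacing the table
    charset. Raises UnicodeEncodeError if the dump contains characters
    outside ISO-8859-1.
    """
    original_bytes = dump.decode("utf-8").encode("iso-8859-1")
    return original_bytes.replace(b"DEFAULT CHARSET=latin1",
                                  b"DEFAULT CHARSET=utf8")
```

As with the shell commands, this only prepares the file; dropping the tables and importing the result is still a separate, manual step.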
Doesn't sound easy, does it?
I had to do the same before with an old geeklog CMS installation, so I already had some experience in that field.
As my EE 1.5RC4 installation is already converted to UTF-8, I won't have to convert the database, only the files.
If there's interest, I could write a detailed HOW-TO.
Maybe a future EE upgrade will do that automatically ;-)
UweB
26th of April 2007 (Thu), 05:35
Hello MMCM, Martin,
Thanks for your instructions, which I have to read carefully again before I do anything ;-)
I just looked into phpMyAdmin...
What I found are the default DB settings, which look like this:
* Server version: 4.1.21-standard
* Protocol version: 10
* Server: Localhost via UNIX socket
* User: xxxxxx@xxxxxx
* MySQL charset: UTF-8 Unicode (utf8)
* MySQL connection collation: utf8_unicode_ci
In the EE2 DB, the table ee_admin (for example) looks like this:
field          type         collation
ee_admin_name  varchar(32)  latin1_swedish_ci
as do all other EE2 tables, i.e. always latin1_swedish_ci. Does this mean that the default MySQL setting is UTF8 but the EE2 DB is set to latin1_swedish?
In phpMyAdmin the Cyrillic text in the tables is not readable (even after changing the character encoding in the browser), but the German and English text is.
I tried to create/save/use the Russian language files in ANSI, Unicode, Unicode big endian and UTF-8, but in the end only UTF-8 worked in the browser. So this is the format I finally saved and published them in (for download)...
I agree that something automatic should be built into one of the next EE upgrades so that it works with UTF-8...
MMCM
26th of April 2007 (Thu), 09:21
I found the following information:
The UTF-8 representation of Unicode character 160 (the non-breaking space) is the two-byte sequence 0xc2, 0xa0.
That's why a single 0xa0 does not work when displayed as UTF-8.
You could also replace the 0xa0 character with "&#160;" instead of "&nbsp;".
"&#160;" has the advantage that it is recognised by XML parsers, which do not recognise "&nbsp;". But that's not really important with EE.
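The byte values are easy to check in Python. This is a minimal sketch; the helper name and the sample label are made up, and errors="replace" only approximates how a browser marks bytes it cannot decode.

```python
NBSP = "\u00a0"  # Unicode character 160, NO-BREAK SPACE

# In ISO-8859-1 the character is the single byte 0xA0 ...
assert NBSP.encode("iso-8859-1") == b"\xa0"
# ... but in UTF-8 it is the two-byte sequence 0xC2 0xA0.
assert NBSP.encode("utf-8") == b"\xc2\xa0"


def decode_as_utf8(raw: bytes) -> str:
    """Decode bytes as a UTF-8 page would be read, marking invalid bytes."""
    return raw.decode("utf-8", errors="replace")


# A lone 0xA0 is a continuation byte without a lead byte, so it is invalid
# UTF-8 and comes out as U+FFFD, the replacement character.
assert decode_as_utf8(b"Name:\xa0") == "Name:\ufffd"
# An entity is plain ASCII and reads the same under either charset.
assert decode_as_utf8(b"Name:&nbsp;") == "Name:&nbsp;"
```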
Cyclist
13th of July 2008 (Sun), 11:33
How do I need to change the basecode/SCRIPT_connect.php file exactly? I have no idea where to put the mysql_query("SET NAMES 'UTF8'") statement.
Any help would be appreciated. Thanks.
MMCM
13th of July 2008 (Sun), 16:37
<?php
error_reporting(0);
include ("toroot.php");
if (!function_exists("mysql_connect")) {
    print "<p>FATAL ERROR: MySQL support not installed for PHP!";
}
// Connect to the server, retrying after 1, 2 and 5 seconds on failure.
$ee_mysql_connection = @mysql_connect($servername, $username, $password);
if (!$ee_mysql_connection) {
    sleep (1);
    $ee_mysql_connection = @mysql_connect($servername, $username, $password);
    if (!$ee_mysql_connection) {
        sleep (2);
        $ee_mysql_connection = @mysql_connect($servername, $username, $password);
        if (!$ee_mysql_connection) {
            sleep (5);
            $ee_mysql_connection = @mysql_connect($servername, $username, $password);
            if (!$ee_mysql_connection) {
?>
<html><body><table width="100%" border="0" cellpadding="35"><tr><td>
<p style="font-family: 'Trebuchet MS', Verdana, Geneva, Arial, Helvetica, sans-serif; font-size: 12px;"><big style="font-family: 'Trebuchet MS', Verdana, Geneva, Arial, Helvetica, sans-serif; font-size: 26px;">Unable to connect to the photo database server</big>
<br>This error message means the database server is offline or can not be found. <br>It could also mean that the database server is overloaded.
<br><br><b style="font-family: 'Trebuchet MS', Verdana, Geneva, Arial, Helvetica, sans-serif; font-size: 14px;">PLEASE PRESS REFRESH ON YOUR BROWSER TO TRY AGAIN.</b></p></td></tr></table>
</body></html>
<?php
                exit();
            }
        }
    }
}
// Tell MySQL that this connection sends and expects UTF-8.
mysql_query("SET NAMES 'UTF8'"); // Need to add this for UTF-8
if (!@mysql_select_db($databasename)) {
?>
<html><body><table width="100%" border="0" cellpadding="35"><tr><td>
<p style="font-family: 'Trebuchet MS', Verdana, Geneva, Arial, Helvetica, sans-serif; font-size: 12px;"><big style="font-family: 'Trebuchet MS', Verdana, Geneva, Arial, Helvetica, sans-serif; font-size: 26px;">Unable to connect to the photo database</big>
<br>This error message means the database can not be accessed. <br>It could also mean that the database server is overloaded.
<br><br><b style="font-family: 'Trebuchet MS', Verdana, Geneva, Arial, Helvetica, sans-serif; font-size: 14px;">PLEASE PRESS REFRESH ON YOUR BROWSER TO TRY AGAIN.</b></p></td></tr></table>
</body></html>
<?php
    exit();
}
.
.
.
Cyclist
15th of July 2008 (Tue), 04:52
Thanks for your help, MMCM. Finally all the content seems to be saved as utf-8. But I think this should be standard if setting the charset to utf-8. Maybe this can be done with the next release of EE (Pekka?).
Pekka
16th of July 2008 (Wed), 08:06
Because PHP is not yet fully UTF-8-ready (http://www.phpwact.org/php/i18n/utf-8) and converting a database with mixed charsets to UTF-8 is an error-prone task, I'm not too eager to go to UTF-8 yet. But when EE2 goes Open Source I'm sure someone who knows all about character sets will code a converter that does it safely and neatly :)
I think the only way to do it right would be to use info in http://www.mysqlperformanceblog.com/2007/12/18/fixing-column-encoding-mess-in-mysql/ and write a script that would do the conversion one row at a time.
Cyclist
16th of July 2008 (Wed), 13:41
I don't really understand your point. UTF-8 is a quasi-standard for most multilingual applications/scripts. Although it might not be perfect, I think it's far better than using the old ISO standard. But that depends on the point of view, of course.
EE2 offers the possibility to use different charsets; you can set them in the settings. Therefore I think EE should fully support UTF-8. I hope I am getting you wrong and won't run into big problems with the next upgrade (a change in the database structure might be necessary).
Pekka
16th of July 2008 (Wed), 14:43
What I am talking about is not bashing UTF-8 but the difficulty of converting old databases, which may have accumulated mixed character sets over the years. Also, not all string handling code in EE2 has been tested with the various versions of PHP combined with a UTF-8 database on the various versions of MySQL.
That is why I cannot just do it; it needs research and testing.
Many applications go straight to the latest standards, but I need EE to work with MySQL 4 and older PHP versions, too. MySQL 4.1 is the first version with Unicode support. With HTML entities everything works just fine.
Cyclist
16th of July 2008 (Wed), 15:35
I understand the problem with old databases, but my question is the other way round. I have already converted my database and all the necessary files to UTF-8, since EE allows the UTF-8 charset. After what you wrote, I am a bit concerned about what will happen with the next upgrade, which will probably change or add some tables in the database. Right now everything is running in UTF-8 without any problems. I am able to store and display all the data as UTF-8. But will an upgrade change the database (UTF-8) back to ISO, and/or will it cause other problems?
Pekka
16th of July 2008 (Wed), 16:10
Upgrades will not touch existing collations. But the next update will have a setting for the connection charset, like
$s_connection_charset = "UTF-8"; // default ISO-8859-1
and maybe some other stuff to configure, so when you upgrade you just need to set those right.
Have you converted ALL your EE2 tables to UTF-8?
Cyclist
16th of July 2008 (Wed), 16:41
Thanks for the info.
For converting I used the script which was linked in one of the utf-8 threads and converted the tables. So far I haven't had any problems. It seems to work perfectly.
Pekka
17th of July 2008 (Thu), 16:29
Can you tell me where the script you used is, so that I can take a look?
I have now written a converter which converts high ISO characters to Numeric Character References (otherwise you get ? for ÄÅÖÐÇ etc. after a straight table conversion), then converts the tables to UTF-8 and saves the NCR data, which MySQL stores as UTF-8. It seems to work OK. I think I'll ship it as a separate "UTF-8 Converter" which does all the work in the correct order.
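The first step of such a converter, turning high ISO characters into Numeric Character References, can be sketched in Python. This is only an illustration of the idea, not the converter described above: the function name is made up and the input is assumed to be ISO-8859-1 bytes.

```python
def high_chars_to_ncr(raw: bytes) -> str:
    """Turn ISO-8859-1 bytes into pure ASCII text with numeric references.

    Characters above 127 become decimal references such as &#228;, which
    read the same whatever charset the table is later converted to.
    """
    text = raw.decode("iso-8859-1")
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


print(high_chars_to_ncr(b"K\xe4se"))  # K&#228;se
```

Because the result is plain ASCII, it is not affected by the ? substitution that a straight table conversion can produce.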
Cyclist
17th of July 2008 (Thu), 17:06
I don't remember exactly which script I used, but I have looked through my bookmarks, so it might have been one of these:
http://www.phpwact.org/php/i18n/utf-8/mysql
http://www.v-nessa.net/2007/12/06/convert-database-to-utf-8
Before converting, I made two backups of the content of my database. One backup I converted to UTF-8. Then I converted the database with all its tables to UTF-8 and imported my UTF-8 backup. Then I just converted the template files and language files (I don't remember if there were more) to UTF-8 and set the MySQL connection to UTF-8. That was it. Since I didn't have much multilingual content at the time of converting, I can't say how it would work with a huge database. For me it worked fine the way I did it.