Making WordPress.org

Opened 11 years ago

Closed 9 years ago

Last modified 9 years ago

#117 closed defect (bug) (fixed)

Credits API character set

Reported by: knutsp's profile knutsp Owned by: coffee2code's profile coffee2code
Milestone: Priority: normal
Component: API Keywords:
Cc:

Description

credits.php doesn't show (translators) names with non-ASCII characters properly. Looks ugly. See http://core.trac.wordpress.org/ticket/17861

Attachments (1)

117.diff (1.7 KB) - added by coffee2code 9 years ago.
Patch for wp-admin/credits.php to properly display characters sent via Credits API.

Download all attachments as: .zip

Change History (25)

#1 @SergeyBiryukov
11 years ago

  • Summary changed from Credtis API character set to Credits API character set

#3 @SergeyBiryukov
11 years ago

Looks like only Browse Happy and importers API are open sourced at the moment.

#4 follow-up: @nacin
11 years ago

I'm happy to open-source this API. Currently it relies on the hardcoded email addresses for Gravatars, so I'd have to change that.

#5 @knutsp
11 years ago

  • Priority changed from low to normal

Any traction on this?

Still contributors to core are not honoured on credits.php with their name displayed correctly, when their names have some non-ASCII characters. Isn't it quite embarrassing? Something, somewhere is not UTF-8 clean, or?

#6 @SergeyBiryukov
11 years ago

Confirmed the issue with validators on a nb_NO intall. The screenshot from #wp17861 is still relevant: https://core.trac.wordpress.org/raw-attachment/ticket/17861/wplang.png

#7 @knutsp
9 years ago

Anyone to tackle this?

Internationalization has high priority, right?

#8 in reply to: ↑ 4 @knutsp
9 years ago

Replying to nacin:

I'm happy to open-source this API.

@nacin, any progress on this?

It really doesn't look good for WordPress, so it should be fixed.

#10 follow-up: @knutsp
9 years ago

Thank you for the quick answer, Nacin.

So, it's open source now. To suggest a fix for this you only need to know how. I would like to try creating a patch, but I would need some guidance. I know how to create a patch for core. But this? And how to test that my patch works? Would I then need a fork of api.wordpress.org set up somewhere?

I was very close to ask a question about this to @matt in the Q&A session at WCEU, but was too shy and slow to bring it up in time before the last question. My bad. Now I have try try here, to get some attention to this bug.

Anyone else I could mention here, who would be able to pick it up, or give me a hint on where to start?

#11 in reply to: ↑ 10 @netweb
9 years ago

Replying to knutsp:

Thank you for the quick answer, Nacin.

So, it's open source now. To suggest a fix for this you only need to know how. I would like to try creating a patch, but I would need some guidance. I know how to create a patch for core. But this? And how to test that my patch works? Would I then need a fork of api.wordpress.org set up somewhere?

Typically for "meta" following this guide in the handbook to get started is normally all that is needed:
https://make.wordpress.org/meta/handbook/about/get-involved/setting-up-your-machine/

Though api.wordpress.org has yet to be added to the WordPress Meta Environment :(
https://github.com/iandunn/wordpress-meta-environment/blob/master/TODO.md

So ideally adding api.wordpress.org to the WordPress Meta Environment would be the best start, though I'm not sure what has yet to be open sourced from https://meta.svn.wordpress.org/sites/trunk/api.wordpress.org and @iandunn would be the best person to offer some feedback here on the best way forward here.

@coffee2code
9 years ago

Patch for wp-admin/credits.php to properly display characters sent via Credits API.

#12 follow-ups: @coffee2code
9 years ago

The presumption in closing #core17861 and moving discussion to this ticket has been that the Credits API is at fault. As a proper solution it could well be.

However, there is a solution to the problem by patching core. Attached is 117.diff which patches wp-admin/credits.php.

The crux of the fix is that the Credits API applies utf8_encode() on user display_names sent via the API (see 1, 2, 3). So if wp-admin/credits.php simply does utf8_decode() on the string it receives, the proper characters are displayed (at least for the ones I tested).

The easiest test is to apply the patch and view the credits for WP 4.3. In the props list, you should see Bjørn Johansen. Before the patch, that would appear as Bjørn Johansen. (He is also a translator for the locale nb_NO, and the fix applies to that section as well.)

For the locale Norsk nynorsk (nn_NO), Eivind Ødegard correctly displays as Eivind Ødegard.

I tried a few other locales with similar improvements.

Please test and see if it resolves the character display issues you've seen with credits. If so, we might reopen #core17861 and close this.

#13 @Otto42
9 years ago

Is this a case where we should not be utf8_encoding the data in the credits API? How is the data stored in the database? It might be a case where we're storing it as utf8 to begin with, perhaps, so encoding it is the wrong thing to do here.

#14 @coffee2code
9 years ago

I'm not sure about the justification for how it's being handled by the credits API.

The data in question is the user data (specifically display_name) from the global users table, so there's that familiar issue (latin1 table).

But yeah, it seems we could alternatively remove the use of utf8_encode() in the credits API and things would work as well. I'm not sure if there are unforeseen ramifications of doing so. The use of remove_accents() doesn't appear as though it can be omitted, so names still wouldn't be perfect, but at least they would look much better.

#15 in reply to: ↑ 12 @knutsp
9 years ago

117.diff works here, but it seems odd to have to decode something that shouldn't have been encoded in the first place. Such API should deliver the data properly for easy consumption by other clients than core credits.php

Happy to see this moving forward.

#16 @Otto42
9 years ago

The data in question is the user data (specifically display_name) from the global users table, so there's that familiar issue (latin1 table).

Yeah, that is the thing about w.org tables. Just because it lives in a latin1 column does not necessarily mean it actually is latin1. We've had these tables a long, long time, and the data might be utf8 stored in latin1 and no conversion ever performed.

I think we need a practical approach here. If all the existing data work fine without the encode, then dump the encode and insta-fix all the existing installs. If we need special handling for some, then we can special handle those. Long term we convert the lot, but long is super-long as you well know.

#17 in reply to: ↑ 12 ; follow-up: @netweb
9 years ago

Replying to coffee2code:

The easiest test is to apply the patch and view the credits for WP 4.3. In the props list, you should see Bjørn Johansen. Before the patch, that would appear as Bjørn Johansen. (He is also a translator for the locale nb_NO, and the fix applies to that section as well.)

https://make.wordpress.org/polyglots/teams/?locale=nb_NO

For the locale Norsk nynorsk (nn_NO), Eivind Ødegard correctly displays as Eivind Ødegard.

https://make.wordpress.org/polyglots/teams/?locale=nn_NO

Just adding an additional use case of the "shared user" table used by polyglots and that both Bjørn and Eivind "display names" currently display as expected on the respective polyglots teams pages linked above.

The source for the above: /wordpress.org/public_html/wp-content/plugins/wp-i18n-teams/views/locale-details.php#L69

Another vote for the "dump the encode and insta-fix" here as the above never encodes them in the first place.

#18 in reply to: ↑ 17 @knutsp
9 years ago

Replying to netweb:

Another vote for the "dump the encode and insta-fix" here as the above never encodes them in the first place.

I see. The data are actually stored as UTF8 despite the table charset. We all know MySQL doesn't care when charset is Latin1, it takes anything, like binary. So it's wrong to assume actual Latin1 encoding just because data is stored in such table/column.

Remove the encoding function at get an instant fix at the same time.

#19 @coffee2code
9 years ago

  • Owner set to coffee2code
  • Resolution set to fixed
  • Status changed from new to closed

In 1878:

Credits API: Don't unnecessarily utf8_encode() user display_name.

Fixes #117.

#20 @coffee2code
9 years ago

FYI: Could take up to 12 hours for caches to clear for this to be fully reflected.

#21 @knutsp
9 years ago

Wonderful. Thank you, @coffee2code and all who chimed in on this.

#22 @bjornjohansen
9 years ago

I am so happy to don’t cringe and feel embarrassed every time I see the credits screen now.

Thank you everyone involved :-)

#23 @knutsp
9 years ago

  • Keywords needs-patch removed

I'm sure @isthoresen, "Ingebjorg", will soon revert her display name back to the correct "Ingebjørg", thanks to this.

#24 @ocean90
9 years ago

#128 was marked as a duplicate.

Note: See TracTickets for help on using tickets.