#8253 closed defect (bug) (fixed)
Credits: character encoding for names
| Reported by: | | Owned by: | |
|---|---|---|---|
| Milestone: | | Priority: | normal |
| Component: | API | Keywords: | |
| Cc: | | | |
Description
(Reported in #core65245)
The following characters display incorrectly in about 20 names on the credits.php page:
á ä é í ó ö ñ ú ü ý Ž
@clementpolito:
On WordPress 7.0-RC4, on the page /wp-admin/credits.php, I encounter an issue with some characters.
At first, I thought there might be a character encoding issue with my database tables, but some accented characters display correctly elsewhere on the page, and @audrasjb reproduced the issue.
- "Albert Juhテゥ Lluveras" should be "Albert Juhé Lluveras"
- "Alvaro Gテウmez" should be "Alvaro Gómez"
- "Béryl de La Grandière" is ok
- "Eliezer Peテアa" should be "Eliezer Peña"
- "Johannes Jテシlg" should be "Johannes Jülg"
- Etc.
[Screenshot: core contributors list on the 6.9 credits page]
@ocean90:
This looks more like an issue on WordPress.org, since the API already returns the incorrectly encoded names; see https://api.wordpress.org/core/credits/1.1/?version=6.9.
The profiles, e.g. https://profiles.wordpress.org/aljullu/ and https://profiles.wordpress.org/anlino/, are looking ok though.
@jonsurrell:
This line looks suspicious. It hasn't changed in a long time, but changes to underlying language data or functionality could plausibly produce different results. In particular, listing JIS before UTF-8 in the from encoding seems problematic. Maybe the conversion can be dropped completely if the data is already UTF-8.
<?php
$raw = 'é1234567890';
echo mb_convert_encoding( $raw, 'UTF-8', 'ASCII, JIS, UTF-8, Windows-1252, ISO-8859-1' ) . "\n"; // テゥ1234567890
echo mb_convert_encoding( $raw, 'UTF-8', 'ASCII, UTF-8, JIS, Windows-1252, ISO-8859-1' ) . "\n"; // é1234567890
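One way to read the suggestion to drop the conversion for data that is already UTF-8 is a guard in front of the existing call. The following is a minimal sketch, not the code in wp-credits.php: the function name normalize_display_name() is hypothetical, and the ISO-8859-1 fallback is an assumption made purely for illustration.

```php
<?php
/**
 * Hypothetical helper: return a display name as UTF-8 without running
 * encoding detection on input that is already valid UTF-8.
 */
function normalize_display_name( string $raw ): string {
    // Valid UTF-8 (which includes pure ASCII) passes through untouched,
    // so the order of a from-encoding list can no longer affect it.
    if ( mb_check_encoding( $raw, 'UTF-8' ) ) {
        return $raw;
    }

    // Assumption: anything else is legacy single-byte data. Naming one
    // explicit source encoding avoids auto-detection entirely.
    return mb_convert_encoding( $raw, 'UTF-8', 'ISO-8859-1' );
}

echo normalize_display_name( 'Albert Juhé Lluveras' ) . "\n"; // unchanged
echo normalize_display_name( "Eliezer Pe\xF1a" ) . "\n";      // Latin-1 byte F1 becomes UTF-8 ñ
```

The guard relies on the fact that valid UTF-8 multibyte sequences are unlikely to appear by accident in single-byte text, so the check rarely misclassifies real legacy data.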
@siliconforks:
It looks like the behavior changed in PHP 8.3:
<?php
$raw = 'é1234567890';
$raw = mb_convert_encoding( $raw, 'UTF-8', 'ASCII, JIS, UTF-8, Windows-1252, ISO-8859-1' );
// PHP 8.2: é1234567890
// PHP 8.3: テゥ1234567890
echo $raw . "\n";
Change History (7)
This ticket was mentioned in Slack in #core by jorbin. View the logs.
4 days ago
#3
@
4 days ago
This also affects the WP release announcements, e.g. https://wordpress.org/news/2026/05/armstrong/ or https://wordpress.org/news/2025/12/gene/ so definitely an API thing.
#4
@
4 days ago
Can confirm, this is a PHP 8.3 change in multibyte encoding detection.
The reason for this _encode() method is that historically WordPress.org had some contributors' names malformed in the users table, due to BuddyPress writing data into the users table with the incorrect connection charset/collation.
The DB writes have been fixed over the years, and this encode continued to work until the last few days, when we switched from PHP 8.1 to PHP 8.4, which seems to have resulted in a range of characters being detected as JIS. JIS was never correct here, but it seemingly worked well enough for the original problem: UTF-8 characters stored in a Latin1 table via a UTF-8 charset connection, then read back as UTF-8-in-Latin1 via a Latin1 charset connection.
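For readers unfamiliar with this class of charset mismatch, the sketch below shows one common form of it: UTF-8 bytes treated as Latin-1 characters and encoded to UTF-8 a second time. It is an illustration under that assumption, not a reconstruction of what BuddyPress actually wrote to the WordPress.org users table.

```php
<?php
// Clean UTF-8: "ö" is the two bytes C3 B6.
$clean = "Toni Viemer\xC3\xB6";

// Simulate the mismatch: each UTF-8 byte is read as one Latin-1 character
// and encoded to UTF-8 again, so C3 B6 becomes C3 83 C2 B6 ("Ã¶").
$double_encoded = mb_convert_encoding( $clean, 'UTF-8', 'ISO-8859-1' );
echo bin2hex( substr( $double_encoded, -4 ) ) . "\n"; // c383c2b6

// Reversing the step (UTF-8 back to Latin-1 bytes) restores the original.
$repaired = mb_convert_encoding( $double_encoded, 'ISO-8859-1', 'UTF-8' );
var_dump( $repaired === $clean ); // bool(true)
```

Both calls name their source encoding explicitly, so neither depends on encoding auto-detection.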
I set Claude loose on a single specific username:
What's happening with Toni Viemerö:
- DB returns clean UTF-8: bytes 54 6F … C3 B6 (the C3 B6 is ö).
- mb_convert_encoding() auto-detects from the list in order, and the UTF-8 bytes C3 B6 are also valid JIS X 0201 (single-byte halfwidth katakana ﾃ + ｶ).
- Because JIS is listed before UTF-8, PHP 8.4's mb_detect_encoding picks JIS and "converts" the bytes, corrupting ö into ﾃｶ (U+FF83 U+FF76).
- json_encode then emits "Toni Viemer\uff83\uff76".
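The byte-level steps above can be checked by hand. The sketch below does not call mbstring's JIS converter; it uses a hypothetical helper, read_as_halfwidth_katakana(), that applies the JIS X 0201 rule directly (bytes 0xA1 to 0xDF map to U+FF61 to U+FF9F) and then shows what json_encode emits for the clean and the misread name.

```php
<?php
/**
 * Hypothetical helper: read a byte string the way an 8-bit JIS X 0201
 * decoder would. Bytes 0xA1-0xDF become halfwidth katakana; every other
 * byte is passed through unchanged (fine for this ASCII-plus-ö sample,
 * not a complete decoder).
 */
function read_as_halfwidth_katakana( string $bytes ): string {
    $out = '';
    foreach ( str_split( $bytes ) as $byte ) {
        $value = ord( $byte );
        if ( $value >= 0xA1 && $value <= 0xDF ) {
            $out .= mb_chr( 0xFF61 + ( $value - 0xA1 ), 'UTF-8' );
        } else {
            $out .= $byte;
        }
    }
    return $out;
}

$name = "Toni Viemer\xC3\xB6"; // clean UTF-8 as returned by the DB

echo json_encode( $name ) . "\n";                               // "Toni Viemer\u00f6"
echo json_encode( read_as_halfwidth_katakana( $name ) ) . "\n"; // "Toni Viemer\uff83\uff76"
```

The second line of output matches the corrupted value reported above, which is consistent with the bytes having been interpreted as JIS rather than UTF-8.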
I've had Claude audit all other users and found 22 users who still needed this code. I've now re-saved those users' profiles, and the users table is correct, resulting in:
17,733 distinct credited users examined
- 17,015 pure ASCII display_names (96%)
- 458 valid UTF-8 with non-ASCII (these get garbled by the current mb_convert_encoding JIS bug)
- 22 display_names that are invalid UTF-8 — These are now fixed
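The three buckets in this audit can be reproduced with a simple classifier. This is a minimal sketch of the idea, not the audit that was actually run: classify_display_name() is a hypothetical name, and the sample names are made up for illustration.

```php
<?php
/**
 * Hypothetical helper: sort a display name into one of three buckets.
 */
function classify_display_name( string $name ): string {
    // No byte at or above 0x80 means the name is pure ASCII.
    if ( ! preg_match( '/[\x80-\xFF]/', $name ) ) {
        return 'ascii';
    }
    // Non-ASCII bytes are present: either well-formed UTF-8 or not.
    return mb_check_encoding( $name, 'UTF-8' ) ? 'utf8' : 'invalid';
}

// Made-up sample data: ASCII, valid UTF-8 (ü = C3 BC), and a lone
// Latin-1 byte (F1) that is not valid UTF-8.
$sample = array( 'Jane Doe', "Johannes J\xC3\xBClg", "Eliezer Pe\xF1a" );

$counts = array( 'ascii' => 0, 'utf8' => 0, 'invalid' => 0 );
foreach ( $sample as $name ) {
    $counts[ classify_display_name( $name ) ]++;
}
print_r( $counts );
```

Only names in the 'invalid' bucket would need any conversion; the other two buckets are already safe to emit as-is.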
#5
@
4 days ago
- Owner set to dd32
- Resolution set to fixed
- Status changed from new to closed
In 14904:
#7
@
3 days ago
Follow-up ticket for other places with this issue: https://meta.trac.wordpress.org/ticket/8254
Seems to be related to [14898/sites/trunk/api.wordpress.org/public_html/core/credits/wp-credits.php].
/cc @dd32