#5874 closed defect (bug) (duplicate)
Database character set default configuration
Reported by: | anrghg | Owned by: | |
---|---|---|---|
Milestone: | Priority: | high | |
Component: | General | Keywords: | has-patch |
Cc: |
Description
The file wp-config-sample.php suggests configuring the database charset as utf8. The origin of the file seems unknown; its traceable history starts 8 years ago, but it most probably predates 2010 and was not updated when MySQL released support for actual UTF-8 in 2010.
In MySQL, utf8 is a proprietary charset that only supports the Basic Multilingual Plane of Unicode, because it is limited to 3 bytes per character, whereas in UTF-8, characters from U+10000 onward take up 4 bytes (starting at F0 90 80 80).
So in MySQL, utf8 is a misnomer for utf8mb3, while the real UTF-8 encoding, which utf8 should be but is not, is labeled utf8mb4.
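The byte-length claim can be checked outside of MySQL. The following is a minimal sketch in Python (not PHP, and not part of WordPress); the helper names utf8_len and fits_utf8mb3 are made up for illustration. It assumes that "fits in utf8mb3" means every code point is at or below U+FFFF.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes a single character occupies in real UTF-8."""
    return len(ch.encode("utf-8"))


def fits_utf8mb3(text: str) -> bool:
    """Hypothetical helper: True if every code point lies in the
    Basic Multilingual Plane (<= U+FFFF), i.e. needs at most 3 bytes."""
    return all(ord(ch) <= 0xFFFF for ch in text)


if __name__ == "__main__":
    # U+FFFF is the last BMP code point; U+10000 is the first beyond it.
    print(utf8_len("\uffff"))
    print("\U00010000".encode("utf-8").hex(" "))
    # An emoji such as U+1F600 lies outside the BMP.
    print(fits_utf8mb3("caf\u00e9"), fits_utf8mb3("\U0001F600"))
```

Text for which fits_utf8mb3 returns False is the kind of input that a 3-byte charset cannot represent.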
Attachments (1)
Change History (5)
#2 @ 3 years ago
Also, this is the meta Trac. Issues with core should be filed at core.trac.wordpress.org instead.
There is a function to determine the best charset/collation.
See https://github.com/WordPress/WordPress/blob/3d623995a8d070e608f4ff297512799cf89bb2c0/wp-includes/wp-db.php#L757-L764
So it is ok as it is.
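As a rough illustration of why the sample value may be harmless, here is a sketch of the kind of fallback logic such a function could apply. It is written in Python rather than PHP, the name choose_charset and the boolean flag are hypothetical, and it is an assumption about the behavior, not a transcription of the linked code.

```python
def choose_charset(configured: str, server_supports_utf8mb4: bool) -> str:
    """Hypothetical sketch: pick the charset actually used for the connection.

    Assumption: a configured 'utf8' is upgraded to 'utf8mb4' when the
    server supports it, and 'utf8mb4' is downgraded when it does not.
    """
    if configured == "utf8" and server_supports_utf8mb4:
        return "utf8mb4"
    if configured == "utf8mb4" and not server_supports_utf8mb4:
        return "utf8"
    # Any other charset is passed through unchanged.
    return configured
```

Under that assumption, leaving utf8 in the config file would still result in utf8mb4 on servers that support it.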