Opened 4 years ago
Closed 4 years ago
#5117 closed defect (bug) (fixed)
WP dashboard city locations (encoding?) wrong
Reported by: | ramon fincken | Owned by: | |
---|---|---|---|
Milestone: | | Priority: | normal |
Component: | Events API | Keywords: | has-screenshots |
Cc: | | | |
Description
As per Pascal Birchler's tweet, Zürich is misspelled.
I tried this with Amsterdam and it came out wrong as well (Àmsterdam).
It appears that the core API at api.wordpress.org is sending out wrong data. For Amsterdam it is not even a wrong encoding as the "A" should just be an uppercase A, nothing fancy.
I would love to fix this, and I will check on Slack and Make to find out where the API is built.
See attached images
Attachments (5)
Change History (13)
This ticket was mentioned in Slack in #core by ramonfincken. View the logs.
4 years ago
#4
4 years ago
The problem appears to lie in the summary table that we're creating for the purposes of faster searching. It's getting the name from there, but the name there is not the canonical version of what we should be returning.
So, for example, the summary table has Amsterdam, Àmsterdam, and Àmsterdam in it. It also has: AMS,Amesterdam,Amesterdao,Amesterdão,Amistardam,Amstardam,Amstedam,Amstelodamum,Amsterdam,Amsterdama,Amsterdamas,Amsterdami,Amsterdamo,Amsterdams,Amsterdan,Amsterntam,Amstèdam,Amszterdam,Damsko,Gorad Amstehrdam,I-Amsterdami,Mokum,a mu si te dan,aimstardaima,amasataradama,amastaradama... and so on. You get the idea.
Basically, every alternate name exists in that table for fast searching.
Now, searching on name=Amsterdam in the summary table comes back with five rows, and it essentially picks the first one it sees, which in this case happens to be the one with the grave accent.
I'm not 100% sure on this, but I think this happens because the table has the latin1_general_ci collation set for it. Think the table needs a restructure to be fully utf8mb4 for all rows and the collation.
#5
4 years ago
- Keywords needs-patch removed
- Milestone Q2 deleted
> I'm not 100% sure on this, but I think this happens because the table has the latin1_general_ci collation set for it. Think the table needs a restructure to be fully utf8mb4 for all rows and the collation.
The table collation is set to latin1 (as all of WordPress.org should be), but the data is utf8 and is treated as such, since the connection to the table should be utf8. (When querying it with anything other than the Events API, that needs to be set manually.)
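To illustrate why the connection charset matters here, this is a minimal Python sketch of what happens when utf8 bytes are read back as latin1. It only decodes bytes in memory; it does not talk to MySQL, and the helper name `read_as` is made up for this example.

```python
def read_as(raw: bytes, charset: str) -> str:
    """Decode stored bytes the way a client using `charset` would interpret them."""
    return raw.decode(charset)


# The bytes as they would sit in the table: utf8 data, regardless of the table's declared charset.
stored = "Zürich".encode("utf-8")

# A utf8 client sees the intended name.
print(read_as(stored, "utf-8"))    # Zürich

# A latin1 client sees each utf8 byte as its own character.
print(read_as(stored, "latin-1"))  # ZÃ¼rich
```

Note that this kind of mojibake is a different symptom from the Àmsterdam result, which is a valid alternate name being picked instead of the canonical one.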
It looks like this is a MySQL ordering issue: adding an exact match to the ordering seems to result in the correct row being returned. I suspect this wasn't picked up when it was built because the on-disk row ordering in the table has since changed. Previously the first result would have been the correct one, but now the correct row is a few results in.
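A rough sketch of the idea, using SQLite from Python rather than the real MySQL table. The `summary` table layout, the `accent_ci` collation, and the `first_match` helper are all assumptions made for this illustration and are not the actual Events API code. The custom collation stands in for an accent- and case-insensitive MySQL collation, and the `ORDER BY` term puts the byte-for-byte match first.

```python
import sqlite3
import unicodedata


def fold(text: str) -> str:
    """Strip accents and case, roughly what an accent/case-insensitive collation ignores."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).casefold()


def accent_ci(a: str, b: str) -> int:
    """Collation callback: compare two strings after folding accents and case."""
    fa, fb = fold(a), fold(b)
    return (fa > fb) - (fa < fb)


conn = sqlite3.connect(":memory:")
conn.create_collation("accent_ci", accent_ci)
# Hypothetical stand-in for the summary table of alternate names.
conn.execute("CREATE TABLE summary (id INTEGER PRIMARY KEY, name TEXT COLLATE accent_ci)")
conn.executemany(
    "INSERT INTO summary (name) VALUES (?)",
    [("Àmsterdam",), ("Amsterdam",), ("Amsterdama",)],
)


def first_match(query: str, exact_first: bool):
    """Return the first row matching `query`, optionally preferring the exact spelling."""
    order = "(name COLLATE BINARY = :q) DESC, id" if exact_first else "id"
    sql = f"SELECT name FROM summary WHERE name = :q ORDER BY {order} LIMIT 1"
    row = conn.execute(sql, {"q": query}).fetchone()
    return row[0] if row else None


print(first_match("Amsterdam", exact_first=False))  # Àmsterdam
print(first_match("Amsterdam", exact_first=True))   # Amsterdam
```

Without the exact-match term, whichever matching row comes first wins, which is why a change in row order could surface the accented variant.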
Commit incoming. The cache on location lookups is 12 hours, so it'd be appreciated if everyone could recheck the API responses after that time to confirm that everything is still responding as intended. That goes for more than just Zurich/Amsterdam: "normal" things that were previously correct should also be checked. Keep in mind that the WordPress Timezone and Locale are also taken into account, so even if it was working fine for someone, double-check it.
Zürich