Opened 6 years ago

Closed 5 years ago

#257 closed defect (fixed)

Encoding issues in Trac

Reported by: SergeyBiryukov
Owned by:
Milestone:
Priority: normal
Component: Trac
Keywords:
Cc:

Description (last modified by SergeyBiryukov)

Looks like moving to MySQL in #253 introduced some encoding issues.

See weird characters in wp19156:2 (also note the owner name in that ticket), wp21366:3, wp25183:6, wp25669:7.

Summarized in a screenshot. They were all previously displayed correctly. There are probably more; these are just the ones I could find at a glance.

Happy to help with debugging or data conversion if needed.

Attachments (1)

257.png (29.5 KB) - added by SergeyBiryukov 6 years ago.


Change History (10)

@SergeyBiryukov
6 years ago

#1 @SergeyBiryukov
6 years ago

I was able to reproduce the same characters by treating UTF-8 data as ISO 8859-1.
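
For anyone who wants to reproduce this locally, here is a minimal Python sketch of the same misinterpretation. The function name and the sample strings are hypothetical and are not taken from the affected tickets; the only thing it relies on is that UTF-8 bytes are being decoded as ISO 8859-1.

```python
def misread_utf8_as_latin1(text: str) -> str:
    """Simulate reading UTF-8 bytes as if they were ISO 8859-1.

    Hypothetical helper for illustration; not part of Trac.
    """
    # Encode to the bytes that are actually stored, then decode them
    # with the wrong character set.
    return text.encode("utf-8").decode("latin-1")


if __name__ == "__main__":
    for sample in ["é", "Сергей", "plain ASCII"]:
        # repr() makes the non-printable characters visible.
        print(repr(sample), "->", repr(misread_utf8_as_latin1(sample)))
```

Plain ASCII survives unchanged, which would explain why only some names and comments look broken. Each Cyrillic letter turns into two characters, and the second one is often a non-printable control character.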

#2 @SergeyBiryukov
6 years ago

You can use http://2cyr.com/decode/ to get an idea of what the readable text should look like. The Autodetect option works for me, but there are still some invalid sequences (displayed as question marks) because non-printable characters are not copied correctly.

#3 @SergeyBiryukov
6 years ago

  • Description modified (diff)

#4 @nacin
6 years ago

Yeah, ocean90 also reported this to me. So the good thing is, the DB is correct. Querying the DB directly for any of these gives me proper results. So this appears to be either a connection character set issue or some encoding issue within Trac. I'll dig into this tomorrow and will provide a DB dump of some affected tickets.

#5 @SergeyBiryukov
6 years ago

  • Description modified (diff)

#6 @nacin
6 years ago

Barry figured out what the issue is here (double-encoding). We'll be able to fix this pretty easily; just working on a script for it.

#7 follow-up: @nacin
6 years ago

Note to self (when I fix this, soon) - make sure https://core.trac.wordpress.org/ticket/14647 works.

#8 in reply to: ↑ 7 @nacin
6 years ago

Replying to nacin:

Note to self (when I fix this, soon) - make sure https://core.trac.wordpress.org/ticket/14647 works.

Ah, this was unrelated. Trac got re-compiled when I applied the core patch from #127. This upgraded Genshi (Trac's templating engine) to 0.7, which we had previously downgraded to 0.6.1 for this exact reason.

#9 @nacin
5 years ago

  • Resolution set to fixed
  • Status changed from new to closed

I fixed all of these. Hat-tip to the convert(cast(convert(field using latin1) as binary) using utf8) trick. Was careful to not run it on any post-migration fields as that would cause problems.
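
For reference, the same round trip can be sketched in Python: turn the text back into latin1 bytes, then read those bytes as UTF-8. This is only an analogy to the SQL expression above, not the script that was actually run. The function name and the fall-back guard are assumptions for illustration, and MySQL's latin1 is closer to Windows-1252 than to Python's "latin-1" codec, so edge cases may differ.

```python
def repair_double_encoded(text: str) -> str:
    """Undo one round of UTF-8-read-as-latin1 double encoding.

    Hypothetical helper; mirrors the idea of
    convert(cast(convert(field using latin1) as binary) using utf8).
    """
    try:
        # Step 1: recover the raw bytes (convert ... using latin1, cast as binary).
        raw = text.encode("latin-1")
        # Step 2: interpret those bytes as UTF-8 (convert ... using utf8).
        return raw.decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Assumption: text that cannot make the round trip was stored
        # correctly (for example, written after the migration), so leave it.
        return text


if __name__ == "__main__":
    print(repair_double_encoded("Ã©"))      # double-encoded input
    print(repair_double_encoded("é"))       # already-correct input, left alone
    print(repair_double_encoded("Сергей"))  # not representable in latin1, left alone
```

The guard is a heuristic, not a guarantee: a correctly stored string that happens to form valid UTF-8 after the latin1 step would still be altered, which is why restricting the conversion to pre-migration fields is the safer approach.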
