Making WordPress.org

Opened 11 years ago

Closed 11 years ago

#257 closed defect (bug) (fixed)

Encoding issues in Trac

Reported by: SergeyBiryukov Owned by:
Milestone: Priority: normal
Component: Trac Keywords:
Cc:

Description (last modified by SergeyBiryukov)

Looks like moving to MySQL in #253 introduced some encoding issues.

See weird characters in wp19156:2 (also note the owner name in that ticket), wp21366:3, wp25183:6, wp25669:7.

Summarized in the attached screenshot. They were all previously displayed correctly. There are probably more; these are just the ones I could find at a glance.

Happy to help with debugging or data conversion if needed.

Attachments (1)

257.png (29.5 KB) - added by SergeyBiryukov 11 years ago.

Change History (10)

@SergeyBiryukov
11 years ago

#1 @SergeyBiryukov
11 years ago

I was able to reproduce the same characters by treating UTF-8 data as ISO 8859-1.
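For anyone who wants to reproduce this locally, here is a minimal Python sketch of the mix-up described above: UTF-8 bytes decoded as if they were ISO 8859-1. The function name and the sample string are illustrative assumptions, not taken from the affected tickets.

```python
def misread_utf8_as_latin1(text: str) -> str:
    """Hypothetical helper: encode text as UTF-8, then decode the bytes as ISO 8859-1."""
    # Every byte maps to exactly one ISO 8859-1 character, so this never fails.
    # Each multi-byte UTF-8 sequence turns into several unrelated characters.
    return text.encode("utf-8").decode("latin-1")


original = "Сергей"  # sample Cyrillic string, 6 letters, 2 bytes each in UTF-8
garbled = misread_utf8_as_latin1(original)

# repr() makes the non-printable characters in the result visible.
print(repr(garbled))
print(len(original), "characters became", len(garbled))
```

Some of the resulting characters are non-printable control characters (the byte 0x80 in this sample becomes U+0080), which is one reason copy-pasting the garbled text can lose information.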

#2 @SergeyBiryukov
11 years ago

You can use http://2cyr.com/decode/ to get an idea of what the readable text should look like. The Autodetect option works for me, but there are still some invalid sequences (displayed as question marks) because non-printable characters are not copied correctly.

#3 @SergeyBiryukov
11 years ago

  • Description modified (diff)

#4 @nacin
11 years ago

Yeah, ocean90 also reported this to me. So the good thing is, the DB is correct. Querying the DB directly for any of these gives me proper results. So this appears to be either a connection character set issue or some encoding issue within Trac. I'll dig into this tomorrow and will provide a DB dump of some affected tickets.

#5 @SergeyBiryukov
11 years ago

  • Description modified (diff)

#6 @nacin
11 years ago

Barry figured out what the issue is here (double-encoding). We'll be able to fix this pretty easily; just working on a script for it.

#7 follow-up: @nacin
11 years ago

Note to self (when I fix this, soon) - make sure https://core.trac.wordpress.org/ticket/14647 works.

#8 in reply to: ↑ 7 @nacin
11 years ago

Replying to nacin:

Note to self (when I fix this, soon) - make sure https://core.trac.wordpress.org/ticket/14647 works.

Ah, this was unrelated. Trac got re-compiled when I applied the core patch from #127. This upgraded Genshi (Trac's templating engine) to 0.7, which we had previously downgraded to 0.6.1 for this exact reason.

#9 @nacin
11 years ago

  • Resolution set to fixed
  • Status changed from new to closed

I fixed all of these. Hat-tip to the convert(cast(convert(field using latin1) as binary) using utf8) trick. Was careful to not run it on any post-migration fields as that would cause problems.
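As a rough Python analogue of that SQL expression (a sketch only, not the script that was actually run), the repair amounts to turning the garbled text back into its raw bytes and decoding those bytes as UTF-8. The function name is hypothetical, and Python's "latin-1" codec is assumed to stand in for MySQL's latin1, which is closer to Windows-1252 and may behave differently for bytes 0x80 to 0x9F.

```python
def repair_double_encoded(value: str) -> str:
    """Hypothetical helper: undo UTF-8 text that was misread as ISO 8859-1."""
    try:
        # Step 1: recover the original bytes (convert ... using latin1, cast as binary).
        raw = value.encode("latin-1")
        # Step 2: reinterpret those bytes as UTF-8 (convert ... using utf8).
        return raw.decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The value does not look double-encoded, so leave it untouched.
        return value


garbled = "Сергей".encode("utf-8").decode("latin-1")  # simulated bad row
print(repair_double_encoded(garbled))    # garbled input is repaired
print(repair_double_encoded("Сергей"))   # already-correct input is left alone
```

The try/except is only a heuristic guard: a clean string that happens to be valid when reinterpreted would still be rewritten. That is why limiting the conversion to fields known to be affected, as described above, is the safer approach.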
