Making WordPress.org

Opened 8 years ago

Closed 7 years ago

Last modified 7 years ago

#1691 closed task (blessed) (fixed)

Internationalisation

Reported by: dd32
Owned by:
Milestone:
Priority: high
Component: Plugin Directory
Keywords:
Cc:

Description (last modified by dd32)

WordPress & WordPress.org are not just in American English. In order to fully bring the Plugin Directory to the masses, we need to have a properly localised site and content.

Part of this also hooks into #1496 & #1574 - We need to make the translated content available to Elastic Search.
In order to do that, we need to store the translated plugin data into WordPress as well (Instead of translating-on-the-fly like the existing Plugin Directory does, and how the Themes Directory does it).

Tasks:

  • Have all directory strings marked for translation
  • Present Translated plugin data on views
  • Have that Translated plugin data available for seamless searching
  • Trigger the GlotPress code imports, which requires creating the .pot's for both the readme and plugin code.

Attachments (6)

plugin-import.jpeg (609.4 KB) - added by ocean90 8 years ago.
plugin-front-end.jpeg (440.6 KB) - added by ocean90 8 years ago.
1691.patch (4.4 KB) - added by ocean90 8 years ago.
1691.2.patch (5.4 KB) - added by ocean90 8 years ago.
1691.3.patch (6.8 KB) - added by ocean90 8 years ago.
class-plugin-i18n.diff (13.0 KB) - added by tellyworth 8 years ago.

Change History (55)

#1 @dd32
8 years ago

  • Type changed from defect to task

#2 @dd32
8 years ago

  • Description modified (diff)
  • Priority changed from normal to high

#3 @ocean90
8 years ago

Let's start with status quo:

  • We have four systems: bbPress, SVN tracker, WordPress with GlotPress, and Slack.
    • bbPress inits the SVN tracker.
    • SVN tracker checks revision numbers and starts the build process on each of the 5 web nodes.
    • web1 takes care of updating the data in bbPress and that's where the i18n process is hooked into too.
  • The i18n process is split into two parts: process_code and process_readme
    • process_readme runs if the changeset includes a change to a readme file.
    • process_code runs if the changeset includes more than a readme change.
    • If a changeset includes both then both processes (code first) run.
  • In GlotPress we have four projects for each plugin:
    • dev - Code project for trunk
    • stable - Code project for the stable_tag
    • dev-readme - Readme project for the readme.(md|txt) in trunk
    • stable-readme - Readme project for the readme.(md|txt) in stable_tag
  • The i18n process is smart enough to only update the dev project when something was changed in the trunk directory.
  • For the rest of the import see plugin-import.jpeg.
  • For the front end we have several caches, which is the reason why none of the translated data is searchable. The graphic plugin-front-end.jpeg makes this clear.
  • Changes to readme translations are live immediately.
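
The process_code / process_readme split described above can be sketched roughly as follows. This is only an illustration of the stated rules, not the existing import code: the function name i18n_processes_for_changeset() and the case-insensitive readme match are assumptions.

```php
<?php
// Hypothetical helper, not the directory's actual code: decides which
// i18n processes to run for a changeset, given its changed file paths.
function i18n_processes_for_changeset( array $changed_paths ): array {
	$has_readme = false;
	$has_code   = false;

	foreach ( $changed_paths as $path ) {
		// readme.md or readme.txt; matching case-insensitively is an assumption.
		if ( preg_match( '/(^|\/)readme\.(md|txt)$/i', $path ) ) {
			$has_readme = true;
		} else {
			$has_code = true;
		}
	}

	$processes = [];
	if ( $has_code ) {
		$processes[] = 'process_code'; // Code runs first when both apply.
	}
	if ( $has_readme ) {
		$processes[] = 'process_readme';
	}

	return $processes;
}
```

A changeset touching only trunk/readme.txt would yield just process_readme, while one touching a readme and a PHP file would yield process_code followed by process_readme.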

The future:

Both graphics include some ⚠️ for a few steps. These should be resolved with the new directory. The front end has the biggest one. Some ideas on how we could solve this:

  • Store translated data in post meta:
    • For example title_de_DE holds the title in German, short_description_it_IT holds the short description in Italian. But that doesn't scale well IMO. We've around 70 active locales but the system should be designed to work with all locales, ~170 currently.
    • Let's say we have 5 meta fields: 170x5 = 850 fields for one plugin. 850 * 44k plugins = too many. ;-)
  • Store translated data in another post type:
    • For example the post type plugin_de_de includes all the translated data into German. (Note: The name of a post type is limited to 20 chars. Maybe use an ID instead: plugin_1 with 1 = de_DE or plugin_2 with 2 = de_DE_formal.)
    • For the relationship we can use Post 2 Post.
    • Using a post type doesn't require doing bulk updates as mentioned in #1692 and fits with the idea of having something like content_de.
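
To put numbers on the two constraints above, here is a small sketch using the figures quoted in this comment. The helper locale_post_type() is hypothetical, and the lowercased plugin_ prefix scheme is just the naming floated above, not a decided format.

```php
<?php
// Figures quoted in the comment above.
$locales     = 170;
$meta_fields = 5;
$plugins     = 44000;

$fields_per_plugin = $locales * $meta_fields;       // 850
$total_meta_rows   = $fields_per_plugin * $plugins; // 37,400,000

// Hypothetical helper: builds a per-locale post type name and reports
// whether it fits the 20-character limit for post type names.
function locale_post_type( string $locale ): array {
	$name = 'plugin_' . strtolower( $locale );

	return [ $name, strlen( $name ) <= 20 ];
}
```

With this scheme de_DE_formal still fits (plugin_de_de_formal is 19 characters), but a longer locale code such as de_CH_informal would produce a 21-character name, which is why a numeric ID may be the safer key.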

#4 @dd32
8 years ago

bbPress inits the SVN tracker.

This is the major part of the system which hasn't yet been written. I've started a few things on this, but nothing solid that we can use.
My current thinking is that we should be building this upon a cron system like Cavalcade, which would make the process far simpler and reduce the reliance upon a single server to process plugin updates.

However, I don't expect that's going to be an option for us to start with, so I expect we'll need to build a script similar to bbPress's which is called every minute and processes svn revs from $last to HEAD triggering plugin imports to the directory, ZIP Invalidation/rebuilding where needed, and finally triggering GlotPress imports via .pot files.
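
As a rough sketch of what such a per-minute script could look like: everything below is hypothetical (the function name, the callables, and the assumption that $last itself has already been processed). It only shows the loop from $last to HEAD with the per-plugin steps run in order.

```php
<?php
// Hypothetical sketch: every name here is illustrative, not existing code.
// $changed_plugins_in maps a revision number to the plugin slugs it touched;
// $steps are callables run in order for each touched plugin, for example
// directory import, ZIP rebuild, GlotPress import.
function process_revisions( int $last, int $head, callable $changed_plugins_in, array $steps ): int {
	// Assumes $last was handled by the previous run, so start at $last + 1.
	for ( $rev = $last + 1; $rev <= $head; $rev++ ) {
		foreach ( $changed_plugins_in( $rev ) as $slug ) {
			foreach ( $steps as $step ) {
				$step( $slug, $rev );
			}
		}
	}

	return $head; // To be stored as the new $last for the next run.
}
```

Keeping the steps as an ordered list of callables would make it straightforward to move the same steps into cron jobs later if something like Cavalcade becomes an option.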

Store translated data in another post type:

This is the direction I was initially going to suggest going with, or perhaps storing it in a sub-page. For example, /plugin-name/ and /plugin-name/de_DE/ would exist, the latter being in a different post_type - we'd transparently switch out the translated post with the main post object on the fly as needed.
No matter what option is chosen, there's going to be a lot more data stored in the DB, which isn't really an issue. The biggest issue is the searching: having some way to return the right information and not duplicating results between a specific locale & English, etc.

@tellyworth has been looking at ways to get the translated data into search. Last I heard was that re-using the translate-on-the-fly that bbPress is currently doing is the easiest option for display, and then just pushing the translated data into ElasticSearch in other ways.

#5 @ocean90
8 years ago

or perhaps storing it in a sub-page.

That's a third option I hadn't thought of until now. Not sure why it needs another CPT, why can't it be the same?

Last I heard was that re-using the translate-on-the-fly that bbPress is currently doing is the easiest option for display

I'm pretty against re-using that for the new directory. Nothing should be stored in a custom object cache. Let's make use of WP's own caching behavior for posts, etc. And let's not make GlotPress another source for Elasticsearch.
If you want I can start with a POC for storing the translated data in sub-page.

#6 @dd32
8 years ago

That's a third option I hadn't thought of until now. Not sure why it needs another CPT, why can't it be the same?

My only reason for storing it in its own CPT would be to have a simpler way to differentiate between main-post-object and translated-post-object.

Last I heard was that re-using the translate-on-the-fly that bbPress is currently doing is the easiest option for display

I'm pretty against re-using that for the new directory.

I don't like it either, but keeping the status-quo where no better solution has been determined isn't a step backwards.

#7 @dd32
8 years ago

Also, I'd love to see someone do a POC, just to be clear; I'm not working on this.

#8 @ocean90
8 years ago

  • Owner set to ocean90
  • Status changed from new to reviewing

This ticket was mentioned in Slack in #meta by obenland. View the logs.


8 years ago

#10 @obenland
8 years ago

  • Milestone set to Plugin Directory v3 - M5

#11 @ocean90
8 years ago

1691.3.patch is a first pass to display the translation of plugin names, the readme content, descriptions and screenshots.

It uses a hidden post type plugin_translated and the filters the_title, get_the_excerpt, get_post_metadata and wporg_plugins_content. The last one is a custom filter because get_the_content() doesn't have a filter. I read through some core tickets and adding a filter there might not be the best idea. A filter in get_post() like the one proposed in #core12955 could be helpful here.

To store the translation of a plugin we can use:

<?php
$id = wp_insert_post( [
        'post_type'    => 'plugin_translated',
        'post_status'  => 'publish', // Inherit?
        'post_name'    => 'ja', // strtolower( get_locale() )
        'post_title'   => 'translated',
        'post_content' => "<!--section=description-->\nTranslated description\n<!--section=installation-->\nTranslated installation\n<!--section=changelog-->\nTranslated changelog",
        'post_excerpt' => 'translated',
        'post_parent'  => 180, // ID of a plugin post
], true );

add_post_meta( $id, 'screenshots', [ 1 => 'Translated screenshot description', 2 => 'Translated screenshot description', 3 => 'Translated screenshot description', 4 => 'Translated screenshot description', 5 => 'Translated screenshot description' ] );

#12 follow-up: @tellyworth
8 years ago

class-plugin-i18n.diff adds a Plugin_I18n class for translating plugin content. Includes filters for the_title, the_content, get_the_excerpt. With one minor change to the theme, it works for me.

The Plugin_I18n code is mostly adapted from plugins-plugins/svn-track/class.dotorg-plugins-i18n.php.

Currently, translation is done at display time via the filters. If we decide to switch to storing translated content in postmeta for search, the class should be reusable more or less as-is - replace the output filters with something that stores meta at update time.

Search translation will work easily if translated content is available in postmeta, for example as fr_title and fr_content. With this patch, an additional filter shim can hook into the Jetpack sync and add fake translated content in meta keys, much like filter_shim_postmeta does now for the downloads count. That gives us translated content and translated search without requiring additional data storage; and in a way that could be easily adapted to store translated data (in postmeta) if that becomes necessary for performance or other reasons.
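
For illustration, the locale-prefixed meta keys mentioned above could be built along these lines. This is a sketch only: translated_meta_keys() is a hypothetical helper, not part of the patch, and it assumes the prefix_field key format from the fr_title / fr_content examples.

```php
<?php
// Hypothetical helper: maps translated fields to locale-prefixed meta keys
// such as fr_title and fr_content. The key format follows the examples in
// the comment above; the function itself is illustrative.
function translated_meta_keys( string $locale, array $translated ): array {
	$meta = [];

	foreach ( $translated as $field => $value ) {
		$meta[ "{$locale}_{$field}" ] = $value;
	}

	return $meta;
}
```

The same array could either be injected as fake meta during sync or written with real post meta calls later, which is the flexibility described above.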

#13 in reply to: ↑ 12 @ocean90
8 years ago

Replying to tellyworth:

Search translation will work easily if translated content is available in postmeta, for example as fr_title and fr_content.

Wish I would have known this sooner. :) Post meta is definitely easier since we don't have to care about the relationships.

If we want to go live without a persistent storage I'm fine with that since I can't spend much time on this at the moment. But I still think that the whole readme translation process needs an overhaul long-term (like parsing the readme.txt instead of HTML).

class-plugin-i18n.diff looks good at first glance, besides some code styling issues:

  • Shouldn't the filters be moved into the if ( 'en_US' != get_locale() ) {} condition?
  • What's the purpose of wp_get_locale()? Looks like it's just get_locale(). (Note: WP_LANG doesn't exist.)
  • Let's replace all translate_ table prefixes with GLOTPRESS_TABLE_PREFIX

And while looking at the code I remembered that we already have a GlotPress Translate Bridge (https://meta.trac.wordpress.org/browser/sites/trunk/wordpress.org/public_html/wp-content/plugins/glotpress-translate-bridge/glotpress-translate-bridge.php) which we're using for themes... I think there are some opportunities to combine both.

#14 @tellyworth
8 years ago

In 3319:

Display translated readme content on the front-end.

This does on-the-fly translation rather than storing the data. It could and probably should eventually be changed to store that content in postmeta, in a way that is compatible with ElasticSearch indexing.

See #1691

This ticket was mentioned in Slack in #meta by obenland. View the logs.


8 years ago

This ticket was mentioned in Slack in #meta by obenland. View the logs.


8 years ago

#18 @obenland
8 years ago

In 3386:

Plugin Directory: Add support for hreflang link attributes.

See #1691.

#19 @ocean90
8 years ago

In 3391:

Plugin Directory: Use the correct table prefix for the GlotPress tables.

See #1691.

#20 @ocean90
8 years ago

There is a PHP notice on a dashboard of a localized site: Notice: Trying to get property of non-object in /wp-content/plugins/plugin-directory/class-plugin-directory.php on line 501. And [3319] doesn't handle screenshot descriptions.

#21 @obenland
8 years ago

In 3451:

Plugin Directory: Fix undefined property notices on rosetta sites.

See #1691.

#22 @ocean90
8 years ago

In 3468:

Plugin Directory: Pass the post object to Plugin_I18n::translate().

Fixes missing translations for excerpts on the front page.

See #1691.

#23 @ocean90
8 years ago

In 3492:

Plugin Directory: Add a simple Slack client which will be used for plugin imports.

See #1691.

#24 @ocean90
8 years ago

In 3497:

Plugin Directory: Initial commit for a readme POT generator/importer.

See #1691.

#25 @tellyworth
8 years ago

In 3498:

Fix some issues with the pseudo-meta translated content for JP sync.

See #1691, #1692

#26 @ocean90
8 years ago

In 3503:

Plugin Directory: Add Makepot lib as a SVN external.

See #1691.

#27 @ocean90
8 years ago

In 3507:

Plugin Directory: Initial commit for a code POT generator/importer.

See #1691.

#28 @ocean90
8 years ago

In 3515:

Plugin Directory: Import translations on initial commit.

See #1691.

#29 @ocean90
8 years ago

In 3518:

Plugin Directory: Enable i18n processing for our test plugin.

See #1691.

#30 @tellyworth
8 years ago

In 3519:

Plugin directory: include all available translations in pseudo-meta for search.

Also add some defensive code, better testing and bugfixes.

See #1691, #1692

#31 @tellyworth
8 years ago

In 3521:

Plugin directory: fix title translation in [3519]

See #1691

#32 @tellyworth
8 years ago

In 3522:

Plugin directory: search locale-specific fields when get_locale() is non-English.

English fields are still searched as a low-weighted default, in case translated content is not available.

See #1691, #1692

#33 @ocean90
8 years ago

In 3523:

Plugin Directory: Move plugin validation into a separate method.

See #1691.

#34 @tellyworth
8 years ago

In 3543:

Plugin directory: use WP_Http to query the translate API.

See #1691, #1692

#35 @ocean90
8 years ago

In 3549:

Plugin Directory: Import readmes to the correct GlotPress project.

See #1691.

#36 @ocean90
8 years ago

What's missing here:

  • Screenshot descriptions are not translated. 1691.3.patch includes a filter for the post meta which could be used.
  • A logger for GlotPress imports which sends the output to Slack. The client was added in [3492].

This ticket was mentioned in Slack in #meta by obenland. View the logs.


8 years ago

#38 @obenland
8 years ago

  • Milestone changed from Plugin Directory v3 - M5 to Plugin Directory v3 - M6

This ticket was mentioned in Slack in #meta by obenland. View the logs.


8 years ago

This ticket was mentioned in Slack in #meta by obenland. View the logs.


8 years ago

#41 @obenland
8 years ago

  • Milestone changed from Plugin Directory v3 - M6 to Plugin Directory v3 - M7

This ticket was mentioned in Slack in #meta by obenland. View the logs.


8 years ago

This ticket was mentioned in Slack in #meta by obenland. View the logs.


8 years ago

This ticket was mentioned in Slack in #meta by tellyworth. View the logs.


7 years ago

#45 @tellyworth
7 years ago

  • Milestone changed from Plugin Directory v3 - M7 to Plugin Directory v3 - M8

#46 @tellyworth
7 years ago

  • Milestone changed from Plugin Directory v3 - M8 to Plugin Directory v3 - M9

This ticket was mentioned in Slack in #meta by tellyworth. View the logs.


7 years ago

#48 @tellyworth
7 years ago

  • Resolution set to fixed
  • Status changed from reviewing to closed

Closing as fixed - if there are specific issues with i18n they belong in new tickets.

#49 @samuelsidler
7 years ago

  • Milestone Plugin Directory v3 - M9 deleted