Opened 4 months ago

Last modified 3 months ago

#5152 assigned enhancement

Put a limit on adding new translation after multiple warnings

Reported by: Nao
Owned by: dd32
Milestone:
Priority: normal
Component: Translate Site & Plugins
Keywords:
Cc:

Description

There should be some kind of block that prevents users from entering poor-quality translation suggestions, including unreviewed machine-translated text.

These are some points I gathered to avoid wrongly flagging novice translators who are just making mistakes as they learn, while giving a hard stop to users whose translations burden locale teams.

Idea

For both file upload (import) and manual entry, limit a user once their translations have accumulated a certain number of warnings that they have not corrected.
The limit can be a ban for a set period of time, a ban until the mistakes are corrected, or both.
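
A minimal sketch of this rule, using the "both" variant (a time-based ban that also requires the warnings to be corrected). The threshold, ban length, and all names here are hypothetical; the ticket does not specify any of them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical values; the ticket does not specify a threshold or a ban length.
WARNING_THRESHOLD = 10
BAN_PERIOD = timedelta(days=7)


@dataclass
class TranslatorState:
    uncorrected_warnings: int                 # warnings the user has not fixed yet
    limited_since: Optional[datetime] = None  # when the limit was applied, if any


def is_limited(state: TranslatorState, now: datetime) -> bool:
    """Return True if the user may not submit new translations right now."""
    if state.limited_since is None:
        # Not limited yet: apply the limit once the threshold is reached.
        if state.uncorrected_warnings >= WARNING_THRESHOLD:
            state.limited_since = now
            return True
        return False
    # "Both" variant: the limit lifts only after the period has passed
    # AND the outstanding warnings are back under the threshold.
    period_over = now - state.limited_since >= BAN_PERIOD
    corrected = state.uncorrected_warnings < WARNING_THRESHOLD
    if period_over and corrected:
        state.limited_since = None
        return False
    return True
```

Dropping either `period_over` or `corrected` from the final condition gives the pure "ban until corrected" or pure "ban for a set period" variant.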

Flag these warnings

  • HTML mangled: Certain patterns that are obvious signs of machine translation should be rejected (e.g. spaces around opening/closing tags). But we need to be aware that regular contributors can also make typos.
  • Extra HTML attribute
  • Placeholders missing or extra added
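
As an illustration of the first and last items, here is a rough sketch of detecting stray spaces inside tags and mismatched printf-style placeholders. It is written in Python for brevity, is not GlotPress's actual implementation, and the function names and regular expressions are assumptions. It only handles `%s`, `%d`, `%f` and their numbered forms.

```python
import re
from collections import Counter

# A tag with whitespace right after "<" or "</", e.g. "< strong>" or "</ a>".
SPACE_AFTER_OPEN = re.compile(r"<(?:\s+/?\s*|/\s+)[A-Za-z][^<>]*>")
# A tag with whitespace right before ">", e.g. "<strong >".
SPACE_BEFORE_CLOSE = re.compile(r"</?[A-Za-z][^<>]*\s>")
# printf-style placeholders; "%%" is matched so it can be skipped as a literal.
PLACEHOLDER = re.compile(r"%%|%(?:\d+\$)?[sdf]")


def _looks_mangled(text: str) -> bool:
    return bool(SPACE_AFTER_OPEN.search(text) or SPACE_BEFORE_CLOSE.search(text))


def has_mangled_html(original: str, translation: str) -> bool:
    """Flag spaced-out tags in the translation that the original does not have."""
    # Checking the original too avoids flagging plain text such as "a < b and c > d".
    return _looks_mangled(translation) and not _looks_mangled(original)


def _placeholders(text: str) -> Counter:
    return Counter(p for p in PLACEHOLDER.findall(text) if p != "%%")


def placeholder_mismatch(original: str, translation: str) -> bool:
    """True if placeholders are missing or extra; reordering is allowed."""
    return _placeholders(original) != _placeholders(translation)
```

For example, `has_mangled_html("<strong>Save</strong>", "< strong>Guardar</ strong>")` is true, while a translation that only reorders `%1$s` and `%2$s` does not count as a placeholder mismatch.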

Ignore, or be lenient on these warnings

  • Different URL from the original: This is often necessary for pointing to localized docs.
  • Accidental newlines: should be automatically matched to the original.
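
One way the automatic matching of newlines could work is sketched below. It assumes the warning concerns newlines at the very start or end of a string; the function name is hypothetical.

```python
def match_outer_newlines(original: str, translation: str) -> str:
    """Give the translation the same leading/trailing newlines as the original."""
    if not original.strip("\n"):
        # The original consists only of newlines (or is empty): leave the input alone.
        return translation
    lead = len(original) - len(original.lstrip("\n"))
    trail = len(original) - len(original.rstrip("\n"))
    core = translation.strip("\n")
    return "\n" * lead + core + "\n" * trail
```

Newlines inside the string are left untouched, since those may be intentional.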

Not sure about this

  • Missing/too many tags: Sometimes I do this deliberately for an em element (removing the tags and using a different form of emphasis), since italic text is not common in Japanese writing. I can't think of any other case though, and I don't know whether other locales have something similar.

Other considerations

  • Extra flag for a user submitting to multiple locales with many warnings.
  • Banned users should be visible in #polyglots-warnings or somewhere private for GTEs or global mentors.
  • Never ban GTEs and PTEs on projects where they have validating rights.
  • Sometimes translators keep adding several translations with warnings because they don't know they can reject their own earlier translation. Should we overwrite (or ask?) when a new translation is being submitted for a previously translated string by the same user?
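
The exemption for validators and the extra multi-locale flag could be sketched as follows. The data model (a set of GTE locales, a set of PTE locale/project pairs, and per-locale warning counts) and both thresholds are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class Contributor:
    # Hypothetical model of a user's rights and warning counts.
    gte_locales: Set[str] = field(default_factory=set)
    pte_projects: Set[Tuple[str, str]] = field(default_factory=set)  # (locale, project)
    warnings_by_locale: Dict[str, int] = field(default_factory=dict)


def has_validating_rights(user: Contributor, locale: str, project: str) -> bool:
    """A GTE validates every project in their locale; a PTE only specific projects."""
    return locale in user.gte_locales or (locale, project) in user.pte_projects


def may_be_limited(user: Contributor, locale: str, project: str) -> bool:
    """Never limit a user where they have validating rights."""
    return not has_validating_rights(user, locale, project)


def needs_multi_locale_flag(
    user: Contributor, min_warnings: int = 5, min_locales: int = 3
) -> bool:
    """Extra flag: many warnings in each of several locales."""
    noisy = [loc for loc, n in user.warnings_by_locale.items() if n >= min_warnings]
    return len(noisy) >= min_locales
```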

Related: #4171

Attachments (2)

5152.diff (4.3 KB) - added by ocean90 4 months ago.
5152-logging.png (561.9 KB) - added by ocean90 3 months ago.

Change History (16)

This ticket was mentioned in Slack in #polyglots by nao.


4 months ago

#2 @jeroenrotty
4 months ago

One idea is to integrate some of the checks from the GlotDict extension into the GlotPress-WP project by default. GlotDict does a few extra checks on grammar, tags, leading or trailing spaces, etc. It can be found here: https://github.com/Mte90/GlotDict

#3 @garrett-eclipse
4 months ago

I can give a hand migrating the GlotDict warnings (JS) into the custom warnings (PHP) plugin here:
https://meta.trac.wordpress.org/browser/sites/trunk/wordpress.org/public_html/wp-content/plugins/wporg-gp-custom-warnings/wporg-gp-custom-warnings.php

Would all GlotDict checks be desired?

  • Validation for final "...", ".", ":"
  • Validation for final ;.!:、。؟?!
  • First letter in translation is not uppercase but the original string is
  • Detect first and last character if they are space
  • Missing term translated using the locale glossary
  • Check for curly apostrophe
  • Check for non typographic quotes

And would they be toggleable via user settings? In GlotDict, each warning can be silenced individually if it isn't a fit for the locale or translator.
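
To make the question concrete, here is a rough sketch of three of these checks with a per-user switch for each. It is written in Python purely for illustration (the real checks live in GlotDict's JS and would be ported to PHP), and the check names, the `disabled` parameter, and the exact rules are assumptions rather than GlotDict's actual behavior.

```python
from typing import Callable, Dict, FrozenSet, List

END_MARKS = ("...", ".", ":", ";", "!", "?")


def _ending(text: str) -> str:
    """Return the final punctuation mark of text, or "" if it has none."""
    text = text.replace("\u2026", "...")  # treat the ellipsis character like "..."
    for mark in END_MARKS:
        if text.endswith(mark):
            return mark
    return ""


def end_punctuation(original: str, translation: str) -> bool:
    return _ending(original) != _ending(translation)


def first_letter_case(original: str, translation: str) -> bool:
    # islower() is False for caseless scripts, so e.g. Japanese never triggers this.
    return original[:1].isupper() and translation[:1].islower()


def outer_space(original: str, translation: str) -> bool:
    def edges(text: str) -> tuple:
        return (text[:1].isspace(), text[-1:].isspace())
    return edges(original) != edges(translation)


CHECKS: Dict[str, Callable[[str, str], bool]] = {
    "end_punctuation": end_punctuation,
    "first_letter_case": first_letter_case,
    "outer_space": outer_space,
}


def run_checks(
    original: str, translation: str, disabled: FrozenSet[str] = frozenset()
) -> List[str]:
    """Return the names of all enabled checks that the translation fails."""
    return [
        name
        for name, check in CHECKS.items()
        if name not in disabled and check(original, translation)
    ]
```

With this sketch, a Japanese translation ending in "。" fails `end_punctuation` against an original ending in ".", which is the kind of locale-specific false positive that a per-user or per-locale `disabled` set would silence.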

#4 @Nao
4 months ago

I always thought this could be a good warning to display by default!

  • Missing term translated using the locale glossary

These sound good too, but they probably shouldn't count heavily toward a ban or limit?

  • Check for curly apostrophe
  • Check for non typographic quotes
  • Detect first and last character if they are space

These rules are great for many languages but could be confusing to others if they get warnings when writing properly in their language (especially for newbies).

  • Validation for final "...", ".", ":"
  • First letter in translation is not uppercase but the original string is (e.g. languages without upper/lowercase characters)

I only know two languages, so other opinions are welcome! :)

#5 @garrett-eclipse
4 months ago

Thanks @Nao for the feedback. If we do want to adopt the warnings that could be confusing, I would suggest we either make them optional or not implement them at all. These options could be handled through the Translation Settings (https://translate.wordpress.org/settings/).

#6 @Mte90
4 months ago

GlotDict creator here (but @garrett-eclipse is one of the maintainers too).
Migrating the checks into GlotPress or translate.wp.org, or adding new ones, is not a problem.
I think the big problem in GlotPress is that it is not possible to turn off specific warnings. Right now there is no feature to customize this kind of thing per user, so it will probably require some development in GlotPress.
A lot of the checks were added over the years based on requests and for specific languages, so it is probably better to first migrate the ones that are mandatory for all languages.

#7 @ocean90
4 months ago

Limiting users based on triggered warnings and adding new warnings should be handled separately. This ticket should only be about creating solid rules for a limit. That said, I agree that some of the warnings from GlotDict should be contributed back to GlotPress itself and not only to translate.w.org.

Currently we only log discarded warnings into a database table while new warnings are only pushed to the #polyglots-warnings channel on Slack.
As a quick first step, we should extend the database logging for warnings in general so we have some structured data which can be exported. This data can then be used to make some data-driven decisions.
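
A minimal sketch of what such structured, exportable logging could look like, using SQLite and CSV for illustration. The table name, columns, and function names are hypothetical and do not reflect the actual translate.wordpress.org schema.

```python
import csv
import io
import sqlite3
from datetime import datetime, timezone
from typing import Dict

COLUMNS = ["logged_at", "user", "locale", "translation_id", "warning", "message"]


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open the log database and create the (hypothetical) table if needed."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS translation_warnings (
               id INTEGER PRIMARY KEY,
               logged_at TEXT NOT NULL,
               user TEXT NOT NULL,
               locale TEXT NOT NULL,
               translation_id INTEGER NOT NULL,
               warning TEXT NOT NULL,
               message TEXT NOT NULL
           )"""
    )
    return db


def log_warnings(
    db: sqlite3.Connection,
    user: str,
    locale: str,
    translation_id: int,
    warnings: Dict[str, str],
) -> None:
    """Store one timestamped row per warning, so one translation can have several."""
    now = datetime.now(timezone.utc).isoformat()
    rows = [(now, user, locale, translation_id, key, msg) for key, msg in warnings.items()]
    db.executemany(
        "INSERT INTO translation_warnings "
        "(logged_at, user, locale, translation_id, warning, message) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        rows,
    )
    db.commit()


def export_csv(db: sqlite3.Connection) -> str:
    """Export all logged warnings as CSV text with a header row."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(COLUMNS)
    writer.writerows(
        db.execute(
            "SELECT logged_at, user, locale, translation_id, warning, message "
            "FROM translation_warnings ORDER BY id"
        )
    )
    return out.getvalue()
```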

#8 @ocean90
4 months ago

In 9739:

Translate: Add timestamp to discarded warning logging.

See #5152.

@ocean90
4 months ago

#9 @dd32
4 months ago

> As a quick first step we should extend the database logging for warnings in general so we have some structured data which can be exported. This data can then be used to make some data-driven decisions.

I was surprised to find out that the warnings are only stored within the translations table, so storing them like that is definitely a +1 from me.

This ticket was mentioned in Slack in #polyglots by nao.


4 months ago

#11 @ocean90
3 months ago

In 9890:

Translate: Log translation warnings to a database table for analysis.

See #5152.

#12 @ocean90
3 months ago

In 9891:

Translate: Log multiple warnings for the same translation.

See #5152.

@ocean90
3 months ago

#13 @ocean90
3 months ago

The extra logging for warnings is now enabled. I'll provide an export in a week or two so we can review the data.
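
For reviewing such an export, a simple tally per user and per warning type may be a useful starting point. The sketch below assumes a CSV export with `user` and `warning` columns; the real export format is not described in this ticket.

```python
import csv
import io
from collections import Counter
from typing import Tuple


def tally(export_text: str) -> Tuple[Counter, Counter]:
    """Count logged warnings per user and per warning type."""
    per_user: Counter = Counter()
    per_type: Counter = Counter()
    for row in csv.DictReader(io.StringIO(export_text)):
        per_user[row["user"]] += 1
        per_type[row["warning"]] += 1
    return per_user, per_type


# Made-up sample data, only to show the expected shape of the input.
SAMPLE = """user,locale,warning
alice,ja,tags
bob,de,placeholders
bob,fr,placeholders
bob,nl,tags
"""

per_user, per_type = tally(SAMPLE)
print(per_user.most_common(1))  # users with the most warnings first
print(per_type.most_common())   # which warnings fire most often
```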

#14 @ocean90
3 months ago

In 9893:

Translate: Include the message of a warning in translation warnings logging.

See #5152.
