#5154 closed defect (fixed)
Automatically fix some translation errors
Priority: normal
Component: Translate Site & Plugins
Description
Split off from #5152
There are some common translation warnings which can be automatically corrected, simplifying the translation process by reducing the amount of time a translator (and editor) has to spend checking translations.
The two main ones I see right now that can be automatically fixed are:
- Newlines, either prefixed or suffixed to originals/translations
- Unicode percent signs being used rather than ASCII percent signs in placeholders
Are there any others? I'm hesitant to fix mangled HTML tags, as although most are usually just an extra space around a < or </, it's a good sign of a machine translation that's usually not perfect.
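As a rough illustration of the two fixes listed above, here is a minimal Python sketch. It is not the code used on the translate site; the function names, the set of look-alike percent signs, and the simplified placeholder pattern (only `s`, `d`, `f` conversions) are all assumptions made for the example.

```python
import re

# Assumed look-alikes: fullwidth (U+FF05), small (U+FE6A), Arabic (U+066A).
UNICODE_PERCENTS = "\uFF05\uFE6A\u066A"

# A look-alike percent sign directly followed by something shaped like a
# printf conversion, optionally with a positional argument such as 1$.
_FAKE_PLACEHOLDER_RE = re.compile(rf"[{UNICODE_PERCENTS}](?=(?:\d+\$)?[sdf])")


def match_outer_newlines(original: str, translation: str) -> str:
    """Give the translation the same leading/trailing newlines as the original."""
    body = original.lstrip("\n")
    lead = original[: len(original) - len(body)]
    trail = body[len(body.rstrip("\n")):]
    return lead + translation.strip("\n") + trail


def fix_percent_signs(translation: str) -> str:
    """Swap look-alike percent signs for ASCII '%' only inside placeholders."""
    return _FAKE_PLACEHOLDER_RE.sub("%", translation)


def autofix(original: str, translation: str) -> str:
    """Apply both fixes (hypothetical helper combining the two above)."""
    return fix_percent_signs(match_outer_newlines(original, translation))
```

A percent sign that is not followed by a conversion character (for example in "100％ seguro") is left alone, since it is probably intentional typography rather than a broken placeholder.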
Change History (16)
#2
@
12 months ago
I agree that automatically fixing signs of machine translation should be avoided.
As a vector for the future, perhaps some automatic fixes could be offered for specific locales, but that would probably become a project on its own.
This ticket was mentioned in Slack in #polyglots by casiepa.
12 months ago
#4
@
12 months ago
Based on experience with other tools such as Transifex, Poedit, Lokalize and, I think, Crowdin, I think it is better to offer suggestions than to fix things automatically for you.
That is, an opt-in fix for any string.
Example: the sentence is missing a final period; would you like to add it? Yes/No, and the tool then applies it for you.
This can be helpful for sanitization and would allow custom warnings for specific locales, for example about using different Unicode symbols and so on.
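The opt-in idea above could look roughly like the following Python sketch: checks return suggestions, and nothing is changed until the translator accepts one. The `Suggestion` type and both function names are hypothetical, and the final-period rule is only an example; a real check would need per-locale rules.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Suggestion:
    message: str  # question shown to the translator
    fixed: str    # the translation with the fix applied


def suggest_fixes(original: str, translation: str) -> List[Suggestion]:
    """Return optional fixes; the translation itself is never modified here."""
    suggestions = []
    # Example rule only: the original ends with a period, the translation does not.
    if original.rstrip().endswith(".") and not translation.rstrip().endswith("."):
        suggestions.append(
            Suggestion(
                message="The translation is missing a final period. Add it?",
                fixed=translation.rstrip() + ".",
            )
        )
    return suggestions


def apply_if_accepted(translation: str, suggestion: Suggestion, accepted: bool) -> str:
    """Apply the fix only when the translator answered Yes."""
    return suggestion.fixed if accepted else translation
```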
#5
@
12 months ago
I agree that not everything should be fixed automatically. This was a case where the fixes were obvious and "always" right, and they have significantly reduced the number of warnings (IMHO).
Some warnings/fixes could be applied prior to the submission of the string, such as highlighting missing placeholders or tags. Offering automatic fixes at that point would make sense for things like HTML tags, as would highlighting additions/deletions between the original and the translation.
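A pre-submission check for missing placeholders, as described above, might be sketched like this in Python. The function name and the simplified placeholder pattern (only `s`, `d`, `f` conversions, with optional positional arguments) are assumptions for illustration; escaped `%%` sequences are not handled.

```python
import re
from collections import Counter

# Simplified printf placeholder pattern, e.g. %s, %d, %1$s (an assumption).
PLACEHOLDER_RE = re.compile(r"%(?:\d+\$)?[sdf]")


def placeholder_diff(original: str, translation: str):
    """Return (missing, extra) placeholders in the translation vs. the original."""
    want = Counter(PLACEHOLDER_RE.findall(original))
    have = Counter(PLACEHOLDER_RE.findall(translation))
    missing = sorted((want - have).elements())
    extra = sorted((have - want).elements())
    return missing, extra
```

The two lists could then drive the highlighting in the editor before the string is submitted, rather than producing a warning afterwards.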
This ticket was mentioned in Slack in #polyglots by nao.
12 months ago
#9
@
11 months ago
In monitoring the #polyglots-warnings channel, the only other things I've seen that would be reasonable to autocorrect are:
- non-ASCII $ in printf placeholders; there are a few other Unicode variants of the dollar sign
- non-ASCII characters used in printf placeholders, such as a Unicode S variant
Those seem to happen very rarely, so I'm going to skip adding anything for those and close this ticket as fixed for now.
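For reference only, since nothing is being added for these cases: one way such variants could be normalized is Unicode NFKC normalization restricted to placeholder-shaped runs. The sketch below assumes the variants are the fullwidth and small dollar signs and the fullwidth letters s and d; the function name and the character list are illustrative, not taken from the warnings channel.

```python
import re
import unicodedata

# '%', an optional positional argument ending in an ASCII, fullwidth (U+FF04)
# or small (U+FE69) dollar sign, then an ASCII or fullwidth 's' / 'd'.
CANDIDATE_RE = re.compile(r"%(?:[0-9]+[$\uFF04\uFE69])?[sd\uFF53\uFF44]")


def normalize_placeholders(translation: str) -> str:
    """NFKC-normalize only the placeholder-shaped parts of the translation."""
    return CANDIDATE_RE.sub(
        lambda m: unicodedata.normalize("NFKC", m.group(0)), translation
    )
```

Text outside the matched runs is left untouched, so legitimate fullwidth characters elsewhere in the translation are preserved.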
There's another ticket to add some JS-based warnings pre-submit as well, which will hopefully remove the need for this in the first place and/or support auto-fixing some warnings.
If the warning logging that will hopefully be added as part of #5152 reveals anything major, we can re-open or create a new ticket.
#16
@
8 weeks ago
Follow up to remove some warnings: #5621
Another auto-correct could be the following; it would both speed up translators in-the-know and address what seems to be a common issue, looking through the generated warnings.
- If no printf placeholders are present in the translation but they exist in the original, and "% " (space inclusive) is contained within the translation, replace them in order, i.e. "% ba %" would end up as "%s to %s" if that's what the original was.
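A minimal Python sketch of this idea follows. It reads the example as meaning that the placeholders are copied from the original in order while the translated words are kept, so "% ba %" against an original of "%s to %s" becomes "%s ba %s". The function name, the simplified placeholder pattern, treating a "%" at the end of the string like one followed by a space, and the requirement that the counts match are all assumptions.

```python
import re

# Simplified printf placeholder pattern, e.g. %s, %d, %1$s (an assumption).
PLACEHOLDER_RE = re.compile(r"%(?:\d+\$)?[sdf]")
# A bare '%' followed by whitespace or the end of the string.
BARE_PERCENT_RE = re.compile(r"%(?=\s|$)")


def fill_bare_percents(original: str, translation: str) -> str:
    """Replace bare '%' in the translation with the original's placeholders, in order."""
    placeholders = PLACEHOLDER_RE.findall(original)
    # Only act when the original has placeholders and the translation has none.
    if not placeholders or PLACEHOLDER_RE.search(translation):
        return translation
    # Assumed safety check: do nothing unless the counts line up exactly.
    if len(BARE_PERCENT_RE.findall(translation)) != len(placeholders):
        return translation
    remaining = iter(placeholders)
    return BARE_PERCENT_RE.sub(lambda m: next(remaining), translation)
```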
Leaving this as closed, just noting the idea.
In 9741: