Making WordPress.org

Opened 9 months ago

Closed 2 months ago

Last modified 2 months ago

#2900 closed defect (fixed)

incorrect parsing of markdown in @param blocks

Reported by: pbiron Owned by: coffee2code
Milestone: Priority: normal
Component: Developer Hub Keywords: has-screenshots


There is a problem with phpdoc-parser incorrectly parsing some markdown in DocBlocs.

For example, see the $name__like @param in the hash of $query in WP_Term_Query::__construct().

The problem is actually in vendor/erusev/parsedown/Parsedown.php, rather than in phpdoc-parser, itself.

I just submitted a Pull Request against parsedown with a fix. If/when they accept/merge that PR, then DevHub should pull it and rebuild everything in the Code Reference.

If they don't accept/merge (or take a long time to do so), then a short term workaround is to modify the hash in all DocBlocs that use this pattern to be like:

 * ...
 *    @type string $name\__like  Retrieve terms with criteria by which a term is LIKE `$name\__like.
 * ...

i.e., add escapes to all occurrences of __ if when it appears twice in the same @param when the occurrence is not supposed indicate strong.

Attachments (1)

markdown-parsing-error.png (15.6 KB) - added by pbiron 9 months ago.
screenshot of WP_Term_Query::construct()

Download all attachments as: .zip

Change History (13)

9 months ago

screenshot of WP_Term_Query::construct()

#1 @pbiron
9 months ago

  • Keywords has-screenshots added

#2 @DrewAPicture
9 months ago

We already partially handle for this case in the \DevHub\get_params() function in the wporg-developer theme.

Doesn't make it any less of a problem, but we have already attempted to address this on DevHub. Though I notice that we're doing a str_replace() against $tag which is an array – should be $tag['content'].

Anyway, the problem in this particular case isn't actually bolding, it looks like the closing </code> tag isn't getting parsed properly from the closing backtick in the description for $name__like , which actually causes everything below it to be <code> formatted.

Last edited 9 months ago by DrewAPicture (previous) (diff)

#3 @DrewAPicture
9 months ago

Assuming we fix the $tag vs $tag['content'] thing, there's a still a problem with an opening <code> tag where it should be a closing </code> tag.

Here's what the raw section description looks like coming from parsedown before wporg-developer manipulates it for display:

Retrieve terms with criteria by which a term is LIKE                                                `$name__like<code>. Default empty.     @type string       $description__like

Following the fixed str_replace(), this is what we get:

Retrieve terms with criteria by which a term is LIKE                                                `$name</strong>like<code>. Default empty.     @type string       $description__like

Note the presence of the backtick before $name and <code> after like_

#4 @DrewAPicture
9 months ago

After a bit more digging, I'm thinking this might have to do with the <code> tags exception in format_param_description() added in [2206]. Removing 'code' from the array fixes the problem for display in my minimal testing.

#5 @DrewAPicture
9 months ago

Disregard what I said about $tag needing to be $tag['content']. Forgot that the subject in str_replace() can also be an array. Either way, in the scope of DevHub, the current problem with display has to do with the open <code> tag as described above.

Last edited 9 months ago by DrewAPicture (previous) (diff)

#6 @pbiron
9 months ago

Sorry I wasn't more clear in the description of this ticket.

The bug in parsedown only surfaces when all of the following are true:

  1. the param name contains __ (which parsedown treats as the start of <strong>)
  2. the param name also occurs in the param's description (which causes parsedown to treat the occurrence of __ in the param name as </strong>)
  3. the param name is wrapped in backticks, `, when it occurs in the param's description (because parsedown hasn't yet closed </strong>, it doesn't encode the opening backtick but does encode the closing backtick as the start of <code>)

Thus, the parsedown bug results in "unbalanced" tags. It's analogous to the the "Save as HTML/XML" bugs in old versions of MS Word, which often produced things like:

<p>this <b>is a <em>test</b> of the emergency</em> broadcast system</p>

when the Word document contained overlapping formatting.

I don't think it's worth hacking the DevHub theme to get around the parsedown bug, because trying to "rebalance" overlapping tags is hard and never foolproof (plus the fix to parsedown is so easy).

In case you didn't check out the PR I made against parsedown, you can temporarily fix your local copy by changing:

protected $StrongRegex = array(
    '*' => '/^[*]{2}((?:\\\\\*|[^*]|[*][^*]*[*])+?)[*]{2}(?![*])/s',
    '_' => '/^__((?:\\\\_|[^_]|_[^_]*_)+?)__(?!_)/us',


protected $StrongRegex = array(
    '*' => '/^[*]{2}((?:\\\\\*|[^*]|[*][^*]*[*])+?)[*]{2}(?![*])/s',
    '_' => '/^__((?:\\\\_|[^_]|_[^_]*_)+?)__(?!_)\b/us',

and then reslurping the source for class-wp-term-query.php and you'll see that there is no need to even touch the DevHub theme.

This ticket was mentioned in Slack in #docs by pbiron. View the logs.

9 months ago

This ticket was mentioned in Slack in #core-docs by zzap. View the logs.

7 months ago

#9 @SergeyBiryukov
5 months ago

#3221 was marked as a duplicate.

This ticket was mentioned in Slack in #docs by coffee2code. View the logs.

2 months ago

#11 @coffee2code
2 months ago

  • Owner set to coffee2code
  • Resolution set to fixed
  • Status changed from new to closed

In 6448:

developer.wordpress.org: Fix an upstream Parsedown bug that introduces unbalanced 'code' tags.

The bug still exists upstream and this "fix" merely seeks to undo the damage. Can be removed if/when Parsedown is fixed (but shouldn't pose a problem if it isn't).

Props pbiron.
See https://github.com/erusev/parsedown/pull/515.
Fixes #2900.

#12 @coffee2code
2 months ago

@pbiron: Thanks for your work on this, identifying the combination of criteria that triggers the bug and with submitting a PR upstream!

The commit above "fixes" the errant markup produced by the Parsedown bug. While the bug still exists upstream, I marked this ticket as fixed since there's not much else we can do on the DevHub side of things.

If you want to keep the issue alive, a ticket on the phpdoc-parser GitHub repo would seem more appropriate at this stage, to document that the bug exists, that it's a result of a bug in a dependency, and link to the PR that fixes it. With that as a reminder, someone can periodically check upstream to see if it ever gets fixed.


Note: See TracTickets for help on using tickets.