WordPress.org

Making WordPress.org

Opened 12 months ago

Last modified 12 months ago

#2925 new defect

occurence of literal '\0' in docblock `@param` of wp_kses_no_null() causes a null character in the corresponding meta_value

Reported by: pbiron Owned by:
Milestone: Priority: normal
Component: Developer Hub Keywords: has-screenshots
Cc:

Description

The docblock for wp_kses_no_null() is as follows:

<?php
/**
 * Removes any invalid control characters in $string.
 *
 * Also removes any instance of the '\0' string.
 *
 * @since 1.0.0
 *
 * @param string $string
 * @param array $options Set 'slash_zero' => 'keep' when '\0' is allowed. Default is 'remove'.
 * @return string
 */

The occurrence of \0 in the description of @param array $options results in a null character being written to the corresponding meta_value.

I discovered this after exporting my local mirror of DevHub and then doing an XML parser of the WXR: the null character is illegal in XML and was reported as such by the XML parser.

I don't know if this is a problem with parsedown or somewhere else in phpdoc-parser. Oddly enough, the occurrence of \0 in the short description causes no problems.

Double-escaping \0 in the @param corrects the problem, as in:

<?php
/**
 * Removes any invalid control characters in $string.
 *
 * Also removes any instance of the '\0' string.
 *
 * @since 1.0.0
 *
 * @param string $string
 * @param array $options Set 'slash_zero' => 'keep' when '\\\\0' is allowed. Default is 'remove'.
 * @return string
 */

As of today, this is the only occurrence of \0 in a @param in all of the sources

Attachments (2)

2925.patch (539 bytes) - added by Dency 12 months ago.
wp_kses_no_null.png (14.9 KB) - added by pbiron 12 months ago.
screenshot from Code Reference that shows how misleading the current display in the CodeRef is

Download all attachments as: .zip

Change History (7)

This ticket was mentioned in Slack in #docs by pbiron. View the logs.


12 months ago

@Dency
12 months ago

This ticket was mentioned in Slack in #meta by dixita. View the logs.


12 months ago

#3 follow-up: @Otto42
12 months ago

Umm, no, we are not changing the core docblocks to fix a problem in the devhub parser.

If the devhub parser can't handle it, then the parser needs to be fixed.

#4 in reply to: ↑ 3 @pbiron
12 months ago

Replying to Otto42:

Umm, no, we are not changing the core docblocks to fix a problem in the devhub parser.

If the devhub parser can't handle it, then the parser needs to be fixed.

I completely agree! For example, see #2900.

I mentioned the double-escaping "fix" for a couple of reasons:

  1. as possible work around if the parsedown or phpdoc-parser can't be quickly fixed...because as it stands, the display of that Code Ref page is misleading and could result in developers thinking wp_kses_no_null() is not doing what the documentation says it should.
  2. I haven't had the time to dig into parsedown or the rest of phpdoc-parser to be positive they are the cause of the problem; tho as stated in the ticket description, I'm think that's where the problem lies.
  3. I wasn't sure that phpdoc-parser was actually supposed to handle that escape sequence. I haven't been able to find any documentation on what "markdown/escape sequences" are acceptable in docblocks. There is nothing about this mentioned in PHP Documentation Standards about this. Hint: it would be really good to document that so that contributors know what is/is not allowed.

@pbiron
12 months ago

screenshot from Code Reference that shows how misleading the current display in the CodeRef is

#5 @pbiron
12 months ago

  • Keywords has-screenshots added

As shown in the screenshot, the current display of this page in the Code Ref suggests that array( 'slash_zero' => 'keep' ) means that the empty string is allowed, where it actually means the null character is allowed.

Last edited 12 months ago by pbiron (previous) (diff)
Note: See TracTickets for help on using tickets.