WordPress.org

Making WordPress.org

Opened 3 months ago

Last modified 7 days ago

#5207 assigned enhancement

Improve heuristics for detecting block assets

Reported by: tellyworth Owned by: coreymckrill
Milestone: Priority: high
Component: Plugin Directory Keywords: needs-patch
Cc:

Description

The block directory attempts to detect block names and assets, so they can be returned by the API.

If the scripts and blocks are defined in a block.json file, there's no ambiguity. But since almost no plugins use block.json files as yet, the plugin directory's svn import code contains some heuristic code that attempts to guess which asset files are needed.

Currently those heuristics are lacking, so many (most?) blocks in the block directory don't have any assets stored or returned by the API.

Here's an example of a block that does have some block_assets:

https://api.wordpress.org/plugins/info/1.2/?action=query_plugins&request[block]=listicle&request[per_page]=3&request[locale]=en_US&request[wp_version]=5.5

And here's one that doesn't:

https://api.wordpress.org/plugins/info/1.2/?action=query_plugins&request[block]=rubi&request[per_page]=3&request[locale]=en_US&request[wp_version]=5.5

The heuristic code is here:

https://meta.trac.wordpress.org/browser/sites/trunk/wordpress.org/public_html/wp-content/plugins/plugin-directory/cli/class-import.php#L445

Currently it basically just looks for JS and CSS files in the build or dist subdirs. That needs to be made smarter, perhaps with special cases for some of the plugins currently in the block directory (https://wordpress.org/plugins/browse/block/).

Attachments (3)

5207_tag-path.diff (1.4 KB) - added by coreymckrill 3 months ago.
Fix block files path when $stable_tag is a version number
5207-assets.diff (6.1 KB) - added by coreymckrill 3 months ago.
First stab at refactoring block and asset detection
5207-subdir-check.diff (5.7 KB) - added by coreymckrill 12 days ago.

Download all attachments as: .zip

Change History (18)

#1 @ryelle
3 months ago

I think plugins that use /tags/* are not indexed correctly— check out Listicles, it returns assets as "/1.0.6/src/block/listdd.js", …, but there are two problems:

  1. That doesn't translate to the full path, it should return tags/1.0.6/…, since this creates the link https://plugins.svn.wordpress.org/listicles/1.0.6/src/block/listdd.js
  2. That's not the right file to include, it's not finding the files in dist — because of the above problem, export_and_parse_plugin is searching in {$plugin_slug}/{$stable_tag}, and $stable_tag is just the version number.

@coreymckrill
3 months ago

Fix block files path when $stable_tag is a version number

#2 @coreymckrill
3 months ago

With the above patch, which addresses the problem noted by @ryelle, the paths used to find files and the paths stored in postmeta will now include tags/ if the stable tag is a version number instead of trunk.

Running this on a couple of known block plugins, we get new results:

https://api.wordpress.org/plugins/info/1.2/?action=query_plugins&request[block]=listicle&request[per_page]=3&request[locale]=en_US&request[wp_version]=5.5

This now shows files from dist instead of src, and includes /tags in the file paths.

https://api.wordpress.org/plugins/info/1.2/?action=query_plugins&request[block]=rubi&request[per_page]=3&request[locale]=en_US&request[wp_version]=5.5

This didn't have any block assets shown before, but now has /tags/1.0.1/build/index.js

We should probably still add some additional heuristics here, but this is a good first step.

#3 @tellyworth
3 months ago

In 9892:

Plugin dir: improve block asset detection

Fixes the logic for handling tag paths.

Props coreymckrill, ryelle.
See #5207.

#4 @tellyworth
3 months ago

Nice! That takes it up from 13 plugins with asset files, to 28 (of 32 currently). Currently the ones missing assets are:

layout-grid
pdf-viewer-block
embed-block-for-github
starscape

There's a separate but similar issue with the block metadata stored in the all_blocks array. Mainly because of the limitations of using regexes to detect the registerBlockType calls. The plugins in the block directory missing all_blocks are:

scratch-pad
simple-definition-list-blocks
emoji-conbini
ghstwrtr
meme-me
lazy-lists
altered-reality

#5 @coreymckrill
3 months ago

The plugins in the block directory missing all_blocks

I haven't checked the whole list, but at least a couple of them (both from sortabrilliant) aren't actually blocks. scratch-pad is an editor plugin that adds a text field to the block editor sidebar, but doesn't register any blocks. ghstwrtr adds a block style, but no block types.

In simple-definition-list-blocks our regex isn't picking up the registerBlockType call because it expects there to be parentheses and parameters included, but in this plugin the minified JS looks like this:

Object(l.registerBlockType)("simple-definition-list-blocks/list",r);

What if we modified the regex to just look for registerBlockType? Seems like the possibility of false positives is pretty low.

#6 @coreymckrill
3 months ago

Ah nevermind, then we wouldn't be able to get the name and title properties for the block...

@coreymckrill
3 months ago

First stab at refactoring block and asset detection

#7 @coreymckrill
3 months ago

The above patch refactors block and asset detection to prioritize data from any existing block.json files first. Then if those aren't available, or they don't contain any of the style/editor properties, it goes back to hunting for patterns, starting with build/dist directories.

This narrows down the list of blocks with no assets to one: pdf-viewer-block. This block unfortunately, I believe, has a directory structure that we cannot easily traverse using heuristics, without potentially getting a bunch of false positives.

This patch also improves the hits on existing block assets lists, reducing false positives. There were a couple that previously included both src and build versions of their scripts, but this is more successful at isolating just build files.

This patch does not do much to fix the list of plugins that are missing blocks. As noted above, I believe some of these don't contain actual blocks to begin with.

#8 @coreymckrill
3 months ago

Alright, so from the list above of block plugins with nothing in all_blocks, only simple-definition-list-blocks actually registers a block type. As mentioned above, the usage of registerBlockType is in a minified JS file and does not match with our regex.

All of the other plugins in the list modify existing blocks, add block styles, and/or add functionality to the block editor UI. But they don't actually add new blocks. Should these be in the block directory?

#9 @tellyworth
3 months ago

In 9934:

Plugin dir: improve block asset detection during import.

This refactors the code and does a smarter job of finding asset files. It successfully works with most blocks currently in the directory.

Props coreymckrill.
See #5207

#10 @tellyworth
3 months ago

I think we should probably remove those questionable blocks from the directory for now. They can always be re-submitted later.

#11 @tellyworth
2 months ago

In 9937:

Plugin dir: improve detection of blocks in code.

Use a more flexible regex for catching registerBlockType calls in JS code, especially minified code.

See #5207

#12 @tellyworth
2 months ago

In 9938:

Plugin dir: fix double slashes in patchs when importing assets.

See #5207

#13 @ryelle
2 weeks ago

I noticed that Voice Blocks isn't showing up in the directory because block_files is empty. It looks like there's an issue finding files in subdirectories that aren't build or dist, so the importer isn't finding the files blocks/voice/*

#14 @coreymckrill
12 days ago

The above patch does three things:

  • Adds info/warnings to the block plugin checker tool output to show how many JS/CSS asset files were found during the check.
  • Ensures that the block checker's call to Import::find_possible_block_assets includes other directories that might have asset files besides the plugin's root directory.
  • Updates the heurisitics in the Import::find_possible_block_assets method to recursively search possible directories for asset files, rather than only the directory itself.

With these changes, the Voice Blocks plugin shows 2 JS files and 0 CSS files in the checker. It looks like there are CSS assets in the plugin, but they are in a separate directory with no connection to the block's script files, so I'm not sure how we'd find those without potentially introducing a ton of false positives.

These changes themselves already do find a lot more files that are considered assets, and I'm not sure how many are false positives. When running bin/check-block.php (with some output modifications), here's the before and after for this patch:

Before / After

Total JS assets: 129 / 160
Total CSS assets: 77 / 84
Block plugins with no assets: 3 / 0

I guess the question is, will introducing these additional assets, some of which might be false positives, break the block directory?

#15 @ryelle
7 days ago

In get_detailed_help, you don't need to do the block data check, since this is only output on error & warning issues. It should never get here if data > 0. If you want to leave the check in, it should return a string to display, or false.

Finding extra files shouldn't be a problem right now, but I'm also seeing it not find some valid files. Testing with a recipe block I wrote, it fails to find my CSS file, and instead finds an extra JS file:

Found 0 CSS assets.
Found 2 JS assets.
Found file assets/js/blocks/recipe-directions/index.js.
Found file /build/recipe-block.js.

vs before the patch, it pulls from block.json correctly:

Found file /build/recipe-block.js.
Found file /build/recipe-block.css.
Note: See TracTickets for help on using tickets.