Tips on sanitizing and validating WordPress plugin data

When I submitted the free version of ChimpBridge, I got an email back saying it was rejected.

But why?

Because data coming into the plugin wasn’t always being validated properly. Though a lot of the code was created by someone else, that’s no excuse and I should have done a better job checking that data is being properly validated and escaped before being saved to the database, and I also made some mistakes myself.

I thought I’d document some of the mistakes they caught and others I found while going through things line by line, in the hopes it will help you too!

Validate any $_POST, $_GET and $_REQUEST data before using/saving

There was a line in the code which just used the data direct from $_POST:

$listID = $_POST['listID'];

The POST data could be anything, and should never be trusted.  You’ve got a bunch of options to validate that data depending on what type of data you expect.

Is the data numeric?

In this case you can use intval or floatval to take the data and ensure it’s either an integer (no decimals) or float (with decimals):

intval( $_POST['postID'] );

Should the data only contain certain values?

In this case you can either do a direct check for the value, like:

'chimpbridge' === $_GET['post_type']

with a triple === to match type, or a few values, doing something like:

in_array( $_GET['value'], array( 'value1', 'value2', 'value3' ), true )

with the last parameter to in_array as true so it’s a type-specific comparison (thanks Stephane for the suggestion!).

Or you could use a switch statement.

Does the data contain a URL?

For example before making a remote request, the data can be escaped for any invalid or dangerous characters:

$response = wp_remote_request( esc_url_raw( $url ), $request );

If you’ll be displaying it on an HTML page use esc_url instead.

Should the data contain text, but no HTML?

Using strip_tags isn’t enough to sanitize the text data, but you can use something like:

sanitize_text_field( $_POST['my_text'] )

If the type of text data is more specific (such as an email address), there are more specific functions you can use such as sanitize_email.

Could the data contain ‘anything’ (including HTML)?

This one I wasn’t sure how to handle at first since really, I’m pretty much cool with any content being included.  But in reality there’s only a limited number of tags that makes sense.

If you want to be specific about what tags are allowed, you can use wp_kses directly, for example just a tags with the href and title attributes:

wp_kses( $_POST['data'], array( 'a' => array( 'href' => array(), 'title' => array() ) ) );

Alternatively, you can use the wp_kses_data (for only allowing limited HTML that post comments would have) or wp_kses_post (for more HTML that would be allowed in a post, which is what I used most of the time).

Escape data before outputting

PHP (Server Side)

Just because data was retrieved from the database, it still shouldn’t be trusted to not contain any malicious data.

This includes everything you echo out in your settings page and elsewhere.  As one example:

echo '<option value="' . $segment['id'] . '" ' . selected( $segment['id'], $selected_value ) . '>' . $segment['name'] . '</option>'

should be:

echo '<option value="' . esc_attr( $segment['id'] ) . '" ' . selected( $segment['id'], $selected_value ) . '>' . esc_html( $segment['name'] ) . '</option>'

since there should be no HTML contained in the output at all. The selected WordPress function is used to output the selected=”selected” HTML.

If the output could contain HTML, you can use the wp_kses, wp_kses_data or wp_kses_post functions again:

echo wp_kses_post( $data );

WordPress VIP wrote a nice piece with more examples on escaping you can check out!

JavaScript

The data that comes in from an AJAX call should never be trusted, even if it’s your own site.  One example was adding options for a MailChimp list or segment:

selectField.append( '<option value="' + value.id + '">' + value.name + '</option>' );

The id or name could contain malicious script/HTML.  From a WordPress VIP article, create the element in JavaScript and inject it instead:

var option = jQuery( '<option />' );
option.attr( 'value', value.id );
option.text( value.name );
selectField.append( option );

This way jQuery will sanitize the data for you. Do NOT do something like this to strip out the HTML:

myDiv.html( value ).text()

as this is still vulnerable to attack. It’ll fire events during the .html part.

Conclusion

Never trust anything you get from the user, the database, or the results of an AJAX call (even if it’s from your own site).

Want some practice? There’s a nice exercise of an intentionally vulnerable WordPress plugin (with answers on how to fix, no cheating before you try first), how many security issues can you find?