Using Jetpack Popular Posts in WP_Query

Jetpack, Automattic’s fremium, all-in-one utility belt for self-hosted WordPress sites, has a number of extremely useful tools for sites of all sizes. Whether you’re using Protect (formerly BruteProtect) to limit malicious login attempts, Akismet to block spam, or Related Posts to cross-link content, Jetpack has at least one or two modules for pretty much anyone.

Of course, Jetpack is meant to be a “one size fits all” approach to WordPress; its engineers have designed Jetpack to be as simple as possible for people to start using, but often professional sites need professionally-customized features. Fortunately, Jetpack’s source is pretty-well documented, and we’re able to build on top of existing functionality.

Today, we’re going to use Jetpack’s stats_get_csv() function — normally used to populate the “Top Posts” widget — to create a WP_Query object that we can use like we would any other.

Why would I want my popular posts in a WP_Query object?

There are plenty of cases where you may want to highlight the content that’s currently popular on your site. A perfect example is the Growella homepage: as of this week, all of the homepage content is driven by what’s most popular/trending:

The Growella homepage, including the new "Now Trending" section, driven by popular posts.

While Jetpack gives you a “Top Posts” widget, the amount of work to twist that widget into the Growella homepage would be enormous. Instead, we can leverage the same functionality the widget is using to query WordPress.com (via Jetpack) about our most popular content.

Getting Jetpack popular posts from WordPress.com

Inside modules/stats.php, Jetpack defines the stats_get_csv() function; given a few arguments, the function serves as our API to collect information about our site from WordPress.com.

If, for example, we wanted to get the 5 most popular posts from the last week (7 days), we could do so with a single function call:

$popular_posts = stats_get_csv( 'postviews', array(
  'days'  => 7,
  'limit' => 5,
) );

It’s important to note, however, that the array we get back from the function is not a WP_Query object; instead, we get an array of arrays, each representing a post:

Array (
  [post_id] => 123
  [post_title] => My Post Title
  [post_permalink] => https://example.com/my-post
  [views] => 1500000
)

If you were to try creating a loop using $popular_posts, you wouldn’t get very far. Instead, we’ll use the response from WordPress.com to build our own WP_Query object of popular posts:

The easy way: post__in

If you only need a limited number of popular posts, you might consider collecting the post IDs from WordPress.com, then using those to populate a post__in argument for WP_Query:

$posts = wp_list_pluck( $popular_posts, 'post_id' );
$query = new WP_Query( array(
  'post__in' => $posts,
  'orderby'  => 'post__in',
  'order'    => 'ASC',
) );

That function uses the little-known (but awesome) wp_list_pluck() function to collect the “post_id” value for each nested array in $popular_posts, which are then used to populate the post__in argument for WP_Query. Notice that we’re also defining the orderby and order arguments, which ensure that the most popular post is displayed first in the results.

The less-easy way: custom ordering of query results

In previous versions of the Growella homepage, the “easy way” worked perfectly: we were able to build a WP_Query object of our most popular posts for the “More popular articles” section, and if there were no popular posts for whatever reason we could simply hide the section.

While working on our staging site, however, it occurred to me that we would face a problem: new content is regularly being added to Growella.com, but the staging site is falling more and more out-of-sync with each passing day. It’s not unreasonable to assume that there will come a day when none of the most popular posts (according to Jetpack) existing in our staging environment, which means our post__in query argument would result in an empty result set.

This forced us to rethink how we approached popular posts: we want the most popular posts to be pushed to the front of “Now Trending”, but if we don’t have valid post IDs we still want something to appear. We were able to accomplish this by not using the post__in argument and instead writing a custom posts_orderby callback:

/**
 * Get the most popular posts from WordPress.com.
 *
 * @param array $args WP_Query arguments.
 * @return WP_Query A WP_Query object containing with the
 *                  most popular posts pushed to the front.
 */
function growella_get_popular_posts( $args = array() ) {
  // ...setup, checking for cached values, etc.

  // Alter the "ORDER BY" statement for this query only.
  add_filter( 'posts_orderby', 'growella_order_popular_posts' );
  $query = new WP_Query( $args );
  remove_filter( 'posts_orderby', 'growella_order_popular_posts' );

  // ...cache the value for the future.

  return $query;
}

/**
 * Instead of just using post__in, which could return
 * fewer-than-desired results, order the the results by
 * putting the most popular posts first.
 *
 * @global $wpdb
 *
 * @param string $orderby The current ORDER BY statement.
 * @return string The altered $orderby string.
 */
function growella_order_popular_posts( $orderby ) {
  global $wpdb;

  $popular  = stats_get_csv( 'postviews', array(
    'days' => 3,
  ) );
  $post_ids = wp_list_pluck( $popular, 'post_id' );
  $post_ids = array_map( 'absint', $post_ids );

  /*
   * MySQL wants to push the FIELD() results to the end of
   * the list if they're sorted ASC, so we're flipping the
   * array, then using descending order.
   */
  return sprintf(
    "FIELD({$wpdb->posts}.ID,%s) DESC, $orderby",
    implode( ',', array_reverse( $post_ids ) )
  );
}

In practice, this code will force the popular posts that also meet your query parameters to appear at the front of your results.

Understanding the SQL

The FIELD() function is key here:

FIELD() returns the index position of the searching string from a list of strings. If the search string is not found, it returns a 0 (zero). If the search string is NULL, the return value is 0 because NULL fails equality comparison with any value. w3resource.com

For a frame of reference, the SQL statement for a standard WP_Query instance looks something like this:

SELECT wp_posts.ID FROM wp_posts  WHERE 1=1
AND wp_posts.post_type = 'post' 
AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'private')
ORDER BY wp_posts.post_date DESC LIMIT 0, 10

Now, let’s say your most popular posts (in order) have IDs 4, 8, 15, 16, 23, and 42. By applying our growella_order_popular_posts() function, our query now becomes:

SELECT wp_posts.ID FROM wp_posts  WHERE 1=1
AND wp_posts.post_type = 'post' 
AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'private')
ORDER BY FIELD(wp_posts.ID,42,23,16,15,8,4) DESC, wp_posts.post_date DESC
LIMIT 0, 10

What’s great about this approach is that if, for whatever reason, one or more of the popular posts ID isn’t in the result set then the database will simply ignore it and instead include what would have been the 11th post at the end of the result set.

Leave a Reply