I was working on some content migration today and needed to find the first image in each post and set it as the featured image. There were over 2000 imported posts, so it was quicker to write a script than to do it manually. Here’s the basic code to find the source of the first image in a post. This code should be run within the WordPress loop:

$image_urls = array();
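// [^>]*? stays inside the first <img> tag, so a second image on the same line can't be picked up instead
$match_count = preg_match( '/<img\b[^>]*?\ssrc=["\']([^"\']*)["\']/i', get_the_content(), $image_urls );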
if (is_numeric($match_count) && $match_count > 0) {
  $featured_image_url = $image_urls[1];
  echo "Found image with URL: " . $featured_image_url . "<br />\n";
  echo "Here it is: <br />";
  printf("<img src=\"%s\"><br />\n", esc_url($featured_image_url));
}
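If you want to try the matching outside of WordPress first, here is a minimal sketch of the same idea as a plain PHP function. The name first_image_src() and the sample HTML are made up for illustration, and the pattern is a variant that stops at the tag's closing bracket:

```php
<?php
// Hypothetical helper: return the src of the first <img> in an HTML string,
// or null when the string contains no image.
function first_image_src( $html ) {
  // [^>]*? keeps the match inside a single tag; \s before src skips data-src
  $pattern = '/<img\b[^>]*?\ssrc=["\']([^"\']*)["\']/i';
  if ( preg_match( $pattern, $html, $m ) === 1 ) {
    return $m[1];
  }
  return null;
}

// Sample content with two images on one line (made-up URLs)
$html = '<p>Hello</p><img class="alignleft" src="http://example.com/a.jpg" alt=""><img src="http://example.com/b.jpg">';
echo first_image_src( $html ), "\n"; // prints http://example.com/a.jpg
```

Returning null for "no image" keeps the caller's check simple: anything that isn't a string means there is nothing to look up.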

Now, because featured images are references to media library items, we need to set an attachment ID, not an image URL. This was something I wasn’t sure was even possible. But a quick search came up with a WordPress support thread with a neat solution: apparently the URL of the image is stored in the guid field of the attachment’s row in the posts table. So you can look up the attachment ID by querying on that. Here’s a function from that support thread to do it:

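// Look up an attachment ID by matching the image URL against the guid column.
function get_attachment_id_from_src ($image_src) {
  global $wpdb;
  // prepare() escapes the URL so it can't break or inject into the query
  $query = $wpdb->prepare("SELECT ID FROM {$wpdb->posts} WHERE guid = %s", $image_src);
  $id = $wpdb->get_var($query); // null when no row matches
  return $id;
}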
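As an aside, WordPress 4.0 and later ships a core function, attachment_url_to_postid(), that does a similar lookup against the attached-file meta rather than guid. Here is a rough sketch of how the two could be combined. The wrapper name find_attachment_id() is hypothetical, and it assumes get_attachment_id_from_src() from this post is already defined:

```php
<?php
// Hypothetical wrapper: try the core lookup first, then the guid query.
// Assumes WordPress 4.0+ and get_attachment_id_from_src() from this post.
function find_attachment_id( $image_src ) {
  // attachment_url_to_postid() returns 0 when nothing matches
  $id = (int) attachment_url_to_postid( $image_src );
  if ( $id > 0 ) {
    return $id;
  }
  // Fall back to the guid lookup from the support thread
  $id = get_attachment_id_from_src( $image_src );
  return is_numeric( $id ) ? (int) $id : null;
}
```

As far as I know the core function matches the originally uploaded file, so resized image URLs would still need separate handling.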

This mostly works, but the first image might be a resized version of an uploaded image, so we have to cater for those too. When WordPress resizes an image automatically, it adds “-WIDTHxHEIGHT” to the end of the file name. For example, cute-photo-of-cat.jpg becomes cute-photo-of-cat-400x300.jpg. So I added a check for that, which runs if we don’t get an attachment ID from the original image URL. Here’s the final code:

$image_urls = array();
$image_id = null; // stays null when no image or attachment is found
$match_count = preg_match( '/<img\b[^>]*?\ssrc=["\']([^"\']*)["\']/i', get_the_content(), $image_urls );
if (is_numeric($match_count) && $match_count > 0) {
  $featured_image_url = $image_urls[1];
  echo "Found image with URL: " . $featured_image_url . "\n";
  echo "Here it is:";
  printf("<img src=\"%s\"><br />\n", esc_url($featured_image_url));
  $image_id = get_attachment_id_from_src( $featured_image_url );
  if (! is_numeric($image_id)) {
    // Try removing the image-size
    // Also accept .jpeg and upper-case extensions
    $match_count = preg_match( '/(.*)-\d+x\d+(\.(jpe?g|png|gif))/i', $featured_image_url, $matches );
    if (is_numeric($match_count) && $match_count > 0) {
      $base_image_url = $matches[1] . $matches[2];
      echo "Image not matched. Trying alternative image URL: " . $base_image_url . "\n";
      $image_id = get_attachment_id_from_src( $base_image_url );
    }
  }
}
if (is_numeric($image_id)) {
  echo "<b>Image matched to attachment with ID: " . $image_id . "</b>\n";
} else {
  echo "No image found.\n";
}
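To see the size-suffix step on its own, here is a small sketch that runs without WordPress. The function name strip_size_suffix() and the example URLs are made up for illustration; the pattern is the same one used in the code above:

```php
<?php
// Hypothetical helper: turn a resized-image URL into the original upload URL.
// Returns null when the URL has no "-WIDTHxHEIGHT" suffix before the extension.
function strip_size_suffix( $url ) {
  if ( preg_match( '/(.*)-\d+x\d+(\.(jpg|png|gif))/', $url, $m ) === 1 ) {
    // $m[1] is everything before the suffix, $m[2] is the extension
    return $m[1] . $m[2];
  }
  return null;
}

echo strip_size_suffix( 'http://example.com/uploads/cute-photo-of-cat-400x300.jpg' ), "\n";
// prints http://example.com/uploads/cute-photo-of-cat.jpg

var_dump( strip_size_suffix( 'http://example.com/uploads/cute-photo-of-cat.jpg' ) );
// prints NULL
```

One thing to watch: an original upload whose own name happens to end in something like -400x300 would be stripped too, which is part of why this is not bulletproof.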

One final thing: You’ll need to take that attachment ID and set it as post meta:

set_post_thumbnail( get_the_ID(), $image_id);
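If you’d rather not rely on a template loop, the whole batch could be driven by something like the sketch below. It is only an outline under a few assumptions: it runs inside WordPress, backfill_featured_images() is a hypothetical name, and $find_image_id stands in for whatever callable turns post content into an attachment ID (for example, the matching code above wrapped in a function):

```php
<?php
// Hypothetical driver: $find_image_id is any callable that takes post content
// and returns an attachment ID, or null when no image could be matched.
function backfill_featured_images( callable $find_image_id ) {
  $post_ids = get_posts( array(
    'post_type'      => 'post',
    'post_status'    => 'any',
    'posts_per_page' => -1,    // all posts; acceptable for a one-off script
    'fields'         => 'ids', // IDs only, to keep memory use down
  ) );
  $updated = 0;
  foreach ( $post_ids as $post_id ) {
    if ( has_post_thumbnail( $post_id ) ) {
      continue; // leave existing featured images alone
    }
    $content  = get_post_field( 'post_content', $post_id );
    $image_id = $find_image_id( $content );
    if ( is_numeric( $image_id ) && set_post_thumbnail( $post_id, (int) $image_id ) ) {
      $updated++;
    }
  }
  return $updated; // number of posts that received a featured image
}
```

Skipping posts that already have a thumbnail makes the script safe to re-run if it stops partway through.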

It’s a bit quick-and-dirty, and probably not bulletproof, but worked pretty well in this instance. Perhaps it’s of use to someone else?