Using clusters to better understand and distinguish NFL wide receivers

For the last five years (four of which have been at Dynasty League Football), I have ‘scouted’ NFL Draft prospects at the offensive skill positions, primarily running back and wide receiver. I use the term generously, but almost anyone who watches as much receiver play as I have in that time takes note of the several molds and uses of players at the position. For a long time, I’ve wanted to more formally define these different “types” of receivers. At a glance, it’s an interesting question in its own right (to me at least), and, as I hope to show in future articles, it can tell us more about receiving more broadly. I finally saw that project through this offseason.

Why I used these numbers

My primary objective in this exercise was to identify what types of receivers there are — not how good or bad they may be at something. Thus, my focus was on two things: Receiving usage and physical profile. That reasoning is relatively straightforward. How coaches deploy a receiver tells us a lot about the player in question already: Big-bodied, jump ball receivers are primarily targeted at the sidelines, 10-20 yards downfield; small, quicker-than-fast slot receivers are used within the first 10 yards beyond the line of scrimmage and much more frequently in the middle of the field.

Of course, what makes a receiver fall into one category or another is their physical and technical attributes: long speed, ball skills, running after the catch, etc. However, two issues arise. First, sometimes there’s a mismatch between a player’s physical profile and what they actually do on the field; the easiest example that comes to mind is DeAndre Hopkins being one of the league’s most dominant receivers at the catch point, despite measuring 6’1″ (a phenomenon that arises later on in this article!). Second, it’s very difficult to measure those traits; NFL Combine measurements are the closest we get to quantifying any of them, but even then, a significant proportion of NFL players lack Combine data, killing the sample size in the process.

Thus, we turn to receiving usage to tell us what a receiver’s greatest skills are, and what type of receiver they are. However, grouping receivers solely by similarity in targets doesn’t completely hit the mark. One example: That approach will look at John Brown and Kenny Golladay as nearly the same player. Tape junkies and fantasy football guys can tell that this is way off; Brown is a small, explosive deep threat, and Golladay is a large adult man who reliably beats corners at the catch point (though he is one of the league’s more dynamic big men as well).

I have good news. There is one set of traits that is universally measured: Size. Every NFL player has a listed height and weight, and as it turns out, that’s the missing piece to the puzzle. With data concerning receivers’ usage and size, we can tell Wes Welker apart from Tyreek Hill (similar sizes) and Mike Williams apart from Torrey Smith (similar usage).

The math-heavy stuff

As for the actual process of categorizing these players, I used k-means clustering to partition each receiver with at least 100 targets logged from 2009-2019 into separate clusters. Still, two questions remain: What exact data was used, and how many clusters to separate players into.

For usage inputs, I first used data gained from nflscrapR to tell where a given target came from. I categorized the depth of a target from its air yards either as “screen” (<=0 yards), “short” (1-7 yards), “intermediate” (8-19 yards), or “deep” (>=20 yards). I also simplified the target location from “left”/“middle”/“right” to “outside”/“inside”. Combine the depth and location categories and you get eight “octants” of the field where a target may come from. For example, here’s a simplified “heat map” of Michael Thomas’s 2019 targets:

Imagining Thomas lines up on the left boundary for every snap helps me make sense of this table.

From there, I fed the proportion of a given player’s targets that went to each octant into the k-means algorithm. In turn, I was able to summarize players’ target data with relatively few variables and little information lost.
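The bucketing described above can be sketched in a few lines of Python. This is only an illustration, not the code behind the article (which worked from nflscrapR data): the function names are hypothetical, each target is assumed to arrive as an (air yards, location) pair, and the sample targets are made up.

```python
from collections import Counter

DEPTHS = ("screen", "short", "intermediate", "deep")
SIDES = ("outside", "inside")

def depth_bucket(air_yards):
    """Map air yards to the article's four depth categories."""
    if air_yards <= 0:
        return "screen"
    if air_yards <= 7:
        return "short"
    if air_yards <= 19:
        return "intermediate"
    return "deep"

def side_bucket(location):
    """Collapse left/middle/right into outside/inside."""
    return "inside" if location == "middle" else "outside"

def octant_proportions(targets):
    """Share of one player's targets in each of the 8 (depth, side) octants.

    targets: iterable of (air_yards, location) pairs (assumed input shape).
    """
    targets = list(targets)
    if not targets:
        raise ValueError("need at least one target")
    counts = Counter((depth_bucket(ay), side_bucket(loc)) for ay, loc in targets)
    return {(d, s): counts[(d, s)] / len(targets) for d in DEPTHS for s in SIDES}

# Made-up targets for illustration only (not real play-by-play data).
sample = [(-2, "left"), (5, "middle"), (6, "right"), (12, "left"), (25, "right")]
props = octant_proportions(sample)
```

The eight values in `props` always sum to 1, so each player is summarized by the same eight proportions no matter how many targets he saw.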

Selecting which size variables to use (height, weight, and/or BMI) took a little more time. To make that decision, I ran k-means at several values of k for a few different combinations of those variables, then identified the best k for each combination using the elbow and silhouette methods, with some subjectivity to break near-ties (which also answers the second question posed above). I then (subjectively) compared these best-performing sets of clusters and chose the one I thought was most sensible (had the most logical/least illogical groupings) and most informative (low-k clusterings were very sensible, but only really distinguished between slots and split ends).
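For readers who have not seen the elbow method, here is a minimal sketch of the idea in plain Python. It makes several assumptions: the k-means is a bare-bones Lloyd's algorithm with a deterministic farthest-point initialization (a simplification chosen so the example is reproducible, not necessarily what was used for the article), the points are made-up 2D data rather than receiver features, and the silhouette method is not shown.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_init(points, k):
    """Start from the first point, then repeatedly add the point farthest
    from all centroids chosen so far (deterministic, illustration only)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    return centroids

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points]

def kmeans(points, k, iters=100):
    """Lloyd's algorithm. Returns (centroids, labels, inertia)."""
    centroids = farthest_point_init(points, k)
    for _ in range(iters):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # An empty cluster keeps its previous centroid.
            updated.append(tuple(sum(d) / len(members) for d in zip(*members))
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    labels = assign(points, centroids)
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, inertia

# Made-up data: three tight, well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 0), (10, 1), (11, 0), (11, 1),
          (0, 10), (0, 11), (1, 10), (1, 11)]

# Within-cluster sum of squares (inertia) for each candidate k.
inertias = {k: kmeans(points, k)[2] for k in range(1, 6)}
```

On this toy data the inertia collapses when k goes from 2 to 3 and barely moves afterwards, which is the "elbow". Real receiver data is far less clean, which is why near-ties needed a subjective tiebreak.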

The decision came down to whether to use height and BMI or just height. Both options gave some slightly strange results, but ultimately, I chose to stick solely with height. In both cases, seven appeared to be the optimal k value, with five or six very sensible clusters and one or two quirky ones.

With height and BMI, there was a tiny group, best described as “thin split ends.” To me, the most notable members of that group were John Brown, Sidney Rice, and Michael Gallup. If we are to get so specific as to have seven different categories, I don’t think that Brown and Gallup should be in the same group, let alone be the defining members of it.

Meanwhile, with just height, everything looked about right, except short slot-type receivers were fragmented into two groups, basically only separated by how frequently they were targeted on screens. Strangely enough, eliminating a cluster and using a k value of six condensed two other clusters, meaning the distinction between high-screen usage and low-screen usage between small slots was still there. Thus, simply moving back to a k value of six wouldn’t help.

Ultimately though, the surprising distinction between short slots makes more sense than the “thin split ends,” and it’s a lot easier to work around — I can simply merge the two slot clusters myself, which I’ll toy with in further analyses. I also had to choose a factor to scale size down by (standardizing all the variables would give really strange results, as it’d make the distinctions between pop-pass and deep-shot frequency the same), which was done subjectively, though that wasn’t very difficult — there was a very thin range of scales that were dominated by neither usage nor height.
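To see why the size scale matters, consider how much of the squared distance between two receivers comes from height alone. The sketch below is illustrative only: the two usage profiles and heights are made up, the function names are hypothetical, and the 0.05 factor is an arbitrary stand-in, since the actual factor used for the article is not stated.

```python
def build_features(octant_props, height_in, height_scale):
    """Append a down-weighted height (in inches) to the 8 octant proportions."""
    return tuple(octant_props) + (height_in * height_scale,)

def height_share(a, b):
    """Fraction of the squared distance between two feature vectors
    that comes from the last (height) component."""
    parts = [(x - y) ** 2 for x, y in zip(a, b)]
    return parts[-1] / sum(parts)

# Made-up usage profiles (each sums to 1) and heights in inches.
usage_a, height_a = (0.10, 0.05, 0.30, 0.20, 0.15, 0.10, 0.05, 0.05), 70
usage_b, height_b = (0.02, 0.03, 0.15, 0.10, 0.30, 0.15, 0.20, 0.05), 76

# Raw inches: height swamps everything else.
raw = height_share(build_features(usage_a, height_a, 1.0),
                   build_features(usage_b, height_b, 1.0))

# Hypothetical scale factor: usage and height contribute comparably.
scaled = height_share(build_features(usage_a, height_a, 0.05),
                      build_features(usage_b, height_b, 0.05))
```

With raw inches, nearly all of the distance between these two players is height, so k-means would effectively sort by height alone. Shrinking height by a suitable factor lets usage and size both shape the clusters.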

At the end of the process, I’ve come away with seven different clusters for NFL receivers, using players’ target-zone proportions and heights.

Results (“the fun part”)

Now that I’ve covered how these results came about, let’s dig into the groupings that the clustering algorithm came away with. Here’s a table describing each cluster, including the name I’ve attributed to it, its size, common traits of its receivers (in terms of usage and size), prototypical and other players of note within the cluster:


Let’s take a glimpse at each of these groupings.

Cluster 1

Adot = Average depth of target, Sddot = Standard deviation (variability) of depth of target — both logged, Outside pct = Percentage of targets to come from the left or right side of the field

Members of our largest class of wideouts encompass the broad category of “tall players often targeted on short passes.” But within the group are two distinguishable sorts of receiver: the big slot, and the shallow generalist. Guys like Keenan Allen and Marques Colston occupy the former category, with a much higher volume of targets coming from the middle of the field, while Amari Cooper, Larry Fitzgerald, and others see a wider (though not deeper) spread of targets. Davante Adams and Juju Smith-Schuster are interesting members; indeed, despite low average depths of target, both see much more volatility in their target depths, and thus are near the fringes of the group.

Cluster 2

Generally speaking, these guys combine some of the best speed of anyone at the position with enough size to consistently earn throws far down the field and outside the hash marks. I say generally because a few guys — notably DeAndre Hopkins and Michael Gallup — are slower exceptions who mainly get those looks as a result of their physicality and catch-point prowess instead. We see most of those players fall into cluster 4 or 5, but these two are significantly shorter than usual (both 6’1″). For years, this anomaly has made Hopkins’ receiving excellence even more impressive; it also suggests that Gallup’s performance should be more appreciated. More archetypical players include Ted Ginn, Kenny Stills, and John Brown, some of the league’s most explosive (and perhaps one-dimensional) targets.

Cluster 3

The smallest cluster, and maybe the easiest for us to naturally identify. These are the little slot receivers who see way more targets between the line of scrimmage and the secondary: The group saw about 15% more of its targets go between 0-7 yards than those of all other receivers. We will see cluster 6 with a similar level of short-field usage, but that group saw about 18% of its targets come from screens (easily the most of any cluster), while cluster 3 receivers typically saw just 7-11% come from screens. Future analyses could benefit from grouping these clusters together, but I don’t see a need to do so now.

Another point of interest: Michael Thomas is, surprisingly, categorized here among the likes of Cole Beasley and Willie Snead. In some ways — taller and targeted outside more frequently than almost every other receiver — he is at the very edge of the group, as one would expect. However, his targets were even shallower, with lower volatility, than this group on average. Four seasons into the NFL, the combination of Thomas’s size and usage has been exceptional, and probably not in a good way. While I would disagree with the notion that he’s the best receiver in the league, he is one of its best, and his skills could be much better utilized on more challenging, higher reward plays. Football scheme and route concepts aren’t nearly simple enough to just say “throw it deeper to him,” but the Saints should be able to find a decent approximation of his shallow abilities for cheap, and then plug him into a further-downfield role.

Cluster 4

Don’t take the inclusion of players like Terrell Owens and Randy Moss as gospel. Their categorizations here reflect only their targets from 2009 onward.

This cluster is a mix of bigger, athletic dudes and cluster 5-like (but more varied) sideline targets. Julio Jones, Kenny Golladay, and Sammy Watkins (despite injuries and inconsistencies) blend size, speed, and skill as well as just about anyone in the league, and earn targets all over the field for that reason. Meanwhile, AJ Green and Alshon Jeffery aren’t quite as explosive, but they have more versatility to their games than someone like Mike Williams or Dwayne Bowe. Each type of cluster 4 receiver is usually targeted outside at a higher rate and sees about as many deep balls as any receiver (both cluster 2 and cluster 4 have about 22.5% of targets come on deep shots, on average). With names like Jones, Green, and Calvin Johnson, it has routinely held some of the league’s best receivers.

Cluster 5

Another very easily-identified grouping. These guys separate to varying degrees of success, but they’ve got size, and they know how to use it. They are kings of the intermediate target: Each was targeted between 8 and 20 yards as much as, or more than, the average of all receivers. To those wondering why: This depth is where most back-shoulder fade targets tend to occur (hence why I refer to them as ‘sideline’ receivers even though several see plenty of middle-field targets). Every receiver runs a wide variety of routes, but for these players, a huge proportion of them are fades. For that reason, if we take the log of receivers’ depths of target (reasoning that the difference between a 17 and 19-yard target is negligible), these receivers have the lowest standard deviation of any cluster — even cluster 3.
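The effect of logging target depth can be checked with a quick sketch. Everything here is illustrative: the depth lists are made up, the function name is hypothetical, and dropping non-positive depths before taking the log is an assumption, since the article does not say how screens behind the line were handled.

```python
import math
import statistics

def log_depth_sd(depths):
    """Standard deviation of log(air yards).

    Non-positive depths are dropped before logging (an assumption).
    """
    logged = [math.log(d) for d in depths if d > 0]
    return statistics.stdev(logged)

# On the log scale, 17 vs 19 yards is a much smaller gap than 2 vs 4 yards.
deep_gap = math.log(19) - math.log(17)
short_gap = math.log(4) - math.log(2)

# Made-up target depths: a fade-heavy profile vs. an all-over-the-field one.
fade_heavy = [12, 14, 15, 13, 16, 14]
varied = [2, 5, 30, 8, 40, 3]
fade_sd = log_depth_sd(fade_heavy)
varied_sd = log_depth_sd(varied)
```

A receiver whose targets cluster around one intermediate depth ends up with a small logged standard deviation, while one who mixes screens and deep shots ends up with a large one.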

While Mike Evans and Vincent Jackson easily come to mind, Chris Godwin’s inclusion may come as a surprise. That surprise is warranted: He’s one of the group’s smallest receivers, with a normal spread of target depth, and even sees a lot of targets down the middle. Really, his usage has defied any single label thus far, and he falls into this categorization primarily because he’s still seen a lot of throws to the intermediate part of the field. Perhaps time will help us categorize him better.

Cluster 6

The second half of our series of short slot receivers. They just generally seem to be strong in the screen game as well, leading to the previously mentioned disparity between cluster 3 and 6’s screen target frequencies. And that starts to make sense when we look at the prototypical players in each category: Jarvis Landry and Golden Tate are a lot more dynamic than Cole Beasley or Willie Snead. Cooper Kupp and Tyler Boyd are the group’s two tallest notable receivers, both of whom likely earn this cluster’s designation (instead of cluster 1’s) primarily due to high screen (both over 15%) and middle-field target usage.

Cluster 7

Small blazers and other little guys coaches like to get the ball to at every level of the field. They’re used in the screen game more than any cluster except cluster 6 and on deep shots more than anyone outside of clusters 2 and 4. As a result, these players have the most volatile depths of target of any cluster. Headlined by Tyreek Hill, Odell Beckham, and Antonio Brown, these receivers are more dynamic than cluster 6 and shiftier (and, perhaps, smaller) than cluster 2. Most notable members of this group are younger, and with intriguing blazers like Henry Ruggs, Jaylen Waddle, and Rondale Moore not far from NFL action, we should see several more arrive soon.

What do we do with this?

This exercise opens the door to a lot of really intriguing analyses: What kind of receivers are best for fantasy football? What kind of receivers are most replaceable? Most irreplaceable? What kind of receivers do NFL teams value most? When are different kinds of receivers drafted? How has the NFL’s proportional makeup of different types of receivers changed over time? We can use these clusters to help answer each of these questions and more.