Before diving into any of the analysis that builds on my previous work clustering NFL receivers into “receiver types,” I realized I first had to figure out how many receivers could be reliably categorized. In addition to player size, my clustering model uses player target depth and location (per nflfastR) to place receivers into different categories. This meant I needed to find how many targets it takes to feel reasonably confident in the cluster calculated from those targets. The procedure was simple: I’d take a running summary of every receiver, target by target, place him into a “running cluster” based on that information, and then see how quickly those running clusters converged on receivers’ final clusters. I even had a graph in mind, and indeed, I made it:
It seemed like a boring, but necessary, intermediate step. Initially, I was disappointed when viewing the results; the 100-target mark wasn’t as meaningful as I thought it’d be, and it took 500 targets to get to about an 80% confidence level. However, these clusters are purely a construct, and they never should have been expected to be that stable. In a football sense, receivers play the split end, slot, and flanker roles (not “cluster five,” or whatever), and even then, they switch between those roles several times in any given game, to say nothing of how their usage can change with new scenery, new personnel, or a new scheme over longer stretches. In hindsight, it makes a lot of sense that these groupings don’t stabilize particularly quickly, and looking into what factors drive that instability makes for an interesting analysis all on its own. Today, that’s what I’ll be doing.
Let’s get the necessary part out of the way first. To reiterate, we’re looking into how many targets it takes before a receiver’s usage can reliably categorize him using the K-Means clustering algorithm. (Since my original project, I’ve considered applying other algorithms to answer the same basic question, but I’m satisfied enough with the clusters I have currently to move forward. That being said, those algorithms could be used to better answer some specific, related questions, such as using DBSCAN to identify some of the NFL’s more unusual receivers.) From there, I would filter out receivers with fewer career targets than whatever threshold the analysis supported.
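To make the “running cluster” idea concrete, here is a minimal Python sketch. It is not the code behind this article, and it rests on several assumptions: that the final K-Means centroids are held fixed, that each target is reduced to a numeric depth and a numeric lateral value (the 0-to-1 encoding below is hypothetical), and that features are left unscaled for readability, where a real pipeline would likely standardize them first. The function name and the two toy centroids are made up for illustration.

```python
from math import dist

def running_clusters(targets, size, centroids):
    """Return the running cluster label after each target.

    targets   -- chronological list of (depth, lateral) tuples (assumed encoding)
    size      -- a fixed size feature for the player, e.g. height in inches
    centroids -- dict: cluster label -> (size, mean_depth, mean_lateral)
    """
    labels = []
    depth_sum = lateral_sum = 0.0
    for n, (depth, lateral) in enumerate(targets, start=1):
        depth_sum += depth
        lateral_sum += lateral
        # Running summary of the player's usage through n targets.
        point = (size, depth_sum / n, lateral_sum / n)
        # Nearest-centroid assignment, the same rule K-Means uses to label points.
        labels.append(min(centroids, key=lambda c: dist(point, centroids[c])))
    return labels

# Toy example with two hypothetical clusters.
centroids = {"deep": (72.0, 15.0, 1.0), "short": (72.0, 5.0, 0.0)}
targets = [(4, 0), (6, 0), (30, 1), (30, 1)]
labels = running_clusters(targets, size=72.0, centroids=centroids)
print(labels)  # ['short', 'short', 'deep', 'deep']
```

In the toy example, two short targets over the middle put the player in the “short” cluster, and two deep sideline targets are enough to pull his running averages over to “deep.” That sensitivity at low target counts is exactly why a minimum-target threshold is needed.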
I felt a simple investigation of “What proportion of guys are properly categorized at any number of targets?” sufficed to find that threshold. Again, here is a receiver’s running target total plotted against the proportion of receivers whose running cluster through x targets matches their final cluster:
My main takeaways from the plot are reflected in the subhead: There’s a roughly logarithmic association between career targets and “correct” classification proportion. (The use of “correct classification” here is meant as an easy shorthand for the aforementioned proportions, not to be confused with the correct classification rate of a supervised learning model.) We see that the first 100 targets have given us a reliable categorization for 60-65% of the roughly 270 receivers in the data set. Big-time rookie producers can meet that threshold in a single season — Justin Jefferson hit 125 targets in 2020.
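The proportions behind a curve like this one could be computed along the following lines. This is a sketch rather than the code used for the plot, and it makes one notable assumption: at each target count n, the denominator is only the receivers who have reached n career targets. The function name and the three toy players are hypothetical.

```python
def agreement_curve(running, checkpoints):
    """Share of receivers whose running cluster matches their final cluster.

    running     -- dict: player -> list of running labels, one per target
    checkpoints -- iterable of target counts n to evaluate
    Only players with at least n targets count at checkpoint n (an assumption).
    """
    curve = {}
    for n in checkpoints:
        eligible = [labels for labels in running.values() if len(labels) >= n]
        if not eligible:
            continue  # nobody has reached n targets
        # labels[n - 1] is the label after n targets; labels[-1] is the final one.
        matches = sum(labels[n - 1] == labels[-1] for labels in eligible)
        curve[n] = matches / len(eligible)
    return curve

# Toy data: three players with 4, 2, and 3 career targets.
running = {
    "A": ["x", "x", "y", "y"],
    "B": ["y", "y"],
    "C": ["x", "y", "x"],
}
curve = agreement_curve(running, checkpoints=[1, 2, 3, 4])
print(curve)
```

Under this choice of denominator, the pool of eligible receivers shrinks as n grows, so the right-hand side of any such curve rests on fewer and fewer players.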
Additional targets start to see diminishing marginal returns, in terms of increased correct classification, from there: It takes 100-125 more targets to knock that proportion up to 70%. Most half or full-time receivers reach or approach that threshold by the end of year four. From there, we see an interesting dip in the proportion of correct classification; this could simply be noise in the data, or perhaps it could be the effect of several receivers finding themselves in new roles on new teams at the end of their four or five-year rookie contracts. Finally, after 500 targets, NFL veterans are correctly classified about 80% of the time.
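As a rough arithmetic check on the “roughly logarithmic” description, one can draw a curve of the form a + b·ln(n) through two of the figures quoted above and see where it lands in between. The sketch below uses the midpoint of the 60-65% range at 100 targets and the approximate 80% at 500 targets. Those two inputs are read off the prose, not the underlying data, so the result is only illustrative.

```python
from math import log

def log_curve_through(n1, p1, n2, p2):
    """Return f(n) = a + b * ln(n) passing through (n1, p1) and (n2, p2)."""
    b = (p2 - p1) / (log(n2) - log(n1))
    a = p1 - b * log(n1)
    return lambda n: a + b * log(n)

# Assumed anchor points: ~62.5% at 100 targets, ~80% at 500 targets.
f = log_curve_through(100, 0.625, 500, 0.80)
for n in (200, 225):
    print(n, round(f(n), 3))
```

The interpolated values at 200 and 225 targets come out at roughly 70% and 71%, which lines up with the 70% mark described above. It says nothing about the later dip, which a smooth log curve cannot reproduce.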
Due to the aforementioned artificiality of my seven receiving clusters, there is a fair amount of overlap between them. The table below quickly outlines each of those clusters:
Most notably to me, there’s significant overlap between clusters 3 and 6, but still, there are a lot of connections between each of the classes, which will be dug into later on. Because of these overlaps, some more notable than others, some clusters may be harder to correctly classify. I re-created the prior graph while also splitting receivers by their final categorization to see if the data bore that out:
In this case, we see the lines devolve into step functions much more quickly than before we had segmented the data, further indicating that any takeaway should take small sample sizes into strong consideration. With that said, we observe some considerable discrepancies in stability between the clusters. The explosive smaller receivers of clusters 2 and 7 fall in line pretty quickly, as do the bigger, multipurpose athletes of cluster 4. Conversely, big slots, screen-heavy small slots, and back-shoulder specialists (clusters 1, 6, and 5, respectively) were harder to identify. The reliability of cluster 4 surprised me, as those types of do-it-all players (e.g., Sammy Watkins, Julio Jones) could very easily be placed into new roles in new situations, as they’d probably be able to fulfill them. And on the other hand, the uncertainty of cluster 5 wasn’t expected, as I traditionally view those players as more one-dimensional, clearly able to win at only a few spots on the field.
My next question was, for each of these clusters, where were receivers ending up compared to their running cluster at a given point in time? To answer that query, I first looked at how many receivers wound up staying in their running cluster, and where the remaining players in that running cluster ended up. The corresponding graphic:
There’s a lot of information to take in here, but I’ll stick to the main takeaways. First, we see that at least 70% of receivers stick to their running cluster at 225 targets — except for cluster 3. That grouping loses almost half of its members to clusters 1 and 6. The overlap between 3 and 6 (again, the two clusters for short slot receivers) is obvious. A bit more unexpected is the players categorized as short slots re-grouping into the big slot category. However, a more subtle distinction between these two groups is that cluster 1 tends to see more sideline targets, compared to high activity between the hashes for cluster 3. For mid-size players, it makes sense that they could switch over between those groupings. Also, that 22% number comes from exactly four players (Patrick Crayton, Jerricho Cotchery, Jason Avant, and Marques Colston) — the third cluster is so small that it’s particularly vulnerable to small-sample noise.
Receivers who look to belong to either cluster 4 or 5 behave as you’d probably expect, only moving between those two groups and the big slot group of cluster 1. Otherwise, we also observe a few players transition from the screen-heavy small slot group (cluster 6) to cluster 1. That trio consists of TJ Houshmandzadeh, Anquan Boldin, and JuJu Smith-Schuster. JuJu’s inclusion in cluster 6 can be chalked up to his heavy screen usage, while Housh and Boldin’s usage seemed to shift more to the outside as time wore on.
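A breakdown like the one discussed above amounts to a cross-tabulation of running cluster against final cluster at a chosen target count, normalized within each running cluster. Here is a minimal sketch under the assumption that each player’s running labels are stored as a list, one entry per target. The function name and toy players are hypothetical, and the example uses a checkpoint of 1 target only to keep the data small.

```python
from collections import Counter, defaultdict

def transition_shares(running, n):
    """For each running cluster at target n, the share of its players
    that end up in each final cluster.

    running -- dict: player -> list of running labels, one per target
    """
    counts = defaultdict(Counter)
    for labels in running.values():
        if len(labels) >= n:  # skip players who never reached n targets
            counts[labels[n - 1]][labels[-1]] += 1
    return {
        start: {final: k / sum(finals.values()) for final, k in finals.items()}
        for start, finals in counts.items()
    }

# Toy data: four players start in cluster 3, one starts in cluster 1.
running = {
    "A": [3, 3, 6],
    "B": [3, 3, 3],
    "C": [3, 1, 1],
    "D": [3, 3, 3],
    "E": [1, 1, 1],
}
shares = transition_shares(running, n=1)
print(shares)
```

Swapping the roles of the two labels (grouping by final cluster and counting running clusters) gives the reverse view: where each final cluster’s members came from.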
Another trend you may have noticed is that there’s some considerable blue in just about every column. Clearly, cluster 1 receivers can be categorized as anything earlier on. I flipped the above graph on its head to get a clearer picture of that phenomenon:
Ignoring the small-sample magic of cluster 3, we see cluster 1, 5, and 6 receivers were harder to identify early into their careers. Cluster 1 actually had at least one player from each running cluster make up its final membership. (If you’re curious like me, the one player to go from Tyreek Hill’s cluster 7 to Keenan Allen’s cluster 1 was Paul Richardson.) Cluster 5 gets a chunk of its receivers from cluster 4, and vice-versa.
The three receivers to move from the explosive short guy cluster 7 to the slippery short guy cluster 6 — Santana Moss, Eddie Royal, and Golden Tate — tell a satisfying story: explosive short and long threats who may have lost a step, but still have some juice, re-inventing their games and becoming slippery short options. The one player to make the opposite move also makes sense: Tyreek Hill was utilized more as a gadget piece in his first year or two in Kansas City before becoming the nightmare at every level of the field that he is today.
Still, there are so many slivers of different colors across that graph, each case indicating unique players who strangely moved from one cluster to a seemingly-unrelated final cluster. Sometimes, there are interesting stories to tell, and sometimes they simply buck the trends of the league and thus don’t conform well to such a clustering analysis. Let’s dig into those guys.
If you looked closely at that cluster table earlier, you might have noticed two conspicuous names in cluster 2. Next to the likes of Kenny Stills, John Brown, and Ted Ginn: DeAndre Hopkins and Michael Gallup. Hopkins and Gallup? These guys aren’t burners by any means, what’s the deal here? I picked up on that back in my first article:
Generally speaking, [cluster 2 receivers] combine some of the best speed of anyone at the position with enough size to consistently earn throws far down the field and outside the hash marks. I say generally because a few guys — notably DeAndre Hopkins and Michael Gallup — are slower exceptions who mainly get those looks as a result of their physicality and catch-point prowess instead. We see most of those players fall into cluster 4 or 5, but these two are significantly shorter than usual (both 6’1″). For years, this anomaly has made Hopkins’ receiving excellence even more impressive; it also suggests that Gallup’s performance should be more appreciated.
In other words, by skill set and usage, Hopkins and Gallup should be categorized alongside Mike Evans and Vincent Jackson: all of these players earn(ed) tons of targets downfield on the sideline, and it usually wasn’t because they were dusting guys. However, Nuk and Gallup are just so much shorter that the clustering algorithm essentially looks at them and thinks, surely, these must be two more not-tall fast guys. This outcome is a testament to Hopkins and Gallup’s remarkable ball skills; Hopkins has received plenty of recognition for this, but I think Gallup deserves a lot of additional respect for this positioning as well. Neither of these players fits too neatly into any one of these categories by traditional measures, and to me, their ability to excel nonetheless is pretty cool!
While we’re on that specific cluster: Calvin Ridley is fast, but he’s not really a go-route specialist. Still, he earns usage that resembles that of Stills & co. because he is so effective as a route runner (with speed, quickness, agility, and technique) that he can get open at the sideline, 10-plus yards down the field, at will. Ridley’s status in NFL discussions is higher than Gallup’s, but like the Cowboy, Ridley’s misleading membership in cluster 2 serves as strong evidence that he’s a truly special player; not only is he good, he’s really pretty great.
Next, a simple one: Randy Moss transitioned from the jump ball-happy cluster 5 to the do-it-all cluster 4. In case you weren’t aware, Moss had a pretty good and pretty productive career, spanning over a decade. He became an icon for Moss-ing defensive backs again and again as a downfield, jump-ball receiver, and he was dynamic enough to separate against defenders at any area of the field in other cases. He could either be Julio Jones or Mike Evans, depending on what you needed. A pretty special player.
Stefon Diggs can be celebrated in a similar manner. He started his career in the downfield-heavy cluster 2, moved on to the more versatile cluster 7 for most of his career in Minnesota, and became a “big slot” cluster 1 receiver in Buffalo. Really, he moved into cluster 1 due to his high short-field, sideline usage. For a lot of players, that would happen because there are a lot of low-difficulty targets in that area of the field. However, watch virtually any of his 2020 games, and that’s clearly not the case for Diggs. I feel confident saying he’s the league’s best separator, and his excellence in both that role and his much-dissimilar role in Minnesota speaks to his greatness. (To me, the debate is between him and Tyreek Hill for the league’s most valuable receiver in 2020-21.)
Now, to stop gushing, and quickly cover some other contemporary players you may be curious about. Odell Beckham has spent time in cluster 7, cluster 2, and cluster 1. Chris Godwin has spent essentially his entire career jumping between clusters 1 and 5. (He currently sits in cluster 1.) Corey Davis has done the same, mixing in some time spent in cluster 4, though he’s currently slotted into cluster 5. JuJu Smith-Schuster can’t decide whether he wants to be in cluster 1 or 6.
However, this isn’t the only collection of players to move around between clusters later on in their careers — these were simply the players whose games or lists of aptitudes never fit neatly into a single category. The next set of players tirelessly changed and refined their craft as their careers progressed.
That word I used earlier about Moss, Royal, and Tate — reinvention — is a great branching topic from this analysis. What players changed their games as they aged? Was it them adapting to their changing skills, a transforming league, or simply a new situation?
Let’s start with this century’s classic example of reinvention at the receiver position. One hundred years ago, Larry Fitzgerald entered the NFL as a big, explosive, 4.48 second 40-yard dash runner. He could body defensive backs as a Pitt Panther, or simply run past them. One of college football’s greatest receivers became one of the NFL’s best as well. With 1400 targets to his name, Fitzgerald profiled as a sideline-heavy jump-ball dominator for most of his career to that point. All of this information, save for his success, came as a shock to me.
That is because in the second half of his career, he transitioned his game unbelievably well to account for his declining speed, his lost size advantage (due to incoming players getting bigger and bigger), and the spread revolution in NFL offenses. Around target 1450, he moved over to the big slots of cluster 1, the role which we know him for much better today. (The actual transition occurred much earlier in his career, as it had to more than offset his early-career usage.) Equally impressive to the natural talent of someone like Randy Moss is Fitzgerald’s mix of physical gifts, exceptional technique, tireless work ethic, and pure intelligence. (Moss had each of those things as well, they just weren’t quite as necessary to his form of long-term success.) And, just as we may never see someone so physically ahead of their time like Moss again, we may never see such a perfect blend of athleticism, skill, and workmanship as Fitzgerald’s.
Brandon Marshall made a nearly-opposite transition to Fitzgerald’s. In his early Broncos years, Marshall ate up defenses with a relatively short receiving game, putting him into cluster 1. After moving on to Miami and later re-uniting with Jay Cutler in Chicago, Marshall became much more of the downfield-focused target we (or I, at least) remember him as today. At 6’5″ and with his easily-displayed skills at the catch point, his early-career usage was honestly a bit surprising.
Fellow Bronco draftee Demaryius Thomas saw his usage change over time as well, though for more disappointing reasons. In his early years receiving passes from Kyle Orton, Tim Tebow, and prime Peyton Manning, Thomas utilized his Julio Jones-like blend of size and speed to catch passes all over the field, slotting neatly into cluster 4. (I’ll keep myself from talking here about how Thomas is one of the last decade or two’s most underrated receivers.) However, time passed by, and Manning’s arm lost some of its pop. That decline, followed by years playing with the Broncos’ revolving door of post-Manning quarterbacks, forced him into a shorter role in an effort to ensure easy completions for the team’s lesser quarterbacking talent. In turn, Thomas profiled much more as a cluster 1 big slot type.
But most notable of all was Wes Welker’s unforeseen turn from the small slots of cluster 3 to the screen-heavy small slots of cluster 6 upon entering New England. Ok, actually, I’m being told no one cares.
The primary purpose of this analysis was to figure out when receivers’ cluster assignments start to stabilize, and thus when those assignments become trustworthy enough to keep in studies that utilize those clusterings. For now, my operating rule will be to use 500-plus-target receivers when I need a high level of confidence in these players’ groupings, 225-plus-target receivers when I want clusters to be reasonably accurate but also want a decent sample size, and receivers with between 100 and 225 targets primarily to look into what we can expect from those same receivers in coming years.
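Written as a filter, that operating rule might look like the sketch below. The tier names are hypothetical labels chosen for illustration. The boundary handling (225 counted in the middle tier) and the exclusion of receivers under 100 targets are assumptions, since the rule above doesn’t spell them out.

```python
def confidence_tier(career_targets):
    """Map a career target count to a (hypothetical) usage tier."""
    if career_targets >= 500:
        return "high_confidence"      # cluster trusted for strict analyses
    if career_targets >= 225:
        return "reasonable"           # decent accuracy, larger sample
    if career_targets >= 100:
        return "projection_only"      # mainly for looking ahead
    return "excluded"                 # assumption: too few targets to use

# Example with made-up target counts.
sample = {"Veteran": 1400, "Fourth-year starter": 230, "Rookie": 125, "Backup": 40}
tiers = {name: confidence_tier(n) for name, n in sample.items()}
print(tiers)
```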
Also, there were a lot of interesting nuggets to dig into that stemmed from the original objective. In particular, I think this exercise was significant in helping flesh out an under-discussed topic of receivers’ greatness — how are or were they great? Where did they win, and where did teams ask them to do so? Were they simply faster than everyone (Tyreek Hill)? A generational combination of size, athleticism, and technique who was so far ahead of their time that they could do whatever they wanted (Randy Moss)? Someone so good at the catch point that it didn’t matter that he actually wasn’t that big in the first place (DeAndre Hopkins)?
I set out clustering these players originally with the explicit intent not to use player skills, or proxies of those skills, as criteria for clustering. For that reason, it’s really interesting to see how some players still managed to stand out and show how special they are.