GIS Meets March Madness: Using Spatial Data to Analyze Basketball Team and Player Performance

March 16, 2012 - By Eric Gakstatter

Even if you aren’t a basketball fan, you’ve likely heard the term “March Madness” over the years. It refers to a time when the best U.S. college basketball teams compete for the championship title. Demonstrating the diversity of GIS, a Harvard University professor has introduced an interesting method of analyzing basketball team and player performance using GIS spatial analysis techniques.

At the 2012 MIT (Massachusetts Institute of Technology) Sloan Sports Analytics Conference (March 2-3, 2012), Harvard professor Dr. Kirk Goldsberry presented Court Vision, “a new ensemble of analytical techniques designed to quantify, visualize, and communicate spatial aspects of NBA performance with unprecedented precision and clarity.”

Dr. Goldsberry argues that conventional performance metrics, such as shooting percentage, ignore spatial information. This is odd, Dr. Goldsberry explains, because basketball is a spatial sport. For example, the NBA players with the top shooting percentages are all forwards or centers, who typically shoot from shorter distances than players in the guard position. Without an analysis of spatial shooting tendencies, key scoring phenomena remain misunderstood, and coaches and players miss an opportunity to accurately analyze and refine their strategies.

Who’s the Best NBA Shooter?

“Data: Using game data sets for every NBA game played between 2006 and 2011, we compiled a spatial field goal database that included Cartesian coordinates (x,y) for every field goal attempted in this 5-year period. This data set includes player name, shot location, and shot outcome for over 700,000 field goal attempts. We mapped the shot data atop a base map of a NBA basketball court (Figure 1). Although a regulation NBA court is 4,700 ft² (50 ft x 94 ft), almost all (>98%) field goal attempts occur within a 1,284 ft² area in between the baseline and a relatively thin buffer around the 3-point arc; we call this area the “scoring area.” We divided the scoring area into a grid consisting of 1,284 unique “shooting cells,” each 1 ft² (Figure 1). To quantify shooting range, we applied spatial analyses to evaluate shooting performance across the grid and within each shooting cell.”
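The gridding step described above can be sketched in a few lines of Python. This is only an illustration, not Dr. Goldsberry's implementation: it assumes shot coordinates are in feet with the origin at one corner of the court, the function name `shot_to_cell` and the sample shots are made up, and it bins shots over the full 50 ft x 94 ft court rather than reproducing the irregular outline of the 1,284-cell scoring area.

```python
import math
from collections import Counter

COURT_WIDTH_FT = 50   # baseline length, from the quoted court dimensions
COURT_LENGTH_FT = 94  # sideline length, from the quoted court dimensions

def shot_to_cell(x_ft, y_ft):
    """Return the (column, row) index of the 1 ft x 1 ft cell containing a shot.

    Assumes x runs 0-50 ft along the baseline and y runs 0-94 ft along the
    sideline; the article does not state the data set's actual convention.
    """
    if not (0 <= x_ft <= COURT_WIDTH_FT and 0 <= y_ft <= COURT_LENGTH_FT):
        raise ValueError("shot lies outside the court")
    # Clamp so a shot exactly on the far boundary falls in the last cell.
    col = min(math.floor(x_ft), COURT_WIDTH_FT - 1)
    row = min(math.floor(y_ft), COURT_LENGTH_FT - 1)
    return col, row

# Hypothetical shot locations (x, y) in feet.
shots = [(25.3, 4.9), (25.8, 4.1), (3.2, 1.0)]

# Number of attempts recorded in each shooting cell.
attempts_per_cell = Counter(shot_to_cell(x, y) for x, y in shots)
```

Once every attempt carries a cell index, per-cell summaries such as attempt counts become simple group-by operations.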

NBA field goal attempts 2006-2011 (Source: Dr. Kirk Goldsberry).

NBA field goal attempts (Source: Dr. Kirk Goldsberry)

“Our composite shot maps from 2006-2011 NBA game data. The first map summarizes the density of all field goal attempts during the study period. The second map reveals league-wide tendencies in both shot attempts and points per attempt. Larger squares indicate areas where many field goals were attempted; smaller squares indicate fewer attempts. The color of the squares is determined by a spectral color scheme and indicates the average points per attempt for each location. Orange areas indicate areas where more points result from an average attempt, and blue areas indicate fewer points per attempt.”

“We derived metrics that described spatial aspects of shooting performance throughout the scoring area. The most basic metric is called “Spread,” which is simply a count of the unique shooting cells in which a player has attempted at least one field goal. The raw result is a number between 0 and 1,284 and summarizes the spatial diversity of a player’s shooting attempts. By dividing this count by 1,284 and multiplying by 100, we generated Spread%, which indicates the percentage of the scoring area in which a player has attempted at least one field goal.”
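As a rough sketch of the Spread and Spread% definitions quoted above: the total of 1,284 cells comes from the paper, while the function names and the sample cell indices are hypothetical and chosen only for illustration.

```python
SCORING_AREA_CELLS = 1284  # number of 1 ft^2 shooting cells, per the paper

def spread(attempt_cells):
    """Count the unique shooting cells containing at least one attempt."""
    return len(set(attempt_cells))

def spread_pct(spread_count, total_cells=SCORING_AREA_CELLS):
    """Express a Spread count as a percentage of the scoring area."""
    return 100.0 * spread_count / total_cells

# Hypothetical player: four attempts, two of them from the same cell.
attempt_cells = [(25, 4), (25, 4), (3, 1), (40, 20)]

player_spread = spread(attempt_cells)          # unique cells with an attempt
player_spread_pct = spread_pct(player_spread)  # share of the 1,284 cells
```

Because Spread counts cells rather than shots, repeated attempts from the same spot do not raise it; only shooting from new locations does.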

“Spread describes the overall size of a player’s shooting territory. League leaders in FG% generally have a small Spread value since they tend to only shoot near the basket. For example, since centers generally thrive in limited areas near the hoop they tend to have lower Spread values than shooting guards. Kobe Bryant has the highest spread value in the NBA (table 1); Bryant’s value of 1,071 indicates he has attempted field goals in 1,071 of the 1,284 shooting cells or 83.4% of the scoring area. In contrast, Dwight Howard has attempted field goals in only 23.8% of the shooting cells. Although Spread% favors players who simply shoot frequently, it also reveals that some players like Dwight Howard who do shoot a lot, only do so in limited court spaces. For example, Al Jefferson attempted 400 more field goals than Ray Allen during the study period, yet his Spread value is only 595 (46.3%), while Ray Allen’s is 952 (74.1%). Visual depictions of the spread variable expose the stark differences in individual players’ spatial shooting behaviors. Via the graduated symbol cartographic technique, figure 2 reveals the spatial structure of Al Jefferson and Ray Allen’s field goal attempts during the study period. Jefferson is highly active in the central areas near the basket, and clearly favors posting up defenders on the right side of the court. Meanwhile, Ray Allen is highly active behind the 3-point arc; he attempts many 3-point field goals, but is relatively inactive from mid-range areas.”

Spread variable for Al Jefferson (Source: Dr. Kirk Goldsberry)

Spread variable for Ray Allen (Source: Dr. Kirk Goldsberry)

“These Spread visualizations reveal a player’s basic shooting tendencies, but tell us nothing about potency. Shooting skill requires more than just attempts; the best shooters in the league are able to make baskets at effective rates from many court locations. To describe the spatial potency of players we created a metric called “Range,” which is a count of the number of unique shooting cells in which a player averages at least 1 point per attempt (PPA). PPA varies considerably around the court. As anyone who has ever shot a basketball knows, the probability of a shot attempt resulting in a made basket is spatially dependent; some shots are easier than others, and some players are unable to shoot effectively from most court locations. Range accounts for spatial influences on shooting effectiveness. It is essentially a count of the number of shooting cells in which a player averages more than 1 PPA; we chose PPA over FG% because it inherently accounts for the differences between 2-point and 3-point field goal attempts.”

“By dividing this count by 1,284 and multiplying by 100, we generated Range%, which indicates the percentage of the scoring area in which a player averages more than 1 PPA. Steve Nash is ranked first. He has a Range value of 406, indicating that he averages over 1 PPA from 406 unique shooting cells, or 31.6% of the scoring area. Ray Allen was ranked second (30.1%), Kobe Bryant (29.8%) was third, and Dirk Nowitzki (29.0%) was fourth (table 2). Figure 3 visualizes the shooting range of these four players.”
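The Range and Range% calculations can be sketched the same way. The quoted text says both “at least 1” and “more than 1” PPA; the sketch below assumes the strict reading (more than 1) and exposes the threshold as a parameter. The function names and the sample shots are hypothetical, not taken from the paper.

```python
from collections import defaultdict

SCORING_AREA_CELLS = 1284  # number of 1 ft^2 shooting cells, per the paper

def points_per_attempt(shots):
    """Average points per attempt (PPA) for each shooting cell.

    `shots` is an iterable of (cell, points) pairs, where points is 0 for a
    miss and 2 or 3 for a made field goal.
    """
    totals = defaultdict(lambda: [0, 0])  # cell -> [points, attempts]
    for cell, points in shots:
        totals[cell][0] += points
        totals[cell][1] += 1
    return {cell: pts / n for cell, (pts, n) in totals.items()}

def shooting_range(shots, threshold=1.0):
    """Count cells whose average PPA exceeds the threshold (assumed strict)."""
    return sum(1 for ppa in points_per_attempt(shots).values() if ppa > threshold)

def range_pct(range_count, total_cells=SCORING_AREA_CELLS):
    """Express a Range count as a percentage of the scoring area."""
    return 100.0 * range_count / total_cells

# Hypothetical shots for one player.
shots = [
    ((25, 4), 2), ((25, 4), 0), ((25, 4), 2),  # 4 points / 3 attempts
    ((3, 1), 3), ((3, 1), 0), ((3, 1), 0),     # 3 points / 3 attempts = 1.0
    ((40, 20), 0),                             # 0 points / 1 attempt
]

player_range = shooting_range(shots)  # only the first cell exceeds 1 PPA
```

The corner cell in this example shows why PPA is used instead of FG%: one made 3-pointer in three attempts is a 33% shooting percentage but still yields exactly 1 point per attempt.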

“Steve Nash has the highest Range% in our case study, but does this mean he is the best shooter in the NBA? That obviously remains debatable; however it is certain that over the last few NBA seasons, Nash and Ray Allen are the most effective shooters from the most diverse court locations. The average shooter in the NBA has a Range% of 18.5, meaning they score efficiently from 18.5% of the scoring area. Nash and Allen are the only two players in the league whose Range% values exceed 30%; only a handful of players in the league average more than 1 PPA from at least 25% of the scoring zone (table 2), and unsurprisingly, despite being among the leaders in FG%, Dwight Howard (Range% = 6.5) and Nene Hilario (Range% = 3.7) are not on that list. Whether the Range% metric is the best way of quantifying shooting range or not, it seems to capture pure shooting ability better than FG% or eFG%.”

The following images depict the shooting ranges of Steve Nash, Ray Allen, Dirk Nowitzki, and Kobe Bryant. According to Dr. Goldsberry, “these four players had the highest range values, but these graphics reveal that they achieve them in much different ways. For example, when compared to the three others, Dirk Nowitzki shoots relatively few 3-point shots and performs much better in the mid-range areas on the left side of the court, while Ray Allen excels in the corners of the court where Steve Nash rarely shoots.”

Steve Nash shooting range (Source: Dr. Kirk Goldsberry)

Ray Allen shooting range (Source: Dr. Kirk Goldsberry)

Dirk Nowitzki shooting range (Source: Dr. Kirk Goldsberry)

Kobe Bryant shooting range (Source: Dr. Kirk Goldsberry)

To view Dr. Goldsberry’s complete paper, click here.

Thanks, and see you next week.

Follow me on Twitter at http://twitter.com/GPSGIS_Eric

About the Author: Eric Gakstatter

Eric Gakstatter has been involved in the GPS/GNSS industry for more than 20 years. For 10 years, he held several product management positions in the GPS/GNSS industry, managing the development of several medium- and high-precision GNSS products along with associated data-collection and post-processing software. Since 2000, he has been a power user of GPS/GNSS technology and has consulted with capital management companies; federal, state and local government agencies; and private companies on the application and/or development of GPS technology. Since 2006, he has been a contributor to GPS World magazine, serving as editor of the monthly Survey Scene newsletter until 2015, and as editor of the Geospatial Solutions monthly newsletter for GPS World's sister site Geospatial Solutions, which focuses on GIS and geospatial technologies.