Merrimack Student Learns Data Science Through Major League Baseball Statistics
A Merrimack junior has taken her love of baseball and applied it to her studies in data science and analytics at the College.
As a student in associate professor Fotios Kokkotos’ Foundations of Data Science class, Hannah Sullivan ’21 analyzed more than 2 million data points gathered by a private company over three seasons of Major League Baseball. Her analysis focused on pitching statistics.
Sullivan posed several questions in her study and found that pitchers throw a four-seam fastball 34% of the time when there are runners on first and third base. She discovered that the least common pitches include the screwball and the eephus pitch, and that 60.5% of first pitches to batters are strikes. The fastball, change-up, sinker and slider all slow down when the spin rate is higher, she said.
Sullivan submitted a detailed 20-page report of her findings to Kokkotos as part of the required final project for his class. She enjoyed the hands-on learning experience and working with Kokkotos.
“You can tell that professor Kokkotos is so passionate about what he does,” she said. “He wants you to learn and succeed, which helps a lot, especially when the work is difficult.”
Sullivan also shared her report with Nick Barese, head coach of Merrimack’s baseball team.
“The level of detail and depth she went into was fascinating,” Barese said.
Analyzing pitching was a natural fit for Sullivan, a huge baseball fan who has charted pitches for Merrimack’s baseball team since her freshman year.
“That’s the reason I love baseball so much,” Sullivan said. “You might as well understand the sport if you’re watching six games a week — and it’s a lot of fun.”
Sullivan, a math major with minors in data science and sport management, took advantage of Merrimack’s increased focus on data science and analytics. Merrimack added a new undergraduate data science major last fall in response to the growing demand for data analysts.
Data science is an interdisciplinary field that has become an important part of modern business, said professor Michael Bradley. Structured and unstructured information that wasn’t available for analysis 20 years ago can now offer insights into nearly all aspects of life and business, including the business of sports.
“We now have massive amounts of data that is collected every time you click online or use your cellphone,” Bradley said. “That data may be valuable for people selling things, advising on health, education or investments. So the question is how to transform the raw data into meaningful information that can improve your life or business in some way.”