How to Analyze Formula 1 Data with Python: A Practical Example

Mine Aktaş
2 min read · Aug 2, 2023

Formula 1, with its thrilling races and cutting-edge technology, captivates millions of fans worldwide. As a Ferrari fan, I have a special place in my heart for the Bahrain Grand Prix. In this article, we will learn how to analyze the results of the 2022 Bahrain Grand Prix using Python.

Step 1: Data Collection

First, we will use the Ergast API to access the results of the Bahrain Grand Prix. This API provides free and easy access to historical Formula 1 data.

Step 2: Setting Up the Python Environment

Assuming Python is installed, install the required libraries with pip (pip install requests pandas matplotlib), then import “requests” and “pandas.”

import requests
import pandas as pd

Step 3: Making API Calls

Let’s make a simple API call to retrieve the results of the 2022 Bahrain Grand Prix.

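# Round 1 of the 2022 season is the Bahrain Grand Prix
url = "https://ergast.com/api/f1/2022/1/results.json"
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop on an HTTP error instead of parsing an error page
data = response.json()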
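
The URL in this call encodes the season (2022) and the round number (1, which was Bahrain in 2022). As a minimal sketch, a small helper could build the same kind of URL for other races. The names `ERGAST_BASE_URL` and `build_results_url` are illustrative assumptions, not part of the Ergast API.

```python
# Hypothetical helper: the constant and function names are chosen for illustration.
ERGAST_BASE_URL = "https://ergast.com/api/f1"


def build_results_url(season: int, round_number: int) -> str:
    """Return the results URL for one race, in the same form as the URL used above."""
    return f"{ERGAST_BASE_URL}/{season}/{round_number}/results.json"


print(build_results_url(2022, 1))
```

Passing a different season or round number gives the URL for another race, which can be handed to requests.get in the same way.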

Step 4: Data Preprocessing

The response is in JSON format, with the per-driver results nested under the "MRData", "RaceTable", and "Races" keys. We need to convert these results into a pandas DataFrame for further processing and analysis.

race_results = data["MRData"]["RaceTable"]["Races"][0]["Results"]
df = pd.json_normalize(race_results)
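
To see what the flattening does, here is a small sketch that runs without a network call. The list `sample_results` is a hand-written, heavily trimmed stand-in for the "Results" list (the real response carries many more fields and all classified drivers), and the variable names are chosen for illustration. It shows how nested keys become dotted column names, and that values such as points arrive as strings and need converting before any arithmetic.

```python
import pandas as pd

# Hand-written stand-in for the "Results" list, trimmed to a few fields.
sample_results = [
    {"position": "1", "points": "26",
     "Driver": {"givenName": "Charles", "familyName": "Leclerc"},
     "Constructor": {"name": "Ferrari"}},
    {"position": "2", "points": "18",
     "Driver": {"givenName": "Carlos", "familyName": "Sainz"},
     "Constructor": {"name": "Ferrari"}},
    {"position": "3", "points": "15",
     "Driver": {"givenName": "Lewis", "familyName": "Hamilton"},
     "Constructor": {"name": "Mercedes"}},
]

# Nested dictionaries are flattened into columns such as "Driver.familyName".
sample_df = pd.json_normalize(sample_results)
print(sorted(sample_df.columns))

# The numeric fields are strings in the JSON; convert them before doing math.
sample_df["position"] = pd.to_numeric(sample_df["position"])
sample_df["points"] = pd.to_numeric(sample_df["points"])
print(sample_df.dtypes)
```

The same two pd.to_numeric conversions can be applied to the full DataFrame before sorting or plotting.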

Step 5: Data Analysis

Now that we have the data in a DataFrame, we can perform various analyses. For example, let’s find the top 5 finishers of the race.

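# Keep only the columns of interest. The API lists results in finishing order,
# so the first five rows are the top five finishers.
columns = ["position", "Driver.givenName", "Driver.familyName", "Constructor.name"]
top_five_results = df.loc[:, columns].head(5)
print(top_five_results)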
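
Other analyses follow the same pattern. As one hedged example, the sketch below totals points per constructor with groupby. The small DataFrame `rows` is a hand-written stand-in limited to the top four finishers, so the totals are illustrative rather than the full race result; the variable names are assumptions.

```python
import pandas as pd

# Hand-written stand-in for two columns of the flattened results (top four only).
rows = pd.DataFrame({
    "Constructor.name": ["Ferrari", "Ferrari", "Mercedes", "Mercedes"],
    "points": ["26", "18", "15", "12"],  # strings, as in the API response
})

# Convert to numbers first; summing strings would concatenate them.
rows["points"] = pd.to_numeric(rows["points"])

# Total points per constructor, highest first.
constructor_points = (
    rows.groupby("Constructor.name")["points"]
    .sum()
    .sort_values(ascending=False)
)
print(constructor_points)
```

Running the same groupby on the full DataFrame would include every team that took part in the race.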

Step 6: Visualization

A chart makes the results easier to read at a glance. Let’s create a bar chart of the points earned by each driver in the race.

import matplotlib.pyplot as plt

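driver_points = df.loc[:, ["Driver.givenName", "Driver.familyName", "points"]].copy()
# The API returns points as strings; convert them so sorting and bar heights are numeric
driver_points["points"] = pd.to_numeric(driver_points["points"])
driver_points = driver_points.sort_values(by="points", ascending=False)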

plt.bar(driver_points["Driver.givenName"] + " " + driver_points["Driver.familyName"], driver_points["points"])
plt.xlabel("Driver")
plt.ylabel("Points")
plt.title("Points Earned by Drivers in the 2022 Bahrain Grand Prix")
plt.xticks(rotation=45, ha="right")
plt.tight_layout()
plt.show()

Output of the Code

[Figure: bar chart of the points earned by each driver in the 2022 Bahrain Grand Prix]

Conclusion

In this article, we retrieved the 2022 Bahrain Grand Prix results from the Ergast API, loaded them into a pandas DataFrame, listed the top five finishers, and charted the points each driver earned. These basic analysis and visualization steps are a good starting point for exploring Formula 1 data in more depth. So, continue exploring the exciting world of Formula 1 using data, and enjoy the thrill of this captivating sport!
