Statsbomb
- class Event_data(data_provider='statsbomb', event_path=event_path, sb360_path=sb360_path, statsbomb_match_id=statsbomb_match_id, statsbomb_api_args=statsbomb_api_args, max_workers=max_workers).load_data()
Load StatsBomb event and 360 data from files or API. (currently the API freezeframe data is not implemented)
- Parameters:
event_path (str) – Path to the JSON file containing the event data.
sb360_path (str, optional) – Path to the JSON file containing the 360 data.
statsbomb_match_id (str, optional) – Match ID to fetch the event data from the StatsBomb API.
statsbomb_api_args (tuple) – Additional arguments to pass to the StatsBomb API.
max_workers (int, optional) – Maximum number of workers to use for parallel processing.
- Returns:
DataFrame containing the consolidated event and freeze frame data.
- Return type:
pd.DataFrame
Example usage for single match:
import pandas as pd from preprocessing import Event_data event_path = 'path/to/event.json' sb360_path = 'path/to/360.json' match_id = '12345' statsbomb_api_args=['creds = {"user": "input your Statsbomb api user name here", "passwd": "input your Statsbomb api password here"}'] # Load event data from file statsbomb_df=Event_data(data_provider='statsbomb',event_path=event_path).load_data() # Load event and 360 data from files statsbomb_df=Event_data(data_provider='statsbomb',event_path=event_path,sb360_path=sb360_path).load_data() # Load event data from API statsbomb_df=Event_data(data_provider='statsbomb',statsbomb_match_id=match_id,statsbomb_api_args=statsbomb_api_args).load_data() print(statsbomb_df.head())
Example usage for multiple matches:
import pandas as pd from preprocessing import Event_data event_folder = 'path/to/event/folder' sb360_folder = 'path/to/360/folder' match_id = ['12345','67890'] statsbomb_api_args=['creds = {"user": "input your Statsbomb api user name here", "passwd": "input your Statsbomb api password here"}'] # Load event data from file statsbomb_df=Event_data(data_provider='statsbomb',event_path=event_folder).load_data() # Load event and 360 data from files statsbomb_df=Event_data(data_provider='statsbomb',event_path=event_folder,sb360_path=sb360_folder).load_data() # Load event data from API statsbomb_df=Event_data(data_provider='statsbomb',statsbomb_match_id=,statsbomb_api_args=statsbomb_api_args).load_data() print(statsbomb_df.head())
Details:
This function loads StatsBomb event data and optionally 360 freeze frame data from specified file paths or API. The event_path is mandatory, while sb360_path and match_id are optional. The function processes these JSON files or API data and combines their data into a single pandas DataFrame through the following steps:
Load JSON files: Parse the event and 360 JSON files if provided.
Fetch data from API: If match_id is provided, fetch the event data from the StatsBomb API.
Convert 360 data into a DataFrame: - Extract freeze frame data for teammates and opponents. - Flatten and pad the list to ensure consistent structure.
Extract event details: - Iterate through the event data and extract details such as match ID, time, event type, team, player, start and end coordinates. - Match event IDs with 360 data if provided.
Create the resulting DataFrame: - Define columns for the DataFrame. - Convert the list of event details to a DataFrame and return it.
The returned DataFrame contains the following columns:
match_id: Identifier for the match.period: Period of the match in which the event occurred.time: Time of the event.minute: Minute of the event.second: Second of the event.event_type: Primary type of the event.event_type_2: Secondary type of the event, if available.team: Name of the team involved in the event.home_team: Indicator if the team is the home team.player: Name of the player involved in the event.start_x: X-coordinate of the event start position.start_y: Y-coordinate of the event start position.end_x: X-coordinate of the event end position.end_y: Y-coordinate of the event end position.
If 360 data is provided, the DataFrame also includes:
h1_teammate, h1_actor, h1_keeper, h1_x, h1_y, ..., h11_teammate, h11_actor, h11_keeper, h11_x, h11_y: Freeze frame data for each player on the home team.a1_teammate, a1_actor, a1_keeper, a1_x, a1_y, ..., a11_teammate, a11_actor, a11_keeper, a11_x, a11_y: Freeze frame data for each player on the away team.