DataStadium
- class Event_data(data_provider='datastadium', event_path=play_csv_path, home_tracking_path=home_tracking_csv_path, away_tracking_path=away_tracking_csv_path).load_data() -> pd.DataFrame
Load and process DataStadium event and tracking data from CSV files and convert them into a single DataFrame.
- Parameters:
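event_path (str) – Path to the play CSV file containing event data. For multiple matches, the path to a directory whose subfolders hold the play and tracking CSV files.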
home_tracking_path (str) – Path to the home_tracking CSV file containing home team tracking data.
away_tracking_path (str) – Path to the away_tracking CSV file containing away team tracking data.
- Returns:
DataFrame containing merged and processed event and tracking data.
- Return type:
pd.DataFrame
Note
The DataStadium tracking data requires preprocessing, which can be handled with the Tracking_data class. Ensure this preprocessing step is completed before using this function.
For more details, refer to the Tracking_data class documentation (process_datadium_tracking_data).
Example for a single match
    import pandas as pd
    from preprocessing import Event_data

    event_path = 'path/to/play.csv'
    home_tracking_path = 'path/to/home_tracking.csv'
    away_tracking_path = 'path/to/away_tracking.csv'

    # data_provider is passed as in the class signature above
    stadium_df = Event_data(
        data_provider='datastadium',
        event_path=event_path,
        home_tracking_path=home_tracking_path,
        away_tracking_path=away_tracking_path
    ).load_data()

    print(stadium_df.head())
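Example for multiple matches
    import pandas as pd
    from preprocessing import Event_data

    # The directory contains one folder per match; each folder holds
    # the play.csv and tracking.csv files for that match.
    data_dir = 'path/to/data/dir'

    # data_provider is passed as in the class signature above
    stadium_df = Event_data(
        data_provider='datastadium',
        event_path=data_dir
    ).load_data()

    print(stadium_df.head())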
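Before pointing event_path at a directory, it can help to check which subfolders actually contain the expected files. The following is a minimal sketch, not part of the library: list_match_folders is a hypothetical helper, and the required file name play.csv is an assumption taken from the comment in the example above.

```python
from pathlib import Path


def list_match_folders(data_dir, required=("play.csv",)):
    """Return subfolders of data_dir that contain every file in `required`.

    Hypothetical helper for illustration; the file names are assumptions.
    """
    root = Path(data_dir)
    if not root.is_dir():
        # Nothing to list if the directory does not exist.
        return []
    return sorted(
        p for p in root.iterdir()
        if p.is_dir() and all((p / name).is_file() for name in required)
    )


# Print each match folder that looks complete.
for folder in list_match_folders("path/to/data/dir"):
    print(folder)
```

Folders that are missing a required file are skipped silently here; whether the loader itself skips or raises on such folders is not stated in this documentation.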
Details
This function performs the following steps:
Loads the event data from the CSV file specified by event_path.
Loads the home and away team tracking data from the CSV files specified by home_tracking_path and away_tracking_path.
Filters and preprocesses the event data by retaining required columns and sorting by absolute time.
Creates a new column event_type_2 based on event flags.
Renames columns to standardized names and reorders them.
Converts event types to English using predefined dictionaries.
Calculates the period, minute, and second for each event based on absolute time.
Appends tracking data to the event data by aligning timestamps.
Creates and returns a final DataFrame with event and tracking data.
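Two of the steps above, deriving minute and second from absolute time and appending tracking data by aligning timestamps, can be sketched with plain pandas. This is an illustrative approximation under stated assumptions, not the library's implementation: the helper names are hypothetical, absolute_time is assumed to be in seconds in both tables, alignment is assumed to pick the nearest tracking row, and period detection is omitted because its rule is not described here.

```python
import pandas as pd


def add_minute_second(events, time_col="absolute_time"):
    """Split a time in seconds into whole minutes and remaining seconds."""
    out = events.copy()
    out["Minute"] = (out[time_col] // 60).astype(int)
    out["Second"] = out[time_col] % 60
    return out


def attach_nearest_tracking(events, tracking, time_col="absolute_time"):
    """Attach to each event the tracking row with the closest timestamp."""
    # merge_asof requires both frames to be sorted by the key column.
    events = events.sort_values(time_col)
    tracking = tracking.sort_values(time_col)
    return pd.merge_asof(events, tracking, on=time_col, direction="nearest")


# Toy data; the event types and coordinates are made up for illustration.
events = pd.DataFrame(
    {"absolute_time": [0.4, 61.2], "event_type": ["pass", "shot"]}
)
tracking = pd.DataFrame(
    {"absolute_time": [0.0, 1.0, 61.0, 62.0], "Home_1_x": [1.0, 2.0, 3.0, 4.0]}
)

merged = attach_nearest_tracking(add_minute_second(events), tracking)
print(merged)
```

A nearest-timestamp join keeps exactly one tracking row per event, which matches the one-row-per-event shape of the returned DataFrame.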
The returned DataFrame includes the following columns:
match_id: Match identifier
Period: Period of the match (e.g., first half, second half)
Minute: Minute of the event
Second: Second of the event
frame: Frame number of the event
absolute_time: Absolute time in seconds
team: Team associated with the event
home: Home or away team indicator
player: Player ID associated with the event
event_type: Type of the event
event_type_2: Subtype of the event
success: Success flag of the event
start_x: Starting x-coordinate of the event
start_y: Starting y-coordinate of the event
dist: Distance associated with the event
opp_field: Indicator for opponent’s field
point_diff: Difference in points
self_score: Self team score
opp_score: Opponent’s score
angle2goal: Angle to goal
dist2goal: Distance to goal
The DataFrame also includes tracking data columns for both home and away teams, such as Home_1_x, Home_1_y, Away_1_x, and Away_1_y.
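Because the tracking columns follow a regular naming scheme, they can be selected by pattern rather than listed by hand. The sketch below assumes the pattern Home_<n>_x / Away_<n>_y inferred from the column names above; tracking_columns is a hypothetical helper, and the toy DataFrame stands in for the output of load_data().

```python
import re

import pandas as pd

# Assumed naming scheme: <Home|Away>_<player index>_<x|y>
TRACKING_COL = re.compile(r"^(Home|Away)_(\d+)_(x|y)$")


def tracking_columns(df, side=None):
    """Return tracking column names, optionally restricted to 'Home' or 'Away'."""
    cols = []
    for col in df.columns:
        match = TRACKING_COL.match(col)
        if match and (side is None or match.group(1) == side):
            cols.append(col)
    return cols


# Toy frame with made-up values, standing in for the returned DataFrame.
demo_df = pd.DataFrame(
    {
        "match_id": [1],
        "Home_1_x": [10.0],
        "Home_1_y": [5.0],
        "Away_1_x": [-10.0],
        "Away_1_y": [-5.0],
    }
)

home_cols = tracking_columns(demo_df, side="Home")
print(demo_df[home_cols])
```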