DataFactory
- class Event_data(data_provider='datafactory', event_path=datafactory_path).load_data() → pd.DataFrame
Load the datafactory event data from a JSON file and convert it into a DataFrame.
- Parameters:
data_provider (str) – Name of the data provider; 'datafactory' selects this loader.
event_path (str) – Path to the JSON file containing event data.
- Returns:
DataFrame containing event data with additional columns for analysis.
- Return type:
pd.DataFrame
Example
import pandas as pd
from preprocessing import Event_data

datafactory_path = 'path/to/event_data.json'
datafactory_df = Event_data(data_provider='datafactory', event_path=datafactory_path).load_data()
print(datafactory_df.head())
Details
The load_data() method performs the following steps:
Loads the event data from a JSON file specified by event_path.
Extracts match ID and team names.
Iterates over different types of events and individual events within each type.
Extracts various event details such as period, minute, second, event type, team, player, and coordinates.
Converts extracted data into a list of events.
Converts the event list into a DataFrame.
Adds a seconds column to the DataFrame by converting minutes and seconds to total seconds.
Reorders the columns and sorts the DataFrame by the seconds column.
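The steps above can be sketched roughly as follows. This is an illustration only, not the library's implementation: the JSON layout (the keys match, incidences, subtype, start, end), the home/away team encoding, and the helper name sketch_load_events are all assumptions made for the example, and the real DataFactory payload may be organised differently.

```python
import json

import pandas as pd

# Hypothetical, simplified payload: the real DataFactory JSON layout may differ.
RAW_JSON = """
{
  "match": {"matchId": 1001, "homeTeamName": "Home FC", "awayTeamName": "Away FC"},
  "incidences": {
    "passes": {
      "p1": {"period": 1, "minute": 0, "second": 12, "subtype": "short",
             "team": "home", "player": 7,
             "start": {"x": 0.50, "y": 0.50}, "end": {"x": 0.70, "y": 0.40}},
      "p2": {"period": 1, "minute": 10, "second": 5, "subtype": "long",
             "team": "away", "player": 4,
             "start": {"x": 0.20, "y": 0.60}, "end": {"x": 0.80, "y": 0.30}}
    },
    "shots": {
      "s1": {"period": 1, "minute": 3, "second": 30, "subtype": "on_target",
             "team": "home", "player": 9,
             "start": {"x": 0.90, "y": 0.50, "z": 0.0}}
    }
  }
}
"""

# Column order follows the listing in this section.
COLUMNS = [
    "match_id", "period", "minute", "second", "seconds",
    "event_type", "event_type_2", "team", "player",
    "start_x", "start_y", "start_z", "end_x", "end_y", "end_z",
]


def sketch_load_events(raw):
    """Hypothetical helper mirroring the documented steps on the assumed layout."""
    # Extract the match ID and team names.
    match_id = raw["match"]["matchId"]
    team_names = {
        "home": raw["match"]["homeTeamName"],
        "away": raw["match"]["awayTeamName"],
    }

    # Iterate over event types, then over the individual events of each type.
    rows = []
    for event_type, events in raw["incidences"].items():
        for event in events.values():
            start = event.get("start", {})
            end = event.get("end", {})  # not every event has an end point
            rows.append({
                "match_id": match_id,
                "period": event.get("period"),
                "minute": event.get("minute"),
                "second": event.get("second"),
                "event_type": event_type,
                "event_type_2": event.get("subtype"),
                "team": team_names.get(event.get("team")),
                "player": event.get("player"),
                "start_x": start.get("x"),
                "start_y": start.get("y"),
                "start_z": start.get("z"),
                "end_x": end.get("x"),
                "end_y": end.get("y"),
                "end_z": end.get("z"),
            })

    # Build the DataFrame, add total seconds, reorder columns, sort by time.
    df = pd.DataFrame(rows)
    df["seconds"] = df["minute"] * 60 + df["second"]
    return df[COLUMNS].sort_values("seconds").reset_index(drop=True)


events_df = sketch_load_events(json.loads(RAW_JSON))
print(events_df[["seconds", "event_type", "team", "player"]])
```

Missing coordinates are read with dict.get, so events without an end point or a z value end up with empty cells rather than raising an error.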
The returned DataFrame includes the following columns:
match_id: Match identifier
period: Period of the match (e.g., first half, second half)
minute: Minute of the event
second: Second of the event
seconds: Total seconds calculated from minute and second
event_type: Type of the event
event_type_2: Subtype of the event
team: Team associated with the event
player: Player ID associated with the event
start_x: Starting x-coordinate of the event
start_y: Starting y-coordinate of the event
start_z: Starting z-coordinate of the event (if available)
end_x: Ending x-coordinate of the event (if available)
end_y: Ending y-coordinate of the event (if available)
end_z: Ending z-coordinate of the event (if available)
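A quick sanity check on a loaded DataFrame might look like the sketch below. The helper names missing_columns and seconds_consistent are hypothetical and not part of the package, and the sample row is made up; the column names and the minute-and-second rule for seconds are taken from the description above.

```python
import pandas as pd

# Column names as documented for the returned DataFrame.
EXPECTED_COLUMNS = [
    "match_id", "period", "minute", "second", "seconds",
    "event_type", "event_type_2", "team", "player",
    "start_x", "start_y", "start_z", "end_x", "end_y", "end_z",
]


def missing_columns(df):
    """Return the documented columns that are absent from df (hypothetical helper)."""
    return [col for col in EXPECTED_COLUMNS if col not in df.columns]


def seconds_consistent(df):
    """Check that seconds equals minute * 60 + second on every row."""
    return bool((df["seconds"] == df["minute"] * 60 + df["second"]).all())


# Made-up single-event frame; keys left out of the dict become empty columns.
sample = pd.DataFrame(
    [{"match_id": 1001, "period": 1, "minute": 3, "second": 30, "seconds": 210,
      "event_type": "shot", "team": "Home FC", "player": 9,
      "start_x": 0.9, "start_y": 0.5}],
    columns=EXPECTED_COLUMNS,
)

print(missing_columns(sample))
print(seconds_consistent(sample))
```

The same two checks can be run on the output of load_data() before any further analysis.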