Football Analytics - Association Rule Mining (ARM)

Overview

Association Rule Mining (ARM) is a machine learning technique used to uncover relationships between items in large datasets. It is widely applied in market basket analysis, recommendation systems, and other data-driven decision-making processes.

A simple view of clustering and ARM.

Key measures used in ARM include:

  • Support: Measures how frequently an item or itemset appears in the dataset.
  • Confidence: Indicates the likelihood that an item Y appears in a transaction given that item X is already present.
  • Lift: Evaluates how much more likely Y is to appear with X compared to random chance.
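    • Lift > 1: X and Y occur together more often than expected if they were independent (positive association).
    • Lift = 1: X and Y occur together exactly as often as expected under independence (no association).
    • Lift < 1: X and Y occur together less often than expected under independence (negative association).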
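The three measures can be computed directly from their definitions. The following is a minimal sketch on three toy grocery transactions; the helper names `support`, `confidence`, and `lift` are illustrative and not taken from the analysis code.

```python
# Toy transactions, for illustration only.
transactions = [
    {"Milk", "Bread", "Butter"},
    {"Bread", "Butter"},
    {"Milk", "Cheese", "Butter"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(x, y, transactions):
    """Estimate of P(Y | X): support(X and Y) divided by support(X)."""
    return support(set(x) | set(y), transactions) / support(x, transactions)

def lift(x, y, transactions):
    """Confidence of X -> Y divided by the baseline support of Y."""
    return confidence(x, y, transactions) / support(y, transactions)

x, y = {"Milk"}, {"Bread"}
print(support(x | y, transactions))     # Milk and Bread together: 1 of 3 transactions
print(confidence(x, y, transactions))   # 1 of the 2 Milk transactions also has Bread
print(lift(x, y, transactions))         # below 1: Bread is less common alongside Milk
```

In this toy data the lift of {Milk} → {Bread} is about 0.75, so the two items appear together less often than independence would predict.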

What Are Association Rules?

Association rules take the form of X → Y, meaning that if X occurs in a transaction, Y is likely to occur as well. Example:

  • {Milk, Bread} → {Butter}
    • If a customer buys milk and bread, they are likely to buy butter as well.

Apriori Algorithm

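The Apriori algorithm is a popular method for ARM that works as follows:

  1. Identify frequent itemsets that meet the minimum support threshold, starting with single items and growing candidates one item at a time.
  2. Prune any candidate that contains an infrequent subset, since it cannot be frequent itself. This step is what keeps the search efficient.
  3. Use the frequent itemsets to generate association rules, keeping only those that satisfy the confidence and lift thresholds.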
Apriori Algorithm
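The steps above can be sketched in plain Python. This is a simplified illustration, not the implementation used in this analysis; the function names `apriori_frequent_itemsets` and `generate_rules` and the toy transactions are assumptions made for the example.

```python
from itertools import combinations

def apriori_frequent_itemsets(transactions, min_support):
    """Return {frozenset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def supp(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]): supp(frozenset([i])) for i in items}
    current = {s: v for s, v in current.items() if v >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join: combine frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c: supp(c) for c in candidates}
        current = {c: v for c, v in current.items() if v >= min_support}
        frequent.update(current)
        k += 1
    return frequent

def generate_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence, lift) tuples."""
    rules = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for ante in map(frozenset, combinations(itemset, r)):
                cons = itemset - ante
                conf = s / frequent[ante]
                if conf >= min_confidence:
                    rules.append((ante, cons, s, conf, conf / frequent[cons]))
    return rules

# Toy data, for illustration only.
transactions = [
    {"Milk", "Bread", "Butter"},
    {"Bread", "Butter"},
    {"Milk", "Cheese", "Butter"},
]
frequent = apriori_frequent_itemsets(transactions, min_support=0.6)
rules = generate_rules(frequent, min_confidence=0.9)
for ante, cons, s, conf, lift in rules:
    print(set(ante), "->", set(cons), round(s, 3), round(conf, 3), round(lift, 3))
```

On this toy data the two surviving rules both point to Butter with confidence 1.0 but lift 1.0, because Butter is in every transaction. High confidence alone does not imply an informative rule.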


Data Preparation

Association Rule Mining requires unlabeled transaction data, meaning datasets where each row represents a collection of items purchased together or characteristics occurring together. Unlike supervised learning, ARM does not rely on predefined labels.

Sample Data Structure:

Example of transaction data format:

| Transaction ID | Item 1 | Item 2 | Item 3 |
|---|---|---|---|
| 1 | Milk | Bread | Butter |
| 2 | Bread | Butter | |
| 3 | Milk | Cheese | Butter |

Data Used in This Analysis:

The dataset consists of categorical variables related to football players, including positions, outfitter brands, and dominant foot.
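One way to turn categorical player records into transactions is to treat each "Column_value" pair as an item, which matches item names such as Position_Goalkeeper and Foot_hand in the results. The sketch below assumes that encoding; the records, column names, and the helper `to_transaction` are hypothetical and not taken from the actual dataset or code.

```python
# Hypothetical player records, for illustration only.
players = [
    {"Position": "Goalkeeper", "Foot": "hand", "Outfitter": "Nike"},
    {"Position": "Left-Back", "Foot": "left", "Outfitter": None},
    {"Position": "Centre-Forward", "Foot": "right", "Outfitter": "adidas"},
]

def to_transaction(record):
    """Turn one record into a set of 'Column_value' items, skipping missing values."""
    return {f"{col}_{val}" for col, val in record.items() if val is not None}

transactions = [to_transaction(p) for p in players]

# One-hot matrix: one row per player, one boolean column per distinct item.
items = sorted(set().union(*transactions))
one_hot = [[item in t for item in items] for t in transactions]

print(items)
for row in one_hot:
    print(row)
```

Skipping missing values keeps placeholder items such as a "no outfitter" marker out of the rules; whether the original analysis did the same is not stated.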

Snapshot of the dataset:

Dataset used for ARM


Results

After running the Apriori algorithm, we extracted key association rules based on support, confidence, and lift.

Top 15 Rules by Lift

| Antecedents | Consequents | Support | Confidence | Lift |
|---|---|---|---|---|
| Position_Goalkeeper | Foot_hand | 0.102 | 1.000 | 9.768 |
| Foot_hand | Position_Goalkeeper | 0.102 | 1.000 | 9.768 |
| Position_Left-Back | Foot_left | 0.070 | 0.965 | 4.026 |
| Position_Right-Back | Foot_right | 0.072 | 0.978 | 1.540 |
| Position_Defensive Midfield | Foot_right | 0.069 | 0.842 | 1.327 |
| Outfitter_Nike, Position_Centre-Forward | Foot_right | 0.013 | 0.834 | 1.314 |
| Position_Centre-Forward | Foot_right | 0.116 | 0.831 | 1.308 |
| Position_Centre-Forward, Outfitter_adidas | Foot_right | 0.011 | 0.829 | 1.306 |
| Position_Centre-Back, Club_name_Without Club | Foot_right | 0.012 | 0.814 | 1.282 |
| Club_name_Without Club, Position_Centre-Forward | Foot_right | 0.011 | 0.806 | 1.269 |
| Outfitter_adidas, Position_Central Midfield | Foot_right | 0.012 | 0.798 | 1.257 |
| Position_Central Midfield | Foot_right | 0.091 | 0.785 | 1.236 |
| Position_Left Winger | Foot_right | 0.060 | 0.777 | 1.224 |
| Position_Centre-Back | Foot_right | 0.122 | 0.715 | 1.127 |
| Position_Centre-Back, Outfitter_Nike | Foot_right | 0.011 | 0.680 | 1.071 |
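Because lift equals confidence divided by the support of the consequent, each row also reveals the baseline frequency of its consequent. The sketch below recovers that baseline from a few rows copied from the table above; the helper name `implied_consequent_support` is illustrative.

```python
def implied_consequent_support(confidence, lift):
    """Rearranged lift definition: support(Y) = confidence(X -> Y) / lift(X -> Y)."""
    return confidence / lift

# (antecedent, consequent, confidence, lift) copied from the table above.
rows = [
    ("Position_Goalkeeper", "Foot_hand", 1.000, 9.768),
    ("Position_Left-Back", "Foot_left", 0.965, 4.026),
    ("Position_Right-Back", "Foot_right", 0.978, 1.540),
    ("Position_Centre-Back", "Foot_right", 0.715, 1.127),
]
for antecedent, consequent, conf, lift in rows:
    baseline = implied_consequent_support(conf, lift)
    print(f"{consequent}: implied support {baseline:.3f} (from {antecedent})")
```

The two Foot_right rows give nearly the same baseline (roughly 0.63), which is why even a 97.8% confidence for right-backs yields a lift of only 1.54: most players in the dataset are right-footed to begin with.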

Top 15 Rules by Support

| Antecedents | Consequents | Support | Confidence | Lift |
|---|---|---|---|---|
| Position_Centre-Back | Foot_right | 0.122 | 0.715 | 1.127 |
| Position_Centre-Forward | Foot_right | 0.116 | 0.831 | 1.308 |
| Position_Goalkeeper | Foot_hand | 0.102 | 1.000 | 9.768 |
| Foot_hand | Position_Goalkeeper | 0.102 | 1.000 | 9.768 |
| Position_Central Midfield | Foot_right | 0.091 | 0.785 | 1.236 |
| Position_Right-Back | Foot_right | 0.072 | 0.978 | 1.540 |
| Position_Left-Back | Foot_left | 0.070 | 0.965 | 4.026 |
| Position_Defensive Midfield | Foot_right | 0.069 | 0.842 | 1.327 |
| Outfitter_Nike | Foot_right | 0.067 | 0.659 | 1.038 |
| Position_Left Winger | Foot_right | 0.060 | 0.777 | 1.224 |
| Outfitter_adidas | Foot_right | 0.059 | 0.642 | 1.011 |
| Club_name_Without Club | Foot_right | 0.057 | 0.656 | 1.033 |
| Position_Attacking Midfield | Foot_right | 0.047 | 0.668 | 1.052 |
| Position_Right Winger | Foot_right | 0.041 | 0.560 | 0.882 |
| Outfitter_Puma | Foot_right | 0.020 | 0.639 | 1.006 |

Top 15 Rules by Confidence

| Antecedents | Consequents | Support | Confidence | Lift |
|---|---|---|---|---|
| Position_Goalkeeper | Foot_hand | 0.102 | 1.000 | 9.768 |
| Foot_hand | Position_Goalkeeper | 0.102 | 1.000 | 9.768 |
| Position_Right-Back | Foot_right | 0.072 | 0.978 | 1.540 |
| Position_Left-Back | Foot_left | 0.070 | 0.965 | 4.026 |
| Position_Defensive Midfield | Foot_right | 0.069 | 0.842 | 1.327 |
| Outfitter_Nike, Position_Centre-Forward | Foot_right | 0.013 | 0.834 | 1.314 |
| Position_Centre-Forward | Foot_right | 0.116 | 0.831 | 1.308 |
| Position_Centre-Forward, Outfitter_adidas | Foot_right | 0.011 | 0.829 | 1.306 |
| Position_Centre-Back, Club_name_Without Club | Foot_right | 0.012 | 0.814 | 1.282 |
| Club_name_Without Club, Position_Centre-Forward | Foot_right | 0.011 | 0.806 | 1.269 |
| Outfitter_adidas, Position_Central Midfield | Foot_right | 0.012 | 0.798 | 1.257 |
| Position_Central Midfield | Foot_right | 0.091 | 0.785 | 1.236 |
| Position_Left Winger | Foot_right | 0.060 | 0.777 | 1.224 |
| Position_Centre-Back | Foot_right | 0.122 | 0.715 | 1.127 |
| Position_Centre-Back, Outfitter_Nike | Foot_right | 0.011 | 0.680 | 1.071 |

Thresholds Used:

  • Minimum Support: 0.01
  • Minimum Confidence: 0.5
  • Minimum Lift: 1.2
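A sketch of how these three thresholds could be applied as a joint filter is shown below, using a handful of rules copied from the results tables. The `Rule` type and the `passes` helper are illustrative assumptions; the page does not show at which stage each threshold was actually applied.

```python
from typing import NamedTuple

class Rule(NamedTuple):
    antecedent: str
    consequent: str
    support: float
    confidence: float
    lift: float

# A few rules copied from the results tables.
rules = [
    Rule("Position_Goalkeeper", "Foot_hand", 0.102, 1.000, 9.768),
    Rule("Position_Left-Back", "Foot_left", 0.070, 0.965, 4.026),
    Rule("Position_Centre-Forward", "Foot_right", 0.116, 0.831, 1.308),
    Rule("Position_Centre-Back", "Foot_right", 0.122, 0.715, 1.127),
    Rule("Position_Right Winger", "Foot_right", 0.041, 0.560, 0.882),
]

def passes(rule, min_support=0.01, min_confidence=0.5, min_lift=1.2):
    """True if the rule meets all three thresholds at once."""
    return (rule.support >= min_support
            and rule.confidence >= min_confidence
            and rule.lift >= min_lift)

kept = sorted((r for r in rules if passes(r)), key=lambda r: r.lift, reverse=True)
for r in kept:
    print(f"{r.antecedent} -> {r.consequent} (lift {r.lift})")
```

Under this strict reading, the Centre-Back and Right Winger rules would be dropped on lift. Since both appear in the tables, the lift threshold may have been applied only at a later stage, such as the visualization; the page does not say.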

Visualization: Association Rule Network

The network graph below visualizes the relationships found among the top 15 rules:

Apriori Visualization
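A rule network of this kind can be represented as a directed graph in which each antecedent item points to its consequent. The sketch below builds such a graph as a plain adjacency map from a few rules in the results tables. Drawing one edge per antecedent item and weighting it by lift are assumptions; how the figure was actually constructed is not shown.

```python
from collections import defaultdict

# (antecedent items, consequent, lift) copied from the results tables.
rules = [
    (("Position_Goalkeeper",), "Foot_hand", 9.768),
    (("Position_Left-Back",), "Foot_left", 4.026),
    (("Position_Right-Back",), "Foot_right", 1.540),
    (("Outfitter_Nike", "Position_Centre-Forward"), "Foot_right", 1.314),
]

# Adjacency map: source item -> {target item: highest lift seen on that edge}.
graph = defaultdict(dict)
for antecedents, consequent, lift in rules:
    for item in antecedents:
        graph[item][consequent] = max(lift, graph[item].get(consequent, 0.0))

# In-degree shows which items act as hubs in the network.
in_degree = defaultdict(int)
for source, targets in graph.items():
    for target in targets:
        in_degree[target] += 1

for node, degree in sorted(in_degree.items(), key=lambda kv: -kv[1]):
    print(node, degree)
```

With the full rule set, Foot_right would be expected to be the dominant hub, since it is the consequent of most rules in the tables.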


Conclusions

From the Association Rule Mining analysis, we found several interesting insights:

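  1. Strong Position-Foot Correlations
    • Goalkeepers are always recorded as “Foot_hand” (100% confidence in both directions), which suggests this reflects how the dataset encodes goalkeepers rather than a footedness pattern.
    • Left-backs are predominantly left-footed (96.5% confidence, lift 4.03).
    • Right-backs are predominantly right-footed (97.8% confidence, lift 1.54).
  2. Outfitter Preferences and Player Roles
    • Nike-sponsored Centre-Forwards tend to be right-footed (83.4% confidence).
    • adidas-sponsored Centre-Forwards follow a similar trend (82.9% confidence).
    • Both figures are close to the 83.1% confidence for Centre-Forwards overall, so the outfitter adds little beyond position.
  3. Clubs and Footedness
    • Among players without a club, Centre-Backs (81.4% confidence) and Centre-Forwards (80.6% confidence) are mostly right-footed.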

These patterns can serve as a starting point for player scouting, team strategy development, and sponsorship decisions.

This tab covers the full Association Rule Mining process, from methodology to results.

The code for implementing ARM can be found here
