Overview

This package is designed to allow users to extract various world football results and player statistics from popular football (soccer) data sites, including FBref, Transfermarkt, Understat and fotmob.

Installation

You can install the CRAN version of worldfootballR with:

install.packages("worldfootballR")

You can install the development version of worldfootballR from GitHub with:

# install.packages("devtools")
devtools::install_github("JaseZiv/worldfootballR")

Usage

Package vignettes have been built to help you get started with the package.

  • For functions to extract data from FBref, see here
  • For functions to extract data from Understat, see here
  • For functions to extract data from fotmob, see here
  • For functions to extract data for international matches from FBref, see here
  • For functions to load pre-scraped data, see here

This vignette covers the functions to extract data from transfermarkt.com.


Join FBref and Transfermarkt data

To join player data between FBref and Transfermarkt, player_dictionary_mapping() has been created. Over 6,100 players who have been listed for teams in the Big 5 Euro leagues on FBref since the start of the 2017-18 season are mapped together. This mapping is expected to be updated and grow over time. The raw data is stored here.

mapped_players <- player_dictionary_mapping()
dplyr::glimpse(mapped_players)
#> Rows: 6,612
#> Columns: 4
#> $ PlayerFBref <chr> "Aaron Connolly", "Aaron Cresswell", "Aarón Escandell", "A…
#> $ UrlFBref    <chr> "https://fbref.com/en/players/27c01749/Aaron-Connolly", "h…
#> $ UrlTmarkt   <chr> "https://www.transfermarkt.com/aaron-connolly/profil/spiel…
#> $ TmPos       <chr> "Centre-Forward", "Left-Back", "Goalkeeper", "Attacking Mi…
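
Once the mapping is loaded, joining comes down to matching on the URL columns. The following is a minimal sketch using two small hand-made data frames: the player names, URLs and the Url and Gls columns are placeholders standing in for real FBref output, and only the mapping column names (PlayerFBref, UrlFBref, UrlTmarkt, TmPos) come from the output above.

```r
# Toy stand-in for FBref player output (names, URLs and values are placeholders)
fbref_stats <- data.frame(
  Player = c("Player A", "Player B", "Player C"),
  Url    = c("https://example.com/fbref/a",
             "https://example.com/fbref/b",
             "https://example.com/fbref/c"),
  Gls    = c(5, 2, 0),
  stringsAsFactors = FALSE
)

# Toy stand-in for player_dictionary_mapping() output (same column names)
mapped_players <- data.frame(
  PlayerFBref = c("Player A", "Player B"),
  UrlFBref    = c("https://example.com/fbref/a", "https://example.com/fbref/b"),
  UrlTmarkt   = c("https://example.com/tm/a", "https://example.com/tm/b"),
  TmPos       = c("Centre-Forward", "Left-Back"),
  stringsAsFactors = FALSE
)

# Left join: keep every FBref row, add the Transfermarkt URL and position where mapped
joined <- merge(fbref_stats, mapped_players,
                by.x = "Url", by.y = "UrlFBref", all.x = TRUE)

# Players without a mapping end up with NA in UrlTmarkt
unmapped <- joined$Player[is.na(joined$UrlTmarkt)]
```

Joining on the URL rather than the player name avoids mismatches caused by spelling or accent differences between the two sites.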

Transfermarkt Helper Functions

This section outlines the helper functions that find the URLs you need to pass to the Transfermarkt functions covered in this vignette.

Team URLs

To get a list of URLs for each team in a particular season from transfermarkt.com, the tm_league_team_urls() function can be used. If the country/countries aren’t available in the main data set, the function can also accept a League URL from transfermarkt.com. To get the league URL, use the filtering options towards the top of transfermarkt.com, select the country and league you want to collect data from, head to that page, and copy the URL.

team_urls <- tm_league_team_urls(country_name = "England", start_year = 2020)
# if it's not a league in the stored leagues data in worldfootballR_data repo:
league_one_teams <- tm_league_team_urls(start_year = 2020, league_url = "https://www.transfermarkt.com/league-one/startseite/wettbewerb/GB3")

Player URLs

To get a list of player URLs for a particular team in transfermarkt.com, the tm_team_player_urls() function can be used.

tm_team_player_urls(team_url = "https://www.transfermarkt.com/fc-burnley/startseite/verein/1132/saison_id/2020")
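
The team URL in the example ends with the season start year. Assuming that swapping this trailing year is enough to point at another season (an assumption about Transfermarkt's URL layout, not something the package guarantees), a set of season URLs for one club could be sketched as:

```r
# Team URL taken from the example above; it ends in the season start year
team_url <- "https://www.transfermarkt.com/fc-burnley/startseite/verein/1132/saison_id/2020"

# Assumption: replacing the trailing four-digit year selects another season
seasons <- 2018:2020
season_urls <- vapply(
  seasons,
  function(yr) sub("[0-9]{4}$", as.character(yr), team_url),
  character(1)
)
season_urls
```

Each element could then be passed to tm_team_player_urls() via team_url, one URL at a time.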

Staff URLs

To get a list of staff URLs for a particular team (or teams) and staff role from transfermarkt.com, the tm_team_staff_urls() function can be used.

The staff roles that can be passed to the function via the staff_role argument are below:

  • “Manager” (this will also return caretaker managers)
  • “Assistant Manager”
  • “Goalkeeping Coach”
  • “Fitness Coach”
  • “Conditioning Coach”

# get a list of team URLs for the EPL 2021/22 season
epl_teams <- tm_league_team_urls(country_name = "England", start_year = 2021)
# get all EPL managers for the 2021/22 season
epl_managers <- tm_team_staff_urls(team_urls = epl_teams, staff_role = "Manager")

# get all EPL goalkeeping coaches for the 2021/22 season
epl_gk_coaches <- tm_team_staff_urls(team_urls = epl_teams, staff_role = "Goalkeeping Coach")

League Season-Level Data

This section will cover the functions to aid in the extraction of season team statistics and information for whole leagues.

League Table by Matchdays

To be able to extract league tables for select matchday(s), the below function can be used.

The function can accept either the country name, season start year and matchday number(s), or for leagues not contained in the worldfootballR_data repository, it can accept the league URL, season start year and matchday number(s).

#----- to get the EPL table after matchday 1 of the 20/21 season: -----#
epl_matchday_1_table <- tm_matchday_table(country_name="England", start_year="2020", matchday=1)
dplyr::glimpse(epl_matchday_1_table)
#> Rows: 40
#> Columns: 13
#> $ country  <chr> "England", "England", "England", "England", "England", "Engla…
#> $ league   <chr> "Games of Matchday 1 Premier League 20/21 ", "Games of Matchd…
#> $ matchday <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
#> $ rk       <int> 1, 1, 3, 4, 4, 4, 7, 8, 8, 8, 11, 12, 12, 12, 15, 16, 16, 16,…
#> $ squad    <chr> "Arsenal", "Leicester", "Chelsea", "Newcastle", "Wolves", "Ma…
#> $ p        <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
#> $ w        <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1…
#> $ d        <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
#> $ l        <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0…
#> $ gf       <dbl> 3, 3, 3, 2, 2, 2, 4, 1, 1, 1, 3, 0, 0, 0, 1, 0, 0, 0, 0, 0, 3…
#> $ ga       <dbl> 0, 0, 1, 0, 0, 0, 3, 0, 0, 0, 4, 1, 1, 1, 3, 2, 2, 2, 3, 3, 0…
#> $ g_diff   <dbl> 3, 3, 2, 2, 2, 2, 1, 1, 1, 1, -1, -1, -1, -1, -2, -2, -2, -2,…
#> $ pts      <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3…

# #----- to get the EPL table after each matchday from matchday 1 to matchday 35 of the 20/21 season: -----#
# epl_matchday_1to35_table <- tm_matchday_table(country_name="England", start_year="2020", matchday=c(1:35))

#----- to get the League One table after each matchday from matchday 1 to matchday 5 of the 20/21 season: -----#
league_one_matchday_1_table <- tm_matchday_table(start_year="2020", matchday=1:5,
                                                 league_url="https://www.transfermarkt.com/league-one/startseite/wettbewerb/GB3")
dplyr::glimpse(league_one_matchday_1_table)
#> Rows: 240
#> Columns: 13
#> $ country  <chr> "England", "England", "England", "England", "England", "Engla…
#> $ league   <chr> "Games of Matchday 1 League One 20/21 ", "Games of Matchday 1…
#> $ matchday <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
#> $ rk       <int> 1, 2, 2, 2, 2, 2, 7, 8, 9, 9, 11, 11, 11, 11, 15, 15, 17, 18,…
#> $ squad    <chr> "Swindon Town", "Lincoln City", "Charlton", "Ipswich", "Accri…
#> $ p        <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
#> $ w        <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
#> $ d        <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0…
#> $ l        <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1…
#> $ gf       <dbl> 3, 2, 2, 2, 2, 2, 2, 1, 2, 2, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0…
#> $ ga       <dbl> 1, 0, 0, 0, 0, 0, 1, 0, 2, 2, 1, 1, 1, 1, 0, 0, 2, 1, 3, 2, 2…
#> $ g_diff   <dbl> 2, 2, 2, 2, 2, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, -1, -1, -2, -…
#> $ pts      <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0…
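
When several matchdays are requested, the result is one long table with a matchday column, so it usually needs filtering or reshaping before use. A minimal sketch, using a made-up three-team table that borrows only the column names matchday, squad, rk and pts from the output above:

```r
# Toy stand-in for tm_matchday_table() output over two matchdays (made-up values)
matchday_table <- data.frame(
  matchday = c(1, 1, 1, 2, 2, 2),
  squad    = c("Team A", "Team B", "Team C", "Team A", "Team B", "Team C"),
  rk       = c(1, 2, 3, 2, 1, 3),
  pts      = c(3, 1, 0, 3, 4, 0),
  stringsAsFactors = FALSE
)

# League leader after each matchday
leaders <- matchday_table[matchday_table$rk == 1, c("matchday", "squad", "pts")]

# One row per squad and one rank column per matchday (rk.1, rk.2)
rank_by_matchday <- reshape(
  matchday_table[, c("squad", "matchday", "rk")],
  idvar = "squad", timevar = "matchday", direction = "wide"
)

# Positive values mean the squad moved up the table between matchday 1 and 2
rank_by_matchday$change <- rank_by_matchday$rk.1 - rank_by_matchday$rk.2
```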

League Debutants

To be able to extract all debutants making either their league or professional debut, the tm_league_debutants() function can be used.

To see all league debutants (regardless of their professional status), set debut_type = "league", while setting debut_type = "pro" will only return debutants in the selected league who are making their professional debuts.

The debut_start_year and debut_end_year arguments set the range of seasons for which debutant data is required. As with all Transfermarkt functions, the season value is the starting year of the season, i.e. for the 2021-22 LaLiga season, this value is set to 2021.

# LaLiga players making their LaLiga debut in 2021/2022
laliga_debutants <- tm_league_debutants(country_name = "Spain", debut_type = "league", debut_start_year = 2021, debut_end_year = 2021)
dplyr::glimpse(laliga_debutants)
#> Rows: 200
#> Columns: 20
#> $ comp_name           <chr> "LaLiga", "LaLiga", "LaLiga", "LaLiga", "LaLiga", …
#> $ country             <chr> "Spain", "Spain", "Spain", "Spain", "Spain", "Spai…
#> $ comp_url            <chr> "https://www.transfermarkt.com/laliga/startseite/w…
#> $ player_name         <chr> "Gavi", "Tomás Mendes", "Cristhian Mosquera", "Ili…
#> $ player_url          <chr> "https://www.transfermarkt.com/gavi/profil/spieler…
#> $ position            <chr> "Central Midfield", "Central Midfield", "Centre-Ba…
#> $ nationality         <chr> "Spain", "Spain", "Spain", "Spain", "Canada", "Spa…
#> $ second_nationality  <chr> NA, "Guinea-Bissau", "Colombia", "Morocco", "Unite…
#> $ debut_for           <chr> "FC Barcelona", "Deportivo Alavés", "Valencia CF",…
#> $ debut_date          <date> 2021-08-29, 2022-05-07, 2022-01-19, 2021-11-20, 2…
#> $ opponent            <chr> "Getafe CF", "Celta de Vigo", "Sevilla FC", "RCD E…
#> $ goals_for           <dbl> 2, 0, 1, 1, 0, 3, 0, 1, 1, 1, 3, 1, 1, 3, 2, 3, 4,…
#> $ goals_against       <dbl> 1, 4, 1, 0, 0, 1, 2, 1, 1, 1, 0, 1, 0, 4, 0, 0, 3,…
#> $ age_debut           <chr> "17 years 24 days", "17 years 05 months 16 days", …
#> $ value_at_debut      <dbl> NA, NA, NA, 1.0e+05, NA, NA, NA, 5.0e+05, 8.0e+06,…
#> $ player_market_value <dbl> 7.0e+07, 5.0e+04, 1.0e+06, 3.0e+06, 1.0e+06, 1.0e+…
#> $ appearances         <dbl> 34, 1, 6, 2, 1, 1, 3, 5, 6, 2, 1, 1, 1, 11, 14, 15…
#> $ goals               <dbl> 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,…
#> $ minutes_played      <dbl> 2326, 8, 209, 126, 0, 90, 79, 187, 189, 111, 10, 4…
#> $ debut_type          <chr> "league", "league", "league", "league", "league", …

# English League One players making their PRO debuts in 2021/2022
league_one_PRO_debutants <- tm_league_debutants(country_name = "", league_url = "https://www.transfermarkt.com/league-one/startseite/wettbewerb/GB3", debut_type = "pro", debut_start_year = 2021, debut_end_year = 2021)
dplyr::glimpse(league_one_PRO_debutants)
#> Rows: 167
#> Columns: 20
#> $ comp_name           <chr> "League One", "League One", "League One", "League …
#> $ country             <chr> "England", "England", "England", "England", "Engla…
#> $ comp_url            <chr> "https://www.transfermarkt.com/league-one/profideb…
#> $ player_name         <chr> "Joseph Gbode", "Kai Yearn", "Will Jenkins-Davies"…
#> $ player_url          <chr> "https://www.transfermarkt.com/joseph-gbode/profil…
#> $ position            <chr> "midfield", "Attacking Midfield", "Central Midfiel…
#> $ nationality         <chr> "England", "England", "Wales", "England", "England…
#> $ second_nationality  <chr> NA, NA, "England", NA, NA, NA, NA, NA, NA, NA, NA,…
#> $ debut_for           <chr> "Gillingham FC", "Cambridge United", "Plymouth Arg…
#> $ debut_date          <date> 2021-11-20, 2022-04-30, 2021-11-13, 2021-11-23, 2…
#> $ opponent            <chr> "Crewe Alexandra", "Cheltenham Town", "Accrington …
#> $ goals_for           <dbl> 0, 2, 4, 0, 1, 1, 3, 1, 0, 1, 2, 1, 2, 1, 1, 0, 0,…
#> $ goals_against       <dbl> 2, 2, 1, 2, 1, 2, 4, 1, 2, 1, 1, 2, 1, 4, 1, 2, 2,…
#> $ age_debut           <chr> "16 years 07 months 12 days", "16 years 11 months …
#> $ value_at_debut      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ player_market_value <dbl> NA, NA, 100000, NA, NA, NA, NA, NA, NA, NA, 300000…
#> $ appearances         <dbl> 2, 1, 1, 1, 8, 2, 1, 1, 19, 1, 36, 3, 17, 1, 5, 2,…
#> $ goals               <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0,…
#> $ minutes_played      <dbl> 16, 2, 0, 4, 316, 8, 83, 13, 1552, 90, 3097, 78, 1…
#> $ debut_type          <chr> "pro", "pro", "pro", "pro", "pro", "pro", "pro", "…
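
The age_debut column is returned as text (for example "17 years 05 months 16 days"), which cannot be sorted or averaged directly. The helper below, parse_age_debut(), is a hypothetical function that is not part of worldfootballR; it sketches one way to turn these strings into approximate decimal years, assuming 12 months per year and 365 days per year.

```r
# Convert strings like "17 years 05 months 16 days" to approximate decimal years
parse_age_debut <- function(x) {
  # Pull the number in front of a unit; a unit that is absent counts as 0
  grab <- function(unit) {
    hits <- regmatches(x, regexec(paste0("([0-9]+) ", unit), x))
    vapply(hits, function(h) if (length(h) == 2) as.numeric(h[2]) else 0, numeric(1))
  }
  out <- grab("years?") + grab("months?") / 12 + grab("days?") / 365
  out[is.na(x)] <- NA_real_  # keep missing ages missing
  out
}

# Example strings copied from the age_debut output above
age_debut <- c("17 years 24 days", "17 years 05 months 16 days", "16 years 07 months 12 days")
round(parse_age_debut(age_debut), 2)
#> [1] 17.07 17.46 16.62
```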

Expiring Contracts

To be able to extract a list of all players whose contracts expire in a selected year, the tm_expiring_contracts() function can be used.

Set the contract_end_year to be equal to the calendar year the contracts are due to expire.

#----- LaLiga players with expiring contracts in 2023: -----#
laliga_expiring <- tm_expiring_contracts(country_name = "Spain", contract_end_year = 2023)
dplyr::glimpse(laliga_expiring)
#> Rows: 111
#> Columns: 14
#> $ comp_name           <chr> "LaLiga", "LaLiga", "LaLiga", "LaLiga", "LaLiga", …
#> $ country             <chr> "Spain", "Spain", "Spain", "Spain", "Spain", "Spai…
#> $ comp_url            <chr> "https://www.transfermarkt.com/laliga/startseite/w…
#> $ player_name         <chr> "José Gayà", "Thomas Lemar", "Marco Asensio", "Kar…
#> $ player_url          <chr> "https://www.transfermarkt.com/jose-gaya/profil/sp…
#> $ position            <chr> "Left-Back", "Attacking Midfield", "Right Winger",…
#> $ nationality         <chr> "Spain", "France", "Spain", "France", "Spain", "Ne…
#> $ second_nationality  <chr> NA, "Guadeloupe", "Netherlands", "Algeria", NA, "G…
#> $ current_club        <chr> "Valencia CF", "Atlético de Madrid", "Real Madrid"…
#> $ contract_expiry     <date> 2023-06-30, 2023-06-30, 2023-06-30, 2023-06-30, 2…
#> $ contract_option     <chr> "-", "-", "-", "-", "-", "-", "-", "-", "-", "-", …
#> $ player_market_value <dbl> 3.5e+07, 3.5e+07, 3.5e+07, 3.0e+07, 2.5e+07, 2.5e+…
#> $ transfer_fee        <dbl> NA, 72000000, 3900000, 35000000, NA, 0, 25000000, …
#> $ agent               <chr> "Toldra Consulting S.L.", "Kemari", "Gestifute", "…

#----- Can even do it for non-standard leagues - English League One players with expiring contracts in 2023: -----#
# league_one_expiring <- tm_expiring_contracts(country_name = "",
#                                                contract_end_year = 2023,
#                                                league_url = "https://www.transfermarkt.com/league-one/startseite/wettbewerb/GB3")
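
Because contract_expiry is a date column, simple date arithmetic works on it directly. The sketch below rebuilds three rows from the LaLiga output above by hand so it runs without scraping; the reference date and the derived column names (days_left, value_to_fee) are arbitrary choices for illustration.

```r
# Three rows copied from the glimpse() output above
laliga_expiring <- data.frame(
  player_name         = c("José Gayà", "Thomas Lemar", "Marco Asensio"),
  contract_expiry     = as.Date(c("2023-06-30", "2023-06-30", "2023-06-30")),
  player_market_value = c(3.5e7, 3.5e7, 3.5e7),
  transfer_fee        = c(NA, 72000000, 3900000),
  stringsAsFactors = FALSE
)

# Days left on each contract, measured from an arbitrary reference date
reference_date <- as.Date("2023-01-01")
laliga_expiring$days_left <- as.numeric(laliga_expiring$contract_expiry - reference_date)

# Market value relative to the fee paid; NA where no fee is recorded
laliga_expiring$value_to_fee <-
  laliga_expiring$player_market_value / laliga_expiring$transfer_fee
```

Note that a transfer_fee of 0, which does appear in the full output, would give Inf here and may need separate handling.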

League Injuries

To get a list of all reported current injuries for a selected league, use the tm_league_injuries() function:

# to get all current injuries for LaLiga
laliga_injuries <- tm_league_injuries(country_name = "Spain")
dplyr::glimpse(laliga_injuries)
#> Rows: 40
#> Columns: 14
#> $ comp_name           <chr> "LaLiga", "LaLiga", "LaLiga", "LaLiga", "LaLiga", …
#> $ country             <chr> "Spain", "Spain", "Spain", "Spain", "Spain", "Spai…
#> $ comp_url            <chr> "https://www.transfermarkt.com/laliga/startseite/w…
#> $ player_name         <chr> "Enes Ünal", "Sergio Reguilón", "Mikel Balenziaga"…
#> $ player_url          <chr> "https://www.transfermarkt.com/enes-unal/profil/sp…
#> $ position            <chr> "Centre-Forward", "Left-Back", "Left-Back", "Left …
#> $ current_club        <chr> "Getafe CF", "Atlético de Madrid", "Athletic Bilba…
#> $ age                 <dbl> 25, 25, 34, 26, 31, 22, 29, 27, 25, 21, 25, 28, 29…
#> $ nationality         <chr> "Turkey", "Spain", "Spain", "Senegal", "Montenegro…
#> $ second_nationality  <chr> NA, NA, NA, "Spain", "Serbia", NA, NA, NA, "Nigeri…
#> $ injury              <chr> "Unknown Injury", "Pubalgia", "Ankle Injury", "Fin…
#> $ injured_since       <date> 2022-09-22, 2022-08-21, 2022-08-07, 2022-08-10, 2…
#> $ injured_until       <date> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ player_market_value <dbl> 2.5e+07, 2.5e+07, 1.2e+06, 2.0e+06, 1.4e+07, 6.0e+…

#----- Can even do it for non-standard leagues - get all current injuries for League One in England
# league_one_injuries <- tm_league_injuries(country_name = "",
#                                                league_url = "https://www.transfermarkt.com/league-one/startseite/wettbewerb/GB3")

Team Data

This section will cover the functions to get team-level data from Transfermarkt.

Transfer activity by team

To get all the arrivals and departures for a team (or teams) in a season, along with data regarding each transfer (transfer value, contract length, where they came from/went to, etc), the tm_team_transfers() function can be used. The transfer_window argument can be set to "summer", "winter" or "all" (both windows):

#----- for one team: -----#
bayern <- tm_team_transfers(team_url = "https://www.transfermarkt.com/fc-bayern-munchen/startseite/verein/27/saison_id/2020", transfer_window = "all")
dplyr::glimpse(bayern)
#> Rows: 25
#> Columns: 21
#> $ team_name          <chr> "Bayern Munich", "Bayern Munich", "Bayern Munich", …
#> $ league             <chr> "Bundesliga", "Bundesliga", "Bundesliga", "Bundesli…
#> $ country            <chr> "Germany", "Germany", "Germany", "Germany", "German…
#> $ season             <chr> "2020", "2020", "2020", "2020", "2020", "2020", "20…
#> $ transfer_type      <chr> "Arrivals", "Arrivals", "Arrivals", "Arrivals", "Ar…
#> $ player_name        <chr> "Leroy Sané", "Marc Roca", "Bouna Sarr", "Douglas C…
#> $ player_url         <chr> "https://www.transfermarkt.com/leroy-sane/profil/sp…
#> $ player_position    <chr> "Left Winger", "Defensive Midfield", "Right Midfiel…
#> $ player_age         <chr> "24", "23", "28", "30", "18", "23", "31", "19", "19…
#> $ player_nationality <chr> "Germany", "Spain", "Senegal", "Brazil", "France", …
#> $ club_2             <chr> "Man City", "Espanyol", "Marseille", "Juventus", "P…
#> $ league_2           <chr> "Premier League", "LaLiga2", "Ligue 1", "Serie A", …
#> $ country_2          <chr> "England", "Spain", "France", "Italy", "France", "G…
#> $ transfer_fee       <dbl> 6.0e+07, 9.0e+06, 8.0e+06, 2.5e+05, 0.0e+00, 0.0e+0…
#> $ is_loan            <lgl> FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, TRU…
#> $ transfer_notes     <chr> NA, NA, NA, NA, NA, NA, NA, "loan transfer", "-", "…
#> $ window             <chr> "Summer", "Summer", "Summer", "Summer", "Summer", "…
#> $ in_squad           <dbl> 45, 37, 38, 26, 11, 40, 42, 14, 12, 16, 50, 3, NA, …
#> $ appearances        <dbl> 44, 11, 15, 20, 6, 4, 32, 2, 5, 7, 37, 0, NA, 0, 1,…
#> $ goals              <dbl> 10, 0, 0, 1, 0, 0, 9, 0, 0, 0, 7, 0, NA, 0, 0, 0, N…
#> $ minutes_played     <dbl> 2627, 514, 810, 662, 107, 360, 1312, 69, 102, 230, …

#----- or for multiple teams: -----#
# team_urls <- tm_league_team_urls(country_name = "England", start_year = 2020)
# epl_xfers_2020 <- tm_team_transfers(team_url = team_urls, transfer_window = "all")
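
A common follow-up is a net spend figure per team. The following is a rough sketch on a made-up data frame that reuses the column names team_name, transfer_type, transfer_fee and is_loan from the output above; the fees are invented, and treating a missing fee as 0 is a simplifying assumption.

```r
# Toy stand-in for tm_team_transfers() output (made-up fees)
transfers <- data.frame(
  team_name     = c("Team A", "Team A", "Team A", "Team A"),
  transfer_type = c("Arrivals", "Arrivals", "Departures", "Departures"),
  transfer_fee  = c(6.0e7, 2.5e5, 1.5e7, NA),
  is_loan       = c(FALSE, TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Sum fees per transfer type, ignoring missing fees
fees <- tapply(transfers$transfer_fee, transfers$transfer_type, sum, na.rm = TRUE)

# Net spend: money paid for arrivals minus money received for departures
net_spend <- unname(fees["Arrivals"] - fees["Departures"])
```

Loan fees are included in this total; filtering on is_loan first would exclude them.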

Squad Player Stats

To get basic statistics (goals, appearances, minutes played, etc) for all games played by players for a squad season, the tm_squad_stats() function can be used:

#----- for one team: -----#
bayern <- tm_squad_stats(team_url = "https://www.transfermarkt.com/fc-bayern-munchen/startseite/verein/27/saison_id/2020")
dplyr::glimpse(bayern)
#> Rows: 45
#> Columns: 12
#> $ team_name      <chr> "Bayern Munich", "Bayern Munich", "Bayern Munich", "Bay…
#> $ league         <chr> "Bundesliga", "Bundesliga", "Bundesliga", "Bundesliga",…
#> $ country        <chr> "Germany", "Germany", "Germany", "Germany", "Germany", …
#> $ player_name    <chr> "Manuel Neuer", "Alexander Nübel", "Sven Ulreich", "Ron…
#> $ player_url     <chr> "https://www.transfermarkt.com/manuel-neuer/profil/spie…
#> $ player_pos     <chr> "Goalkeeper", "Goalkeeper", "Goalkeeper", "Goalkeeper",…
#> $ player_age     <dbl> 34, 23, 31, 21, 18, 19, 28, 24, 24, 24, 20, 18, 20, 31,…
#> $ nationality    <chr> "Germany", "Germany", "Germany", "Germany", "Germany", …
#> $ in_squad       <dbl> 47, 40, 3, 12, 4, 39, 46, 46, 40, 44, 16, 11, 4, 47, 3,…
#> $ appearances    <dbl> 46, 4, 0, 0, 0, 35, 45, 37, 33, 36, 7, 6, 1, 39, 1, 0, …
#> $ goals          <dbl> 0, 0, 0, 0, 0, 1, 2, 1, 2, 1, 0, 0, 0, 2, 0, 0, 0, 0, 6…
#> $ minutes_played <dbl> 4200, 360, 0, 0, 0, 2589, 3793, 2705, 2517, 2925, 230, …

#----- or for multiple teams: -----#
# team_urls <- tm_league_team_urls(country_name = "England", start_year = 2020)
# epl_team_players_2020 <- tm_squad_stats(team_url = team_urls)
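
Rate statistics can be derived from the returned columns, but squad members with no appearances need guarding against division by zero. A minimal sketch using three rows rebuilt by hand from the Bayern output above (mins_per_app is an illustrative column name, not one the package returns):

```r
# Three rows copied from the glimpse() output above
squad <- data.frame(
  player_name    = c("Manuel Neuer", "Alexander Nübel", "Sven Ulreich"),
  appearances    = c(46, 4, 0),
  goals          = c(0, 0, 0),
  minutes_played = c(4200, 360, 0),
  stringsAsFactors = FALSE
)

# Minutes per appearance; players who never appeared get NA instead of NaN
squad$mins_per_app <- ifelse(squad$appearances > 0,
                             squad$minutes_played / squad$appearances,
                             NA_real_)
```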

Player Valuations

To get player valuations for all teams in a league season, use the tm_player_market_values() function:

#----- Can do it for a single league: -----#
a_league_valuations <- tm_player_market_values(country_name = "Australia",
                                       start_year = 2021)
dplyr::glimpse(a_league_valuations)
#> Rows: 383
#> Columns: 19
#> $ comp_name                <chr> "A-League Men", "A-League Men", "A-League Men…
#> $ region                   <chr> "Asia", "Asia", "Asia", "Asia", "Asia", "Asia…
#> $ country                  <chr> "Australia", "Australia", "Australia", "Austr…
#> $ season_start_year        <int> 2021, 2021, 2021, 2021, 2021, 2021, 2021, 202…
#> $ squad                    <chr> "Melbourne City FC", "Melbourne City FC", "Me…
#> $ player_num               <chr> "47", "-", "1", "-", "-", "22", "4", "-", "36…
#> $ player_name              <chr> "Jordi Valadon", "Luke Oresti", "Tom Glover",…
#> $ player_position          <chr> "midfield", "midfield", "Goalkeeper", "Goalke…
#> $ player_dob               <date> 2003-03-04, 2003-06-30, 1997-12-24, 2000-03-…
#> $ player_age               <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ player_nationality       <chr> "Australia", "Australia", "Australia", "Austr…
#> $ current_club             <chr> "Melbourne City FC U21", "Melbourne City FC U…
#> $ player_height_mtrs       <chr> NA, NA, "1.96", "1.8", NA, "1.87", "1.82", "1…
#> $ player_foot              <chr> " ", " ", " ", " ", "right", "left", "right",…
#> $ date_joined              <chr> NA, NA, "2019-08-06", "2020-09-10", "2020-12-…
#> $ joined_from              <chr> NA, NA, "Tottenham Hotspur U23", "Melbourne V…
#> $ contract_expiry          <chr> NA, NA, "2023-06-30", "2023-06-30", "2024-06-…
#> $ player_market_value_euro <dbl> 25000, NA, 750000, 150000, 50000, 550000, 500…
#> $ player_url               <chr> "https://www.transfermarkt.com/jordi-valadon/…

#----- Can also do it for multiple leagues: -----#
# big_5_valuations <- tm_player_market_values(country_name = c("England", "Spain", "France", "Italy", "Germany"),
#                                        start_year = 2021)

#----- Can also do it for non standard leagues: -----#
# league_one_valuations <- tm_player_market_values(country_name = "",
#                                        start_year = 2021,
#                                        league_url = "https://www.transfermarkt.com/league-one/startseite/wettbewerb/GB3")
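
In the output above, player_age is NA for the rows shown while player_dob is populated. If an age is needed, it can be derived from the date of birth. The sketch below uses three rows rebuilt by hand from that output; age_on() is a hypothetical helper, and the reference date of 1 July 2021 is an arbitrary assumption.

```r
# Three rows copied from the glimpse() output above
valuations <- data.frame(
  player_name = c("Jordi Valadon", "Luke Oresti", "Tom Glover"),
  player_dob  = as.Date(c("2003-03-04", "2003-06-30", "1997-12-24")),
  player_market_value_euro = c(25000, NA, 750000),
  stringsAsFactors = FALSE
)

# Age in completed years on a reference date
age_on <- function(dob, ref) {
  d <- as.POSIXlt(dob)
  r <- as.POSIXlt(ref)
  # Birthday not yet reached in the reference year
  not_yet <- (r$mon < d$mon) | (r$mon == d$mon & r$mday < d$mday)
  (r$year - d$year) - as.integer(not_yet)
}

valuations$age <- age_on(valuations$player_dob, as.Date("2021-07-01"))

# Total listed value, skipping players without a valuation
total_value <- sum(valuations$player_market_value_euro, na.rm = TRUE)
```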

Player Data

This section will cover the functions available to aid in the extraction of player data.

Player Bios

To get information about a player, like their age, foot, where they were born, who they play for, their contract details, social media accounts and a whole lot more, use the tm_player_bio() function:

#----- for a single player: -----#
hazard_bio <- tm_player_bio(player_urls = "https://www.transfermarkt.com/eden-hazard/profil/spieler/50202")
dplyr::glimpse(hazard_bio)
#> Rows: 1
#> Columns: 21
#> $ player_name               <chr> "Eden Hazard"
#> $ name_in_home_country      <chr> "Eden Michael Hazard"
#> $ date_of_birth             <chr> "1991-01-07"
#> $ place_of_birth            <chr> "La Louvière"
#> $ age                       <chr> "31"
#> $ height                    <dbl> 1.75
#> $ citizenship               <chr> "Belgium"
#> $ position                  <chr> "attack - Left Winger"
#> $ foot                      <chr> "right"
#> $ player_agent              <chr> "Relatives"
#> $ current_club              <chr> "Real Madrid"
#> $ joined                    <chr> "2019-07-01"
#> $ contract_expires          <chr> "2024-06-30"
#> $ outfitter                 <chr> "Nike"
#> $ twitter                   <chr> "http://twitter.com/hazardeden10"
#> $ facebook                  <chr> "http://www.facebook.com/edenhazard/"
#> $ instagram                 <chr> "http://www.instagram.com/hazardeden_10/"
#> $ player_valuation          <dbl> 1.2e+07
#> $ max_player_valuation      <dbl> 1.5e+08
#> $ max_player_valuation_date <chr> "2018-10-17"
#> $ URL                       <chr> "https://www.transfermarkt.com/eden-hazard/p…

#----- for multiple players: -----#
# # can make use of a tm helper function:
# burnley_player_urls <- tm_team_player_urls(team_url = "https://www.transfermarkt.com/fc-burnley/startseite/verein/1132/saison_id/2020")
# # then pass all those URLs to the tm_player_bio
# burnley_bios <- tm_player_bio(player_urls = burnley_player_urls)

Player Injury History

To get the injury history of one or more players from transfermarkt, use the tm_player_injury_history() function.

#----- for a single player: -----#
hazard_injuries <- tm_player_injury_history(player_urls = "https://www.transfermarkt.com/eden-hazard/profil/spieler/50202")
dplyr::glimpse(hazard_injuries)
#> Rows: 30
#> Columns: 9
#> $ player_name    <chr> "\n                                        \n          …
#> $ player_url     <chr> "https://www.transfermarkt.com/eden-hazard/profil/spiel…
#> $ season_injured <chr> "21/22", "21/22", "21/22", "21/22", "20/21", "20/21", "…
#> $ injury         <chr> "Fissure of the fibula", "Abdominal Influenza", "Muscul…
#> $ injured_since  <date> 2022-03-25, 2021-11-19, 2021-10-08, 2021-06-28, 2021-0…
#> $ injured_until  <date> 2022-05-14, 2021-11-30, 2021-10-22, 2021-08-12, 2021-0…
#> $ duration       <chr> "50 days", "11 days", "14 days", "45 days", "2 days", "…
#> $ games_missed   <chr> "11", "3", "1", NA, "1", "8", "7", "8", "1", "6", "2", …
#> $ club           <chr> "Real Madrid", "Real Madrid", "Real Madrid", "Real Madr…

#----- for multiple players: -----#
# # can make use of a tm helper function:
# burnley_player_urls <- tm_team_player_urls(team_url = "https://www.transfermarkt.com/fc-burnley/startseite/verein/1132/saison_id/2021")
# # then pass all those URLs to tm_player_injury_history
# burnley_player_injuries <- tm_player_injury_history(player_urls = burnley_player_urls)
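
The duration and games_missed columns come back as character, so they need converting before they can be summed. A minimal sketch using five rows rebuilt by hand from the output above (days_out is an illustrative column name):

```r
# Five rows copied from the glimpse() output above
injuries <- data.frame(
  season_injured = c("21/22", "21/22", "21/22", "21/22", "20/21"),
  duration       = c("50 days", "11 days", "14 days", "45 days", "2 days"),
  games_missed   = c("11", "3", "1", NA, "1"),
  stringsAsFactors = FALSE
)

# Strip the unit from duration and convert both columns to numeric
injuries$days_out     <- as.numeric(sub(" days?$", "", injuries$duration))
injuries$games_missed <- as.numeric(injuries$games_missed)

# Total days out and games missed per season (missing game counts are ignored)
days_by_season  <- tapply(injuries$days_out, injuries$season_injured, sum)
games_by_season <- tapply(injuries$games_missed, injuries$season_injured, sum, na.rm = TRUE)
```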

Club Staff Data

From version 0.4.7, users now have the ability to get historical data for club staff from transfermarkt.

The following two functions can be used, depending on the need (in addition to the helper function tm_team_staff_urls() detailed above).

Club Staff History

You can extract all employees by role in a club’s history using tm_team_staff_history().

The list of roles that can be passed to the staff_role argument can be found here; the examples below use "Manager" and "Caretaker Manager".

# get a list of team URLs for the EPL 2021/22 season
epl_teams <- tm_league_team_urls(country_name = "England", start_year = 2021)
#----- then use the URLs to pass to the function, and select the role you wish to see results for: -----#
club_manager_history <- tm_team_staff_history(team_urls = epl_teams, staff_role = "Manager")
dplyr::glimpse(club_manager_history)
#> Rows: 556
#> Columns: 17
#> $ team_name                   <chr> "Manchester City", "Manchester City", "Man…
#> $ league                      <chr> "Premier League", "Premier League", "Premi…
#> $ country                     <chr> "England", "England", "England", "England"…
#> $ staff_role                  <chr> "Manager", "Manager", "Manager", "Manager"…
#> $ staff_name                  <chr> "Pep Guardiola", "Manuel Pellegrini", "Rob…
#> $ staff_url                   <chr> "https://www.transfermarkt.com/pep-guardio…
#> $ staff_dob                   <chr> "Jan 18, 1971", "Sep 16, 1953", "Nov 27, 1…
#> $ staff_nationality           <chr> "Spain", "Chile", "Italy", "Wales", "Swede…
#> $ staff_nationality_secondary <chr> NA, "Italy", NA, NA, NA, NA, NA, NA, NA, N…
#> $ appointed                   <date> 2016-07-01, 2013-07-01, 2009-12-19, 2008-…
#> $ end_date                    <date> NA, 2016-06-30, 2013-05-13, 2009-12-19, 2…
#> $ days_in_post                <dbl> 2278, 1095, 1241, 563, 332, 794, 1386, 118…
#> $ matches                     <dbl> 361, 166, 191, 77, 45, 97, 176, 108, 31, 0…
#> $ wins                        <dbl> 267, 101, 113, 37, 19, 34, 77, 45, 7, 0, 9…
#> $ draws                       <dbl> 41, 27, 38, 15, 11, 19, 39, 24, 9, 0, 11, …
#> $ losses                      <dbl> 53, 38, 40, 25, 15, 44, 60, 39, 15, 0, 20,…
#> $ ppg                         <dbl> 2.33, 1.99, 1.97, 1.64, 1.51, 1.25, 1.53, …

#----- can also get other roles: -----#
# club_caretaker_manager_history <- tm_team_staff_history(team_urls = epl_teams, staff_role = "Caretaker Manager")
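
The wins, draws and matches columns are enough to derive further measures, and to sanity-check ppg. The sketch below uses two rows rebuilt by hand from the output above and assumes ppg means points per game under the usual three points for a win and one for a draw; win_pct and ppg_check are illustrative column names.

```r
# Two rows copied from the glimpse() output above
managers <- data.frame(
  staff_name = c("Pep Guardiola", "Manuel Pellegrini"),
  matches    = c(361, 166),
  wins       = c(267, 101),
  draws      = c(41, 27),
  losses     = c(53, 38),
  ppg        = c(2.33, 1.99),
  stringsAsFactors = FALSE
)

# Win percentage, guarding against spells with zero matches
managers$win_pct <- ifelse(managers$matches > 0,
                           100 * managers$wins / managers$matches,
                           NA_real_)

# Points per game from first principles: 3 for a win, 1 for a draw
managers$ppg_check <- round((3 * managers$wins + managers$draws) / managers$matches, 2)
```

For these two rows the recomputed values agree with the ppg column returned by the function.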

Staff Member’s History

To be able to get all roles held by a selected staff member(s), the tm_staff_job_history() function can be used.

The function accepts one argument, staff_urls, which can be extracted using tm_team_staff_urls() explained in the helpers section above.

# get a list of team URLs for the EPL 2021/22 season
# epl_teams <- tm_league_team_urls(country_name = "England", start_year = 2021)

# get all EPL goalkeeping coaches for the 2021/22 season
epl_gk_coaches <- tm_team_staff_urls(team_urls = epl_teams[1:3], staff_role = "Goalkeeping Coach")

# then you can pass these URLs to the function and get job histories for the selected staff members
epl_gk_coach_job_histories <- tm_staff_job_history(staff_urls = epl_gk_coaches)
dplyr::glimpse(epl_gk_coach_job_histories)
#> Rows: 22
#> Columns: 23
#> $ name              <chr> "Xabier Mancisidor", "Xabier Mancisidor", "Xabier Ma…
#> $ current_club      <chr> "Man City", "Man City", "Man City", "Man City", "Man…
#> $ current_role      <chr> "Goalkeeping Coach", "Goalkeeping Coach", "Goalkeepi…
#> $ date_of_birth     <chr> "May 24, 1970", "May 24, 1970", "May 24, 1970", "May…
#> $ place_of_birth    <chr> "Pasaia", "Pasaia", "Pasaia", "Pasaia", "Pasaia", "I…
#> $ citizenship       <chr> "Spain", "Spain", "Spain", "Spain", "Spain", "Englan…
#> $ position          <chr> "Goalkeeping Coach", "Goalkeeping Coach", "Goalkeepi…
#> $ club              <chr> "Man City", "Málaga CF", "Real Madrid", "Real Socied…
#> $ appointed         <chr> "2013-07-01", "2011-01-04", "2009-06-02", "1998-10-2…
#> $ contract_expiry   <chr> NA, "2013-06-30", "2010-05-27", "2009-06-01", "1998-…
#> $ days_in_charge    <dbl> 3374, 908, 359, 3874, 113, 1548, 729, 4105, 729, 300…
#> $ matches           <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 2, NA, N…
#> $ wins              <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, NA, N…
#> $ draws             <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, N…
#> $ losses            <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, N…
#> $ players_used      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 18, NA, …
#> $ avg_goals_for     <dbl> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.…
#> $ avg_goals_against <dbl> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.…
#> $ ppm               <dbl> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.…
#> $ staff_url         <chr> "https://www.transfermarkt.com/xabier-mancisidor/pro…
#> $ coaching_licence  <chr> NA, NA, NA, NA, NA, NA, NA, "UEFA A Licence", "UEFA …
#> $ agent             <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, "Paulo Roberto C…
#> $ avg_term_as_coach <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, "0.02 Years", "0…