- Download Python 3.7 (the latest version when this guide was written) and install it on your system
- Download Anaconda Distribution for Python 3.7 and install it on your system
- Run "Anaconda Prompt" and type "jupyter notebook" in the command prompt
Importing the pandas and Matplotlib Libraries
import pandas as pd
import matplotlib.pyplot as plt

Reading the Dataset into a DataFrame
filename = 'data.csv'
data_raw = pd.read_csv(filename)

Print the imported data
data_raw

Get the first 10 rows
data_raw.head(10)

| | Unnamed: 0 | ID | Name | Age | Photo | Nationality | Flag | Overall | Potential | Club | ... | Composure | Marking | StandingTackle | SlidingTackle | GKDiving | GKHandling | GKKicking | GKPositioning | GKReflexes | Release Clause |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 158023 | L. Messi | 31 | https://cdn.sofifa.org/players/4/19/158023.png | Argentina | https://cdn.sofifa.org/flags/52.png | 94 | 94 | FC Barcelona | ... | 96.0 | 33.0 | 28.0 | 26.0 | 6.0 | 11.0 | 15.0 | 14.0 | 8.0 | €226.5M |
| 1 | 1 | 20801 | Cristiano Ronaldo | 33 | https://cdn.sofifa.org/players/4/19/20801.png | Portugal | https://cdn.sofifa.org/flags/38.png | 94 | 94 | Juventus | ... | 95.0 | 28.0 | 31.0 | 23.0 | 7.0 | 11.0 | 15.0 | 14.0 | 11.0 | €127.1M |
| 2 | 2 | 190871 | Neymar Jr | 26 | https://cdn.sofifa.org/players/4/19/190871.png | Brazil | https://cdn.sofifa.org/flags/54.png | 92 | 93 | Paris Saint-Germain | ... | 94.0 | 27.0 | 24.0 | 33.0 | 9.0 | 9.0 | 15.0 | 15.0 | 11.0 | €228.1M |
| 3 | 3 | 193080 | De Gea | 27 | https://cdn.sofifa.org/players/4/19/193080.png | Spain | https://cdn.sofifa.org/flags/45.png | 91 | 93 | Manchester United | ... | 68.0 | 15.0 | 21.0 | 13.0 | 90.0 | 85.0 | 87.0 | 88.0 | 94.0 | €138.6M |
| 4 | 4 | 192985 | K. De Bruyne | 27 | https://cdn.sofifa.org/players/4/19/192985.png | Belgium | https://cdn.sofifa.org/flags/7.png | 91 | 92 | Manchester City | ... | 88.0 | 68.0 | 58.0 | 51.0 | 15.0 | 13.0 | 5.0 | 10.0 | 13.0 | €196.4M |
| 5 | 5 | 183277 | E. Hazard | 27 | https://cdn.sofifa.org/players/4/19/183277.png | Belgium | https://cdn.sofifa.org/flags/7.png | 91 | 91 | Chelsea | ... | 91.0 | 34.0 | 27.0 | 22.0 | 11.0 | 12.0 | 6.0 | 8.0 | 8.0 | €172.1M |
| 6 | 6 | 177003 | L. Modrić | 32 | https://cdn.sofifa.org/players/4/19/177003.png | Croatia | https://cdn.sofifa.org/flags/10.png | 91 | 91 | Real Madrid | ... | 84.0 | 60.0 | 76.0 | 73.0 | 13.0 | 9.0 | 7.0 | 14.0 | 9.0 | €137.4M |
| 7 | 7 | 176580 | L. Suárez | 31 | https://cdn.sofifa.org/players/4/19/176580.png | Uruguay | https://cdn.sofifa.org/flags/60.png | 91 | 91 | FC Barcelona | ... | 85.0 | 62.0 | 45.0 | 38.0 | 27.0 | 25.0 | 31.0 | 33.0 | 37.0 | €164M |
| 8 | 8 | 155862 | Sergio Ramos | 32 | https://cdn.sofifa.org/players/4/19/155862.png | Spain | https://cdn.sofifa.org/flags/45.png | 91 | 91 | Real Madrid | ... | 82.0 | 87.0 | 92.0 | 91.0 | 11.0 | 8.0 | 9.0 | 7.0 | 11.0 | €104.6M |
| 9 | 9 | 200389 | J. Oblak | 25 | https://cdn.sofifa.org/players/4/19/200389.png | Slovenia | https://cdn.sofifa.org/flags/44.png | 90 | 93 | Atlético Madrid | ... | 70.0 | 27.0 | 12.0 | 18.0 | 86.0 | 92.0 | 78.0 | 88.0 | 89.0 | €144.5M |
10 rows × 89 columns
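The `Unnamed: 0` column in the table above duplicates the row index, which usually means the CSV was saved with its index included. A minimal sketch of one way to avoid the extra column, assuming that is the case for `data.csv`: pass `index_col=0` to `pd.read_csv`. The in-memory CSV below is a hypothetical two-row stand-in built from values shown above, not the real file.

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv: the header starts with an empty
# field, as it does when a DataFrame is saved together with its index.
csv_text = ",ID,Name\n0,158023,L. Messi\n1,20801,Cristiano Ronaldo\n"

# Default read: the unnamed first field becomes a column called "Unnamed: 0".
with_extra = pd.read_csv(io.StringIO(csv_text))

# index_col=0 tells pandas to use that first field as the row index instead.
without_extra = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(with_extra.columns))
print(list(without_extra.columns))
```

The rest of this walkthrough keeps the default read, so its outputs still list `Unnamed: 0` and 89 columns.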
List the column names
data_raw.columns

Index(['Unnamed: 0', 'ID', 'Name', 'Age', 'Photo', 'Nationality', 'Flag',
'Overall', 'Potential', 'Club', 'Club Logo', 'Value', 'Wage', 'Special',
'Preferred Foot', 'International Reputation', 'Weak Foot',
'Skill Moves', 'Work Rate', 'Body Type', 'Real Face', 'Position',
'Jersey Number', 'Joined', 'Loaned From', 'Contract Valid Until',
'Height', 'Weight', 'LS', 'ST', 'RS', 'LW', 'LF', 'CF', 'RF', 'RW',
'LAM', 'CAM', 'RAM', 'LM', 'LCM', 'CM', 'RCM', 'RM', 'LWB', 'LDM',
'CDM', 'RDM', 'RWB', 'LB', 'LCB', 'CB', 'RCB', 'RB', 'Crossing',
'Finishing', 'HeadingAccuracy', 'ShortPassing', 'Volleys', 'Dribbling',
'Curve', 'FKAccuracy', 'LongPassing', 'BallControl', 'Acceleration',
'SprintSpeed', 'Agility', 'Reactions', 'Balance', 'ShotPower',
'Jumping', 'Stamina', 'Strength', 'LongShots', 'Aggression',
'Interceptions', 'Positioning', 'Vision', 'Penalties', 'Composure',
'Marking', 'StandingTackle', 'SlidingTackle', 'GKDiving', 'GKHandling',
'GKKicking', 'GKPositioning', 'GKReflexes', 'Release Clause'],
dtype='object')

Basic information about the columns (data type, non-null count, etc.)
data_raw.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 18207 entries, 0 to 18206
Data columns (total 89 columns):
Unnamed: 0 18207 non-null int64
ID 18207 non-null int64
Name 18207 non-null object
Age 18207 non-null int64
Photo 18207 non-null object
Nationality 18207 non-null object
Flag 18207 non-null object
Overall 18207 non-null int64
Potential 18207 non-null int64
Club 17966 non-null object
Club Logo 18207 non-null object
Value 18207 non-null object
Wage 18207 non-null object
Special 18207 non-null int64
Preferred Foot 18159 non-null object
International Reputation 18159 non-null float64
Weak Foot 18159 non-null float64
Skill Moves 18159 non-null float64
Work Rate 18159 non-null object
Body Type 18159 non-null object
Real Face 18159 non-null object
Position 18147 non-null object
Jersey Number 18147 non-null float64
Joined 16654 non-null object
Loaned From 1264 non-null object
Contract Valid Until 17918 non-null object
Height 18159 non-null object
Weight 18159 non-null object
LS 16122 non-null object
ST 16122 non-null object
RS 16122 non-null object
LW 16122 non-null object
LF 16122 non-null object
CF 16122 non-null object
RF 16122 non-null object
RW 16122 non-null object
LAM 16122 non-null object
CAM 16122 non-null object
RAM 16122 non-null object
LM 16122 non-null object
LCM 16122 non-null object
CM 16122 non-null object
RCM 16122 non-null object
RM 16122 non-null object
LWB 16122 non-null object
LDM 16122 non-null object
CDM 16122 non-null object
RDM 16122 non-null object
RWB 16122 non-null object
LB 16122 non-null object
LCB 16122 non-null object
CB 16122 non-null object
RCB 16122 non-null object
RB 16122 non-null object
Crossing 18159 non-null float64
Finishing 18159 non-null float64
HeadingAccuracy 18159 non-null float64
ShortPassing 18159 non-null float64
Volleys 18159 non-null float64
Dribbling 18159 non-null float64
Curve 18159 non-null float64
FKAccuracy 18159 non-null float64
LongPassing 18159 non-null float64
BallControl 18159 non-null float64
Acceleration 18159 non-null float64
SprintSpeed 18159 non-null float64
Agility 18159 non-null float64
Reactions 18159 non-null float64
Balance 18159 non-null float64
ShotPower 18159 non-null float64
Jumping 18159 non-null float64
Stamina 18159 non-null float64
Strength 18159 non-null float64
LongShots 18159 non-null float64
Aggression 18159 non-null float64
Interceptions 18159 non-null float64
Positioning 18159 non-null float64
Vision 18159 non-null float64
Penalties 18159 non-null float64
Composure 18159 non-null float64
Marking 18159 non-null float64
StandingTackle 18159 non-null float64
SlidingTackle 18159 non-null float64
GKDiving 18159 non-null float64
GKHandling 18159 non-null float64
GKKicking 18159 non-null float64
GKPositioning 18159 non-null float64
GKReflexes 18159 non-null float64
Release Clause 16643 non-null object
dtypes: float64(38), int64(6), object(45)
memory usage: 12.4+ MB
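The `dtypes:` summary at the end of the `info()` output can also be obtained as data rather than printed text. The sketch below shows one way to do that with `DataFrame.dtypes.value_counts()`. The `toy` frame is a hypothetical three-column stand-in using a few values from the table above; in the notebook the same calls would be applied to `data_raw`.

```python
import pandas as pd

# Hypothetical stand-in for data_raw, limited to three numeric columns.
toy = pd.DataFrame(
    {
        "Age": [31, 33],
        "Composure": [96.0, 95.0],
        "Marking": [33.0, 28.0],
    }
)

# Count how many columns share each dtype (what info() prints as "dtypes: ...").
dtype_counts = {str(dtype): int(n) for dtype, n in toy.dtypes.value_counts().items()}
print(dtype_counts)

# Names of the numeric columns only, useful before plotting or aggregating.
numeric_cols = toy.select_dtypes(include="number").columns.tolist()
print(numeric_cols)
```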
Check the shape of the DataFrame
data_raw.shape

(18207, 89)
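`shape` is a plain `(rows, columns)` tuple, so it can be unpacked directly. A small sketch on a hypothetical three-row frame (the values are taken from the table above; the variable names are illustrative only):

```python
import pandas as pd

# Hypothetical stand-in for data_raw with 3 rows and 2 columns.
toy = pd.DataFrame({"ID": [158023, 20801, 190871], "Age": [31, 33, 26]})

# shape is a tuple: (number of rows, number of columns).
n_rows, n_cols = toy.shape
print(n_rows, n_cols)  # 3 2

# The same numbers are available via len() on the frame and on its columns.
print(len(toy), len(toy.columns))  # 3 2
```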
Check if there are null values in the DataFrame
data_raw.isnull().any().any()  # True means the DataFrame contains at least one null value

True
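The chained call above works in three stages: `isnull()` produces a boolean DataFrame, the first `any()` reduces it to one boolean per column, and the second `any()` reduces that to a single value. A minimal sketch of the intermediate results, using a made-up two-row frame (the names and the missing club are invented for illustration):

```python
import numpy as np
import pandas as pd

# Made-up data: one missing value in the "Club" column.
toy = pd.DataFrame(
    {
        "Name": ["player_a", "player_b"],
        "Club": ["club_x", np.nan],
    }
)

mask = toy.isnull()          # DataFrame of booleans, same shape as toy
per_column = mask.any()      # Series: True for each column with at least one null
has_any_null = bool(per_column.any())  # single bool for the whole frame

print(per_column)
print(has_any_null)
```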
Find which columns contain null values
data_raw.isnull().any()

Unnamed: 0 False
ID False
Name False
Age False
Photo False
Nationality False
Flag False
Overall False
Potential False
Club True
Club Logo False
Value False
Wage False
Special False
Preferred Foot True
International Reputation True
Weak Foot True
Skill Moves True
Work Rate True
Body Type True
Real Face True
Position True
Jersey Number True
Joined True
Loaned From True
Contract Valid Until True
Height True
Weight True
LS True
ST True
...
Dribbling True
Curve True
FKAccuracy True
LongPassing True
BallControl True
Acceleration True
SprintSpeed True
Agility True
Reactions True
Balance True
ShotPower True
Jumping True
Stamina True
Strength True
LongShots True
Aggression True
Interceptions True
Positioning True
Vision True
Penalties True
Composure True
Marking True
StandingTackle True
SlidingTackle True
GKDiving True
GKHandling True
GKKicking True
GKPositioning True
GKReflexes True
Release Clause True
Length: 89, dtype: bool

Get the total number of null values
data_raw.isnull().sum().sum()

76984
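pandas truncates long Series when printing (the `...` in the per-column output above), so some columns are hidden. One possible way around this, sketched here on a made-up frame, is to keep only the columns whose null count is greater than zero and sort them. The values are invented; in the notebook the same expressions would be applied to `data_raw`.

```python
import pandas as pd

# Made-up data with nulls in two of the three columns.
toy = pd.DataFrame(
    {
        "Name": ["a", "b", "c"],
        "Club": ["x", None, "z"],
        "Loaned From": [None, None, "y"],
    }
)

null_counts = toy.isnull().sum()  # nulls per column

# Keep only the columns that have nulls, largest count first.
only_missing = null_counts[null_counts > 0].sort_values(ascending=False)

# Summing the per-column counts gives the same total as .sum().sum().
total_missing = int(null_counts.sum())

print(only_missing)
print(total_missing)
```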
Get the number of null values for each of the columns
data_raw.isnull().sum()

Unnamed: 0 0
ID 0
Name 0
Age 0
Photo 0
Nationality 0
Flag 0
Overall 0
Potential 0
Club 241
Club Logo 0
Value 0
Wage 0
Special 0
Preferred Foot 48
International Reputation 48
Weak Foot 48
Skill Moves 48
Work Rate 48
Body Type 48
Real Face 48
Position 60
Jersey Number 60
Joined 1553
Loaned From 16943
Contract Valid Until 289
Height 48
Weight 48
LS 2085
ST 2085
...
Dribbling 48
Curve 48
FKAccuracy 48
LongPassing 48
BallControl 48
Acceleration 48
SprintSpeed 48
Agility 48
Reactions 48
Balance 48
ShotPower 48
Jumping 48
Stamina 48
Strength 48
LongShots 48
Aggression 48
Interceptions 48
Positioning 48
Vision 48
Penalties 48
Composure 48
Marking 48
StandingTackle 48
SlidingTackle 48
GKDiving 48
GKHandling 48
GKKicking 48
GKPositioning 48
GKReflexes 48
Release Clause 1564
Length: 89, dtype: int64

- Fork it (https://github.com/qualityjacks/Fifa19_Insights/fork)
- Create your feature branch: git checkout -b feature
- Commit your changes: git commit -m 'some-text'
- Push to the branch: git push origin feature
- Create a new Pull Request