Exploratory Data Analysis in Python: Part 2 (Advanced)
Hey learners, this is Part 2 of Exploratory Data Analysis. Part 1 covered the theory behind EDA; this article goes through the graphs and visualizations used in EDA.
This article uses Seaborn on the Automobile data set. You can get the Automobile dataset from here.
Let’s Get Started
When we deal with data, we often look into how variables are distributed.
Importing the Libraries
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')  # Silence library warnings to keep the output readable
Importing the Data Set
Automobile = pd.read_csv('Automobile.csv')
Automobile.head()
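Before plotting, it can help to check the size of the table, which columns are numeric, and where values are missing. The sketch below uses a tiny made-up DataFrame called sample as a stand-in for Automobile.csv (the values are assumptions for illustration only); the same calls should work on the real Automobile DataFrame.

```python
import pandas as pd

# Hypothetical stand-in for Automobile.csv; the values are invented for illustration.
sample = pd.DataFrame({
    "body_style": ["sedan", "hatchback", "sedan", "wagon"],
    "horsepower": [111, 154, None, 102],
    "city_mpg": [21, 19, 24, 24],
})

n_rows, n_cols = sample.shape                                    # rows and columns
missing = sample.isna().sum()                                    # missing values per column
numeric_cols = sample.select_dtypes("number").columns.tolist()   # columns usable in numeric plots

print(n_rows, n_cols)
print(missing)
print(numeric_cols)
```

Columns with missing values or an unexpected text dtype are worth cleaning first, since they can distort or break the plots that follow.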
Moving to the visualization
#1 Which body style is the most common among the cars?
sns.countplot(x='body_style', data=Automobile)  # We are using Seaborn
Sedan has the highest count in the dataset.
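The bar heights in a count plot are simply category counts, so the exact numbers can be read off with value_counts. This is a minimal sketch on a made-up body_style Series (the values are assumptions, not the real data).

```python
import pandas as pd

# Hypothetical body styles; replace with Automobile['body_style'] on the real data.
body_style = pd.Series(
    ["sedan", "hatchback", "sedan", "wagon", "sedan", "hatchback"],
    name="body_style",
)

counts = body_style.value_counts()   # number of cars per body style, largest first
most_common = counts.idxmax()        # label with the highest count

print(counts)
print(most_common)
```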
#2 Which type of car has the highest horsepower, and where is its engine located?
We need two columns, body style and horsepower, and we split the comparison by engine location to find the answer.
sns.barplot(x='body_style', y='horsepower', hue='engine_location', data=Automobile)
Convertible cars with a rear engine have the highest horsepower. Each bar shows the mean horsepower of its group.
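By default a Seaborn bar plot draws the mean of the y column for each group, so the same comparison can be checked numerically with a groupby. The sketch below runs on a small made-up table named cars (the values are assumptions for illustration).

```python
import pandas as pd

# Hypothetical rows; the real data would come from Automobile.csv.
cars = pd.DataFrame({
    "body_style": ["sedan", "sedan", "convertible", "convertible", "hatchback"],
    "engine_location": ["front", "front", "rear", "front", "front"],
    "horsepower": [100, 120, 200, 110, 90],
})

# Mean horsepower per (body style, engine location) pair, i.e. the bar heights.
mean_hp = cars.groupby(["body_style", "engine_location"])["horsepower"].mean()
top_group = mean_hp.idxmax()   # pair with the highest mean horsepower

print(mean_hp)
print(top_group)
```

Checking the group sizes with .size() alongside the means is a sensible habit, because a bar built from one or two cars says little about the body style as a whole.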
#3 What is the average city MPG (miles per gallon) of the cars?
sns.displot(Automobile['city_mpg'])
plt.show()
The distribution peaks at around 25 MPG, so the largest share of cars gets roughly 25 miles per gallon in the city.
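The peak of a histogram marks the most common range, which is not necessarily the average. To answer the question with numbers, the mean, median and mode can be computed directly. The city_mpg values below are made up for illustration.

```python
import pandas as pd

# Hypothetical city MPG values; use Automobile['city_mpg'] on the real data.
city_mpg = pd.Series([19, 21, 24, 24, 25, 25, 25, 31, 38])

mean_mpg = city_mpg.mean()          # arithmetic average
median_mpg = city_mpg.median()      # middle value, robust to outliers
mode_mpg = city_mpg.mode().iloc[0]  # most frequent value (the histogram peak)

print(round(mean_mpg, 2), median_mpg, mode_mpg)
```

When a few very efficient cars stretch the right tail, the mean sits above the median, so reporting both gives a fairer picture.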
#4 What is the relationship between horsepower and engine size?
x = Automobile['engine_size']
y = Automobile['horsepower']
sns.jointplot(x=x, y=y)  # Recent Seaborn versions require x and y as keyword arguments
We can see that as engine size increases, horsepower also increases.
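The upward trend in the scatter can be quantified with the Pearson correlation coefficient, where values near 1 indicate a strong positive linear relationship. This is a minimal sketch on made-up numbers, not the real dataset.

```python
import pandas as pd

# Hypothetical engine sizes and horsepower values for illustration.
cars = pd.DataFrame({
    "engine_size": [90, 110, 130, 150, 200],
    "horsepower": [70, 95, 110, 140, 185],
})

# Pearson correlation between the two columns (Series.corr defaults to Pearson).
r = cars["engine_size"].corr(cars["horsepower"])

print(round(r, 3))
```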
#5 What is the relationship between normalized losses, engine size, and horsepower?
sns.pairplot(Automobile[['normalized_losses', 'engine_size', 'horsepower']])
#6 How are engine sizes distributed across fuel types?
sns.stripplot(x='fuel_type', y='engine_size', data=Automobile)
Most of the cars run on gas.
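A strip plot shows each car as a dot, so both the number of cars and the spread of engine sizes per fuel type are visible. The same information can be summarized in a small table. The sketch below uses made-up rows as an assumption.

```python
import pandas as pd

# Hypothetical rows; the real data would come from Automobile.csv.
cars = pd.DataFrame({
    "fuel_type": ["gas", "gas", "gas", "diesel", "gas", "diesel"],
    "engine_size": [97, 109, 130, 134, 92, 152],
})

# Count and range of engine sizes for each fuel type.
summary = cars.groupby("fuel_type")["engine_size"].agg(["count", "min", "median", "max"])

print(summary)
```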
#7 Which fuel type has more horsepower, and how many doors do those cars have?
sns.boxplot(x='number_of_doors', y='horsepower', hue='fuel_type', data=Automobile)
We can see that two-door gas cars have more horsepower than the others.
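The line inside each box of a box plot is the group median, so a pivot table of medians gives the same comparison as numbers. The rows below are made up for illustration, and the door labels 'two' and 'four' are an assumption about how the column is encoded.

```python
import pandas as pd

# Hypothetical rows; door labels and values are assumptions.
cars = pd.DataFrame({
    "number_of_doors": ["two", "two", "two", "four", "four", "four", "two", "four"],
    "fuel_type": ["gas", "gas", "diesel", "gas", "diesel", "gas", "gas", "diesel"],
    "horsepower": [160, 200, 70, 100, 90, 110, 180, 60],
})

# Median horsepower for every doors/fuel combination (the center line of each box).
median_hp = cars.pivot_table(
    index="number_of_doors",
    columns="fuel_type",
    values="horsepower",
    aggfunc="median",
)

print(median_hp)
```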
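#8 What are the correlations in the data?
Automobile.corr(numeric_only=True)  # Text columns must be excluded; numeric_only needs pandas 1.5 or later
Representing the correlation as a heatmap
sns.heatmap(Automobile.corr(numeric_only=True))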
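Correlation is only defined for numeric columns, so text columns such as fuel type have to be left out before computing it. One version-independent way, sketched here on made-up data, is to select the numeric columns first with select_dtypes and then call corr.

```python
import pandas as pd

# Hypothetical rows mixing text and numeric columns; values are invented.
cars = pd.DataFrame({
    "fuel_type": ["gas", "gas", "diesel", "gas", "gas"],
    "engine_size": [90, 110, 130, 150, 200],
    "horsepower": [70, 95, 110, 140, 185],
    "city_mpg": [38, 31, 26, 24, 17],
})

numeric = cars.select_dtypes("number")   # drops the text column
corr = numeric.corr()                    # pairwise Pearson correlations

print(corr.round(2))
```

The resulting square table is what the heatmap colors. Values close to 1 or -1 mark the strongest relationships, and the diagonal is always 1.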
To revise the theory, you can refer to Part 1 (Basic) of my Exploratory Data Analysis in Python series.
Do check out my profiles for further content!
Github : KartikAggarwal1305 (Kartik Aggarwal) (github.com)
Linkedin : Kartik Aggarwal | LinkedIn
Medium : Kartik Aggarwal — Medium