In [1]:
import pandas as pd

Pie Chart

Here we make a simple pie chart of the five makers with the highest mean price.

In [55]:
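## Group the data by maker and take the mean of each numeric column.
## numeric_only=True skips text columns, which raise an error in recent pandas versions.
data_g = data.groupby('make').mean(numeric_only=True)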
In [56]:
data_g.head()
Out[56]:
symboling wheel-base length width height curb-weight engine-size bore stroke compression-ratio horsepower peak-rpm city-mpg highway-mpg price
make
alfa-romero 2.333333 90.566667 169.600000 64.566667 50.000000 2639.666667 137.333333 3.206667 2.943333 9.000000 125.333333 5000.00 20.333333 26.666667 15498.333333
audi 1.500000 102.733333 184.766667 68.850000 54.833333 2758.666667 130.666667 3.180000 3.400000 8.633333 114.500000 5500.00 19.333333 24.500000 17859.166667
bmw 0.375000 103.162500 184.500000 66.475000 54.825000 2929.375000 166.875000 3.473750 3.167500 8.575000 138.875000 5068.75 19.375000 25.375000 26118.750000
chevrolet 1.000000 92.466667 151.933333 62.500000 52.400000 1757.000000 80.333333 2.990000 3.083333 9.566667 62.666667 5300.00 41.000000 46.333333 6007.000000
dodge 1.000000 95.175000 161.450000 64.212500 51.775000 2146.375000 103.250000 3.102500 3.362500 8.763750 84.375000 5375.00 28.500000 34.625000 7790.125000
In [57]:
## Sort by mean price in descending order so the top 5 come first
data_g.sort_values('price',ascending=False,inplace=True)
data_g_top5 = data_g.head(5)
In [58]:
data_g_top5
Out[58]:
symboling wheel-base length width height curb-weight engine-size bore stroke compression-ratio horsepower peak-rpm city-mpg highway-mpg price
make
jaguar 0.000000 109.333333 196.966667 69.933333 51.133333 4027.333333 280.666667 3.600000 3.700000 9.233333 204.666667 4833.333333 14.333333 18.333333 34600.000000
mercedes-benz 0.000000 110.925000 195.262500 71.062500 55.725000 3696.250000 226.500000 3.605000 3.432500 14.825000 146.250000 4487.500000 18.500000 21.000000 33647.000000
porsche 3.000000 90.750000 168.900000 65.825000 51.250000 2772.500000 183.250000 3.790000 2.952500 9.500000 191.000000 5800.000000 17.500000 25.500000 31400.500000
bmw 0.375000 103.162500 184.500000 66.475000 54.825000 2929.375000 166.875000 3.473750 3.167500 8.575000 138.875000 5068.750000 19.375000 25.375000 26118.750000
volvo -1.272727 106.481818 188.800000 67.963636 56.236364 3037.909091 142.272727 3.662727 3.147273 10.227273 128.000000 5290.909091 21.181818 25.818182 18063.181818
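
The group, sort, and take-the-top steps above can also be written without modifying the grouped frame in place. The sketch below is only an illustration: the toy data, the column named body, and the variable names are invented for this example and are not the notebook's dataset.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration; not the notebook's dataset.
data = pd.DataFrame({
    "make": ["a", "a", "b", "b", "c"],
    "body": ["sedan", "wagon", "sedan", "sedan", "wagon"],
    "price": [10000.0, 20000.0, 30000.0, 50000.0, 5000.0],
})

# Mean of the numeric columns per make; numeric_only skips the text column.
mean_by_make = data.groupby("make").mean(numeric_only=True)

# nlargest returns the n rows with the largest values, sorted in descending order.
top2 = mean_by_make.nlargest(2, "price")
print(top2)
```

Because nlargest returns a new frame, the grouped frame keeps its original alphabetical order for later cells.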
In [59]:
## Simple pie chart of the mean price for each of the top 5 makers.
## autopct labels each wedge with its percentage of the total
plt.pie(data_g_top5['price'],autopct='%1.1f%%')
Out[59]:
([<matplotlib.patches.Wedge at 0x18708f2cdd8>,
  <matplotlib.patches.Wedge at 0x18708f31588>,
  <matplotlib.patches.Wedge at 0x18708f31d30>,
  <matplotlib.patches.Wedge at 0x18708f38518>,
  <matplotlib.patches.Wedge at 0x18708f38cc0>],
 [Text(0.800533,0.754418,''),
  Text(-0.687935,0.858339,''),
  Text(-0.951503,-0.551944,''),
  Text(0.230605,-1.07556,''),
  Text(1.01549,-0.422827,'')],
 [Text(0.436654,0.411501,'24.1%'),
  Text(-0.375237,0.468185,'23.4%'),
  Text(-0.519002,-0.30106,'21.8%'),
  Text(0.125785,-0.586667,'18.2%'),
  Text(0.553903,-0.230633,'12.6%')])
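
To see where the percentages in the output come from, here is a small sketch that recomputes them by hand. It assumes only the mean prices shown in Out[58]; the variable names are made up for this example. Note that each percentage is a share of the sum of the five mean prices, not a market share.

```python
# Mean prices of the top 5 makers, copied from the Out[58] table above.
prices = {
    "jaguar": 34600.0,
    "mercedes-benz": 33647.0,
    "porsche": 31400.5,
    "bmw": 26118.75,
    "volvo": 18063.181818,
}

total = sum(prices.values())

# Each wedge's share is its value divided by the total, times 100.
shares = {make: 100 * price / total for make, price in prices.items()}

# '%1.1f%%' formats a number with one decimal place followed by a literal '%'.
labels = {make: "%1.1f%%" % share for make, share in shares.items()}
print(labels)
```

The formatted values match the percentage texts in Out[59].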

Let's make the pie chart more informative by adding labels and a title.

In [60]:
## Making a pie chart for top makers of car
plt.pie(data_g_top5['price'],
        autopct='%1.1f%%',
        labels=list(data_g_top5.index),
        startangle=90,shadow=True)
## add the title to the graph
plt.title('Top 5 makers with highest prices')
## To make it look like a circle.
plt.axis('equal')
Out[60]:
(-1.1144632903027005,
 1.1194551024984358,
 -1.1252202487616703,
 1.1012009642267462)

Another way of plotting a pie chart

This approach calls the plot method of the pandas Series directly with kind='pie'; pandas then draws the chart using matplotlib.

In [61]:
## Plotting top makers of car
data_g_top5['price'].plot(kind='pie',
        autopct='%1.1f%%',
        labels=list(data_g_top5.index),
        startangle=90,shadow=True)
## Adding title to plot
plt.title('Top 5 makers with highest price')
## To make it look like a circle. 
plt.axis('equal')
Out[61]:
(-1.1144632903027005,
 1.1194551024984358,
 -1.1252202487616703,
 1.1012009642267462)
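
For reference, here is a self-contained sketch of the pandas approach that runs without the original dataset. It assumes the mean prices from Out[58]; the Series name top5_price is made up for this example. It uses Series.plot.pie, which is the accessor form of plot(kind='pie'), and draws on an explicit Axes object.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Mean prices of the top 5 makers, copied from the Out[58] table above.
top5_price = pd.Series(
    {
        "jaguar": 34600.0,
        "mercedes-benz": 33647.0,
        "porsche": 31400.5,
        "bmw": 26118.75,
        "volvo": 18063.181818,
    },
    name="price",
)

fig, ax = plt.subplots()
# The Series index supplies the wedge labels, so no labels argument is needed.
top5_price.plot.pie(ax=ax, autopct="%1.1f%%", startangle=90)
# pandas uses the Series name as a y-axis label; clear it for a cleaner chart.
ax.set_ylabel("")
ax.set_title("Top 5 makers with highest price")
# Equal axis scaling keeps the pie circular.
ax.axis("equal")
```

Passing ax explicitly makes it clear which figure receives the chart, which helps once a notebook contains several plots.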

Heat maps

A heat map (or heatmap) is a graphical representation of data in which the individual values of a matrix are represented as colors. It is used to see how numerical values vary across two dimensions. For example, given two categorical variables, we can lay out the categories of one along the rows and the categories of the other along the columns, and use color to show how each pair of categories is related.

In [62]:
## Convert the per-maker means to a 2-D array and draw it as an image
array = np.array(data_g)
plt.imshow(array)
Out[62]:
<matplotlib.image.AxesImage at 0x18709f971d0>
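
In the raw array, columns such as price and curb-weight are thousands of times larger than columns such as bore, so a single color scale is dominated by the large columns. One possible refinement, which is an assumption of this sketch and not a step from the notebook, is to scale each column to the range 0 to 1 before plotting and to label the axes. The values below are a rounded subset of the Out[56] table.

```python
import numpy as np
import matplotlib.pyplot as plt

# A few rows and columns copied (rounded) from the Out[56] table above.
makes = ["alfa-romero", "audi", "bmw", "chevrolet", "dodge"]
columns = ["horsepower", "city-mpg", "price"]
values = np.array([
    [125.33, 20.33, 15498.33],
    [114.50, 19.33, 17859.17],
    [138.88, 19.38, 26118.75],
    [62.67, 41.00, 6007.00],
    [84.38, 28.50, 7790.13],
])

# Min-max scale each column to [0, 1] so that price does not dominate the colors.
col_min = values.min(axis=0)
col_max = values.max(axis=0)
scaled = (values - col_min) / (col_max - col_min)

fig, ax = plt.subplots()
# aspect="auto" lets the cells stretch to fill the axes instead of staying square.
image = ax.imshow(scaled, aspect="auto")
ax.set_xticks(range(len(columns)))
ax.set_xticklabels(columns)
ax.set_yticks(range(len(makes)))
ax.set_yticklabels(makes)
fig.colorbar(image, ax=ax, label="column-wise min-max scaled value")
ax.set_title("Mean values by maker (scaled per column)")
```

With per-column scaling, each column's colors compare makers with one another, but colors are no longer comparable across columns.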