Getting started with Python tools for data visualisation

Rasika Muralidharan
Mar 25, 2021

As we race into the 2020s, data is the new trailblazer: the commodity that every company tries to acquire, cherish and leverage. With many young folks venturing into the expansive world of data analytics and visualisation, let’s look at some of the Python tools used for it. Python is a leading programming language among data scientists for data-related work, owing to its easy-to-learn structure and extensive use.

Before we delve into the Python libraries and their functions, here are some things you will need to know.

1. Understanding of basic Python syntax

It is best to understand how Python works and some basics like conditional statements, logical expressions, and loops. These are used extensively in data analytics and are hence essential to know. Don’t worry if you do not have a tight grasp; several resources are available to make these concepts easy to understand.
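If it helps, here is a minimal sketch of those three basics working together. The temperature readings and variable names are made up purely for illustration.

```python
# Hypothetical sample data: daily temperatures in degrees Celsius.
temperatures = [18, 25, 31, 22, 35]

hot_days = 0
for temp in temperatures:          # a loop visits every reading in turn
    if temp > 30 and temp < 40:    # a conditional built from a logical expression
        hot_days += 1              # count the readings that pass the test

print(f"{hot_days} of {len(temperatures)} days were hot")
```

The same pattern (loop over values, test each one, keep a running result) shows up again and again when you start filtering and summarising real datasets.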

2. Being able to use Jupyter Notebooks

Using Jupyter Notebooks for all data visualisation related jobs will make the process much easier, cleaner and more fun! After all, that’s what coding is about!

3. Having a dataset to work on

Data visualisation is useless without data, obviously. Import some data sets into your Jupyter Notebook. You can find a ton of data sets on sites like Kaggle and official government websites too!

Now that we have covered basic Python concepts, Jupyter Notebooks and datasets, let’s dive into the Python libraries! We’ve been putting off the conversation about the libraries so far, so let’s end the suspense and introduce them.

Meet NumPy, Pandas, Matplotlib, and Seaborn!

These are the most common libraries used for data visualisation; you’ll understand them soon.

NumPy and Pandas

NumPy is a Python library that provides a simple yet powerful data structure: the n-dimensional array. This is the foundation on which almost all the power of Python’s data science toolkit is built, and learning NumPy is the first step on any Python data scientist’s journey.

But what exactly is NumPy used for? NumPy is used to work with array-type data. Python already has lists, which can serve as arrays, but they are slow to process. NumPy arrays are often quoted as being up to 50x faster than regular Python lists.

How do we use NumPy? To access NumPy and all its amazing functions, you need to install it. If you have Python and pip installed, then installing NumPy is easy.

Go to your command line and type

pip install numpy

Once installed, you can use the command

import numpy

in your code or program to use it.
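To give a feel for what the n-dimensional array looks like in practice, here is a small sketch. The values are invented for illustration, and the `np` alias is just a widely used convention rather than a requirement.

```python
import numpy as np

# Hypothetical sample values, made up for illustration.
prices = np.array([10.0, 20.0, 30.0, 40.0])

# Arithmetic applies to every element at once, with no explicit loop.
discounted = prices * 0.9

# A 2-dimensional array with 2 rows and 3 columns.
grid = np.array([[1, 2, 3], [4, 5, 6]])

print(discounted)     # each price reduced by 10%
print(grid.shape)     # (rows, columns)
print(grid.sum())     # sum of every element in the grid
```

Operating on the whole array in one expression, instead of looping over a list, is a large part of where NumPy’s speed advantage comes from.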

Pandas

Pandas is a library used for analysing data. Its functions include cleaning, analysing, exploring and manipulating data.

What can Pandas do?

It can answer questions like these for you:

1. Is there a correlation between these columns?

2. What is the max value?

3. What is the min value?

4. What is the average?
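As a rough sketch of how questions like these map onto Pandas calls (installation is covered just below), consider the following. The column names and numbers are hypothetical and exist only to make the example runnable.

```python
import pandas as pd

# Hypothetical data, made up for illustration.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5],
    "exam_score": [52, 58, 65, 70, 78],
})

# 1. Is there a correlation between two columns?
correlation = df["hours_studied"].corr(df["exam_score"])

# 2-4. Max, min and average of a single column.
highest = df["exam_score"].max()
lowest = df["exam_score"].min()
average = df["exam_score"].mean()

print(correlation, highest, lowest, average)
```

A correlation close to 1 suggests the two columns rise together, while a value close to 0 suggests little linear relationship.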

How do we use Pandas?

To access Pandas and all its amazing functions, you need to install it. If you have Python and pip installed, then installing Pandas is easy.

Go to your command line and type

pip install pandas

Once installed, you can use the command

import pandas as pd

in your code or program to use it.

Now, you can import your dataset and start experimenting with it.

df = pd.read_csv("<dataset name>")  # this is how you import a dataset using pandas
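If you do not have a CSV file handy yet, the sketch below fakes one in memory so you can try `read_csv` straight away. The `item` and `quantity` columns are hypothetical; with a real file you would pass its path instead of the `StringIO` object.

```python
import io

import pandas as pd

# Stand-in for a CSV file on disk; column names and values are hypothetical.
csv_text = """item,quantity
A,100
B,200
C,300
"""

# read_csv accepts a file path or a file-like object such as StringIO.
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())       # the first few rows
print(df.shape)        # (rows, columns)
print(df.describe())   # summary statistics for the numeric columns
```

Calling `head()`, `shape` and `describe()` right after loading is a quick way to check that the data arrived in the form you expected.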

Matplotlib

Matplotlib is a low-level graph plotting library in Python that serves as a visualisation utility.

Matplotlib allows you to make plots and graphs, along with facilities to label and name them.

How do we use Matplotlib?

To access Matplotlib and all its amazing functions, you need to install it. If you have Python and pip installed, then installing Matplotlib is easy.

Go to your command line and type

pip install matplotlib

Once installed, you can use the command

from matplotlib import pyplot  # pyplot is the most frequently used module

in your code or program to use it.
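Here is a minimal sketch of a labelled line plot, assuming some made-up monthly figures. The data, title and axis labels are all hypothetical.

```python
from matplotlib import pyplot

# Hypothetical data, made up for illustration.
months = [1, 2, 3, 4, 5]
sales = [10, 14, 9, 17, 21]

# Create a figure with a single set of axes, then draw on it.
fig, ax = pyplot.subplots()
ax.plot(months, sales, marker="o")

# Name the plot and label both axes.
ax.set_title("Monthly sales")
ax.set_xlabel("Month")
ax.set_ylabel("Units sold")
```

In a Jupyter Notebook the figure usually appears under the cell automatically; in a plain script you would typically call `pyplot.show()` at the end to open it.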

Below are some resources linked that will give you a greater understanding of the capacities of matplotlib.

Seaborn

Seaborn is a library used for making statistical graphs in Python. It builds on top of matplotlib and integrates closely with pandas data structures.

This is how we use seaborn in our code or programs:

import seaborn as sns

Using seaborn you can create scatter plots, bar graphs, histograms and more. The list goes on and on!
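As one possible starting point, here is a sketch of a scatter plot drawn straight from a Pandas DataFrame. It assumes seaborn is already installed, and the column names and values are hypothetical.

```python
import pandas as pd
import seaborn as sns

# Hypothetical data, made up for illustration.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5],
    "exam_score": [52, 58, 65, 70, 78],
})

# seaborn reads the columns by name and labels the axes for us.
ax = sns.scatterplot(data=df, x="hours_studied", y="exam_score")

# The returned object is a Matplotlib Axes, so Matplotlib methods still work.
ax.set_title("Exam score vs hours studied")
```

Because seaborn builds on Matplotlib, anything you learned about titles and labels there carries over unchanged.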

Now that we’ve explored some basic functions, all you need is a data set, and you are ready to experiment! Remember, data visualisation and analytics is a skill. The more you practice and experiment, the more there is to learn and the easier it is to gain confidence and, eventually, expertise! Happy coding!
