Big Data Analytics Visualization and Data Exploration - A Guide
Data visualization and exploration are an important part of every data scientist's work. As the saying goes, there are a lot of secrets lying undiscovered in data, and one of the first steps toward finding them is exploratory data analysis.
This article looks into some of the tools available for data visualization and exploration.
Big Data Analytics Visualization and Exploration Tools
Traditionally, one had to rely on plain SQL queries for data exploration, with no visualization support. Luckily, there are now a lot of tools available to help.
Data on a Single Computer
-
R - One of the reasons for R's popularity among data scientists is its ability to visualize data quickly, though it is limited to in-memory data loaded on the user's own computer.
-
MATLAB/Octave - A lot of people have used them successfully to explore and visualize small data sets.
-
Excel - If your data visualization or exploration needs are limited, you can even use the ever-present Excel.
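To illustrate the kind of quick first look that single-machine tools make easy, here is a minimal sketch in Python using only the standard library. The sample values and the helper names summarize and text_histogram are made up for this example and are not taken from any of the tools above, which provide similar summaries and proper charts out of the box.

```python
import statistics
from collections import Counter


def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


def text_histogram(values, bin_width):
    """Group values into fixed-width bins and draw one '#' per value."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    lines = []
    for start in sorted(counts):
        end = start + bin_width - 1
        lines.append(f"{start:>4}-{end:<4} {'#' * counts[start]}")
    return lines


# Hypothetical in-memory sample (for illustration only)
ages = [23, 25, 31, 35, 35, 41, 42, 47, 52, 58]

summary = summarize(ages)
histogram = text_histogram(ages, bin_width=10)

print(summary)
print("\n".join(histogram))
```

Looking at the summary and the bar lengths side by side is often enough to spot skew or outliers before moving on to a real plotting tool.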
Data on a Cluster
If your data lives in a Hadoop or Spark cluster, or in a NoSQL store, you will need tools that can query distributed data and create visualizations.
Because of this growing need, a lot of projects have emerged to meet it. Here are the most popular and useful ones:
-
Jupyter Notebook (previously called IPython Notebook)