Saturday, January 21, 2017

Diving deep into data analytics with Power BI: Aggregate based Visualizations using MS Excel data

This tutorial demonstrate creation of aggregate based visualizations for MS Excel data using Power BI. What we mean from aggregate based visualization here is the visualization techniques that can be used to display characteristics of certain group using aggregate of characteristics of individuals belonging to that group. In this tutorial, we will gain insight into which Sport had the most of the top earners in 2014. This blog is in continuity of a series of blogs that started with a very introductory work using Power BI and it is available at Diving deep into data analytics with Power BI: Tabular Visualization using MS Excel data, yes the starting pint. In this blog, we will make use of Power BI features that are available online for creation of reports. Similar work can also be done using the Power BI Desktop version.

In this tutorial, we will be using "Top Athlete Salaries" data sheet from Tableau Sample Data Sets: Global Sport Finances dataset. Details about dataset can be found at Diving deep into data analytics with Power BI: Tabular Visualization using MS Excel data. Before we move forward with our tutorial, please make sure that you have signed-up for Microsoft Power BI account. For Microsoft Power BI account creation, visit Sign in | Microsoft Power BI. After login, you will land to a page similar to the one shown in Figure 1.

Figure 1: Microsoft Power BI Online Account Welcome Page
On the top right corner of the page, you will find a Create menu, you can use this menu to create Dashboard, Report, Dataset, and Streaming dataset as shown in Figure 2.
Figure 2: Create Options Power BI
We will start with creating a dataset using our MS Excel file from Tableau Sample Data Sets: Global Sport Finances dataset. Click on the Create -> Dataset menu. You will be moved to Get Data page as shown in Figure 3.


Figure 3: Get Data Page
Click on the Get link under the Import or Connect to Data -> File area. The new page will ask you for selecting the source for the dataset. Power BI given us many options covering Local File, One Drive (both business/personal), SharePoint Team Sites as well as Databases and More as shown in Figure 4. We will confine our discussion with uploading MS Excel with Local File option for creating a dataset. Click on Local File option and select the MS Excel file from local drive.


Figure 4: Dataset Sources Options
After selecting the MS Excel file, Power BI will ask for, either you want to import the Excel data in Power BI or you want to Upload the Excel file to Power BI as shown in Figure 5. To uploading the data, click on Upload. link. After successful completion of upload, the file will be available under Workbooks link as shown in Figure 6. This file can be open in Power BI environment and manipulated using Excel Online.


Figure 5: Import vs. Upload Options for Local File
Figure 6: Workbooks in Power BI
 The import option requires data in file to be in a table format. For importing data into Power BI, open the Excel file, select the data that you want to import and press Ctrl + T keys. A create table dialog will appear as shown in Figure 7. Give you table a name as shown in Figure 8. Follow the same step of creating table for all data sheets. Save the file.


Figure 7: Create Table Dialog After Selecting the Data and Pressing Ctrl+T
Figure 8: Provide Table Name
Follow the steps shown in Figures 2-5 and click on Import link. Select the MS Excel file with table. It will import the file into Power BI as a dataset. After successful import, dataset will be visible in Power BI environment as shown in Figure 9.


Figure 9: Imported Dataset
For creating our report, click on Create -> Report menu as shown in Figure 2. Select the dataset for the report from the list of available datasets as shown in Figure 10 and click create link.

Figure 10: List of Available Dataset for Report
Now coming back to the question that we want to respond. The question is "Sport-wise Top Earners in 2014". Power BI gives us many options to respond to this question. I will use few visualizations that are easy to build and comprehend, however, it is not the exhaustive list and you may find many creative people doing better job on this question using different techniques and tools. Furthermore, there may be a discussion about the suitability of used visualization for the purpose. Well, they could be or could not be, but we will not get into this discussion and will keep our focus to explore Power BI features.

I will start with Treemap. Add Treemap to your designer. Add Sport field into Group and Athlete field into values. Make sure that values has count of Athlete field. Please do also format the Treemap and add appropriate title. The visualization in designer mode should look like as shown in Figure 11. Don't forget to save you report.

Figure 11: Treemap Visualization to Identify Sports with Top Earners in 2014
Once the Treemap visualization is complete, it can be published on the web. The details about how to publish the Power BI report on the web refer to Diving deep into data analytics with Power BI: Tabular Visualization using MS Excel data. The Treemap visualization of this tutorial is available at Sport-wise Top Earners in 2014. It is also embedded on this blog below for inline view.



Second option will be Funnel visualization. Add Funnel to your designer. Add Sport field into Group and Athlete field into values and color saturation. Make sure that values and color saturation have count of Athlete field. Please do also format the Funnel to adjust font and add appropriate title. The visualization in designer mode should look like as shown in Figure 12. Don't forget to save you report.


Figure 12: Funnel Visualization to Identify Sports with Top Earners in 2014
The funnel visualization of this tutorial is available at Sport-wise Top Earners in 2014. It is also embedded on this blog below for inline view.



Third option will be Donut chart visualization. Add Donut chart to your designer. Add Sport field into Group and Athlete field into values. Make sure that values has count of Athlete field. Please do also format the Donut chart to adjust font and add appropriate title. The visualization in designer mode should look like as shown in Figure 13. Don't forget to save you report.


Figure 13: Donut Chart Visualization to Identify Sports with Top Earners in 2014
The donut chart visualization of this tutorial is available at Sport-wise Top Earners in 2014. It is also embedded on this blog below for inline view.



I hope you enjoyed this tutorial explaining to beginners, How to create aggregate based visualizations using MS Excel data on Power BI online version. The next article in this series can be found at Diving deep into data analytics with Power BI: Map Visualization In Action. The previous as well as first article of this series can be accessed at Diving deep into data analytics with Power BI: Tabular Visualization using MS Excel data.