Smart Plot
Smart Plot is HEARTCOUNT's key feature: it lets you carry out visualization tasks that would typically be code-heavy with just a few clicks. To visualize data the way you want to examine it, select which variable goes on the x-axis and which goes on the y-axis, then choose the type of visualization you wish to use.
Smart Plot provides a variety of visualizations of your data, including those listed below.
Data Type | Available Visualization |
---|---|
Between numeric and other numeric | Scatterplot |
| Trend Line (regression line) |
| Heat Scatter |
Between categorical and numeric | Bar (average or sum) |
| Stacked Bar |
| Stacked Area |
| 95% confidence interval |
| Boxplot |
Between categorical and categorical | Ratio Bar Chart |
| Stacked Count Bar |
Between time series and numeric | Time Series Line Chart |
| Stacked Area |
| Trend Line |
| Forecast |
Smart Plot consists of four key sections. You can easily create a suitable visualization of your dataset using these sections without writing a single line of code.
Area | What's It For |
---|---|
1. Main Area | This is where a plot will be displayed. You will be able to interact with plot elements such as data points in a scatterplot to further investigate the dataset to find an answer to your analytic inquiries. |
2. Side Menu | This is where you may configure Smart Plot's parameters, such as which variables to use to change the colors or sizes of data points, or to filter the data. Also, you could choose |
3. Variable Selection | This is where you choose which variables to plot for your analytic purpose. As when creating a data visualization in a code-heavy setting (R/Python), you must choose which variables to place on the x and y axes and which to use for subgrouping or faceting. |
4. Visualization Type | This is where you choose which type of visualization to use to plot the data. Given the variables on the x and y axes, the Visualization Type tab offers several options suited to visualizing your data. |
This section describes the visualizations available for each combination of x-axis and y-axis variables.
Two Numeric Variables
Category on X & Numeric on Y
Time Series on X & Numeric on Y
When you put numeric variables on both axes, Smart Plot displays a simple scatterplot. A Pearson correlation coefficient is also given as basic information about the two variables.
- There are two available types of additional visualizations for the scatterplot.
- You may choose to display a trend line, which is a regression line. It shows the linear relationship between the x and y variables.
- To the left of the trend line icon is a number. It is the Pearson correlation coefficient, which shows how strongly the two variables are linearly correlated.
- A Pearson correlation coefficient always lies between −1 and +1, inclusive.
- If it equals zero, the variables are not linearly correlated at all.
- If it equals plus one, they are perfectly positively correlated.
- If it equals minus one, they are perfectly negatively correlated.
- If you drag to select some data points of interest, it will show you the linear relationship of only the selected data.
- The other available option is a heatmap (also known as Heat Scatter).
- This shows the areas where data points cluster most densely.
- The gradient color scale changes in accordance with the color settings at the top right of HEARTCOUNT.
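For readers who want to check the numbers shown next to the trend line, the two quantities described above can be reproduced by hand. The following is a minimal sketch in plain Python, not HEARTCOUNT's own implementation; the function names `pearson_r` and `trend_line` and the sample data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def trend_line(xs, ys):
    """Slope and intercept of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: study hours (x) and test score (y).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
score = [52.0, 55.0, 61.0, 64.0, 70.0]

r = pearson_r(hours, score)
slope, intercept = trend_line(hours, score)

# Restricting to a subset mimics dragging to select points of interest.
selected = [(x, y) for x, y in zip(hours, score) if x >= 3.0]
r_selected = pearson_r([x for x, _ in selected], [y for _, y in selected])

print(f"r = {r:.3f}, trend line: y = {slope:.2f} * x + {intercept:.2f}")
print(f"r for selected points = {r_selected:.3f}")
```

Recomputing the coefficient on only the selected points, as in the last step, corresponds to the drag-to-select behavior described above.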
There are two major ways to visualize the data when you put a categorical variable on the x-axis and a numeric variable on the y-axis. One is to visualize individual data points, and the other is to create subgroups based on the categorical variable on the x-axis.
- Scatterplot : This is equivalent to the scatterplot for two numeric variables, but data points are aligned according to their category on the x-axis.
- 95% Confidence Interval : This shows the 95% confidence interval for each group of the categorical variable on the x-axis. A box placed over a group's data points represents the confidence interval of the sample mean for that group. When you click the box, a long horizontal band appears that you can use to compare against the other groups' confidence interval boxes. You can think of this as conducting a two-sample t-test by eye.
- Distribution : This is one of the two ways to visualize how data points are distributed along the y-axis for each group of the categorical variable on the x-axis. The length of a line represents the number of data points lying on that line.
- Heatmap : Like Distribution, Heatmap represents the distribution of data points, but with gradient colors. Note that you can change the color scale with the color scale icon at the top right of HEARTCOUNT.
- You may create subgroups based on the categorical variable on the x-axis by using the subgroup tab. After that, Smart Plot will provide a set of available options for visualizing the grouped data points.
- There are five metrics you may choose to put on the y-axis.
- Average of a numeric variable
- Sum of a numeric variable
- Record count: how many records are in each group
- Global ratio: the proportion of the given dataset that each group accounts for
- Local ratio: the proportion of each group that each subgroup accounts for. This is useful when you divide each group into further subgroups.
- Here is the list of available visualization options.
- Bar chart : This is a simple bar chart where heights are proportional to the given metric on the y-axis.
- You may use another variable to create subgroups within each group. Each bar on the x-axis will then be divided into multiple subgroup bars.
- Stacked Bar Chart : Another way to use the subgroup feature is with a stacked bar chart. Each bar on the x-axis will be divided into multiple subgroup bars, and the subgroup bars will be stacked so that the makeup of the original bar is easier to understand.
- Stacked Area Chart : A stacked area chart is similar to the stacked bar chart, except that the subgroup bars are connected so that it is easier to see how the level of each subgroup varies across the x-axis variable. This chart makes it easy to read the local ratio of subgroups within each category.
- Line Chart : This is a traditional line chart where seasonal patterns and trends can be explored if your dataset contains a temporal (time, date, etc.) variable.
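To make the group-level quantities above concrete, here is a minimal sketch in plain Python of the five y-axis metrics and of a 95% confidence interval for a group mean. It is an illustration under stated assumptions, not HEARTCOUNT's implementation: the function names and sample records are hypothetical, the ratios are assumed to be based on record counts, and the interval uses the normal critical value 1.96 (a t critical value would be more accurate for small groups).

```python
import math
from collections import defaultdict

# Hypothetical records: (group on the x-axis, subgroup, numeric value).
records = [
    ("A", "online", 10.0),
    ("A", "online", 14.0),
    ("A", "store", 12.0),
    ("B", "online", 20.0),
    ("B", "store", 22.0),
    ("B", "store", 24.0),
    ("B", "store", 26.0),
    ("B", "store", 28.0),
]


def group_metrics(rows):
    """Average, sum, record count, and global ratio for each group."""
    values = defaultdict(list)
    for group, _subgroup, value in rows:
        values[group].append(value)
    total = len(rows)
    return {
        group: {
            "average": sum(vals) / len(vals),
            "sum": sum(vals),
            "count": len(vals),
            # Assumption: global ratio is the group's share of all records.
            "global_ratio": len(vals) / total,
        }
        for group, vals in values.items()
    }


def local_ratios(rows):
    """Share of each subgroup's records within its own group."""
    group_counts = defaultdict(int)
    pair_counts = defaultdict(int)
    for group, subgroup, _value in rows:
        group_counts[group] += 1
        pair_counts[(group, subgroup)] += 1
    return {pair: n / group_counts[pair[0]] for pair, n in pair_counts.items()}


def mean_ci95(vals):
    """Approximate 95% confidence interval of the mean (normal approximation)."""
    n = len(vals)
    mean = sum(vals) / n
    variance = sum((v - mean) ** 2 for v in vals) / (n - 1)  # sample variance
    half_width = 1.96 * math.sqrt(variance / n)
    return mean - half_width, mean + half_width


metrics = group_metrics(records)
ratios = local_ratios(records)
ci_low, ci_high = mean_ci95([v for g, _, v in records if g == "B"])

print(metrics["A"])
print(ratios[("B", "store")])
print(ci_low, ci_high)
```

If the intervals of two groups do not overlap, that is the visual cue the 95% Confidence Interval view relies on when comparing groups by eye.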
You can also put a time series variable on the x-axis.
- If your dataset contains a time series variable, you can map it to the x-axis. Smart Plot will then display a line chart.
- Smart Plot will generate a monthly time interval for you. To switch to a different time interval, such as weekly or yearly, use the small clock-like icon next to the x-axis tab.
- You can use automatically derived time series variables, such as day of week or day type (weekday or weekend), to create a plot. This will enhance your understanding of the relationship between the time series variable and others.
- You can check how the variable on the y-axis has evolved over the x-axis by using the trend line feature.
- Smart Plot also supports two simple forecasting models, namely the maximum entropy model and the least squares method.
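The time series steps above (monthly aggregation, derived variables, and a least squares forecast) can be sketched in plain Python as follows. This is an illustration only: how HEARTCOUNT fits its forecasting models is not described here, so the straight-line extrapolation below is an assumption, and the function names and sample data are hypothetical. The maximum entropy model is not covered.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily records: (date, numeric value).
daily = [
    (date(2023, 1, 5), 100.0),
    (date(2023, 1, 20), 120.0),
    (date(2023, 2, 3), 130.0),
    (date(2023, 2, 18), 150.0),
    (date(2023, 3, 7), 160.0),
    (date(2023, 3, 25), 180.0),
]


def monthly_average(rows):
    """Average value per (year, month), in chronological order."""
    buckets = defaultdict(list)
    for day, value in rows:
        buckets[(day.year, day.month)].append(value)
    return [(key, sum(vals) / len(vals)) for key, vals in sorted(buckets.items())]


def day_type(day):
    """Derived variable: 'weekend' for Saturday or Sunday, otherwise 'weekday'."""
    return "weekend" if day.weekday() >= 5 else "weekday"


def linear_forecast(ys, steps):
    """Fit y = slope * t + intercept by least squares on t = 0..n-1, then extrapolate."""
    n = len(ys)
    mean_t = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in range(n))
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(range(n), ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return [slope * t + intercept for t in range(n, n + steps)]


monthly = monthly_average(daily)
forecast = linear_forecast([avg for _, avg in monthly], steps=2)

print(monthly)
print(forecast)
print(day_type(date(2023, 2, 18)))
```

Switching the time interval corresponds to changing the bucket key, for example using `day.isocalendar()[:2]` for weekly buckets or `day.year` for yearly ones.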
Smart Plot offers a facet feature. It allows you to divide a single plot into multiple charts based on a facet variable in order to better understand the relationship between the x and y variables. You can also use every other feature in Smart Plot within each facet plot.
Currently, only categorical variables with fewer than 11 groups can be used as a facet variable.
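Conceptually, faceting splits the records by the facet variable and draws one chart per group. A minimal sketch in plain Python follows; the function name `facet_rows`, the constant, and the sample rows are hypothetical, and the limit check simply mirrors the "fewer than 11 groups" rule stated above rather than HEARTCOUNT's actual behavior.

```python
from collections import defaultdict

MAX_FACET_GROUPS = 10  # "fewer than 11 groups", per the rule above


def facet_rows(rows, facet_key):
    """Split rows (dicts) into one list per distinct value of facet_key."""
    panels = defaultdict(list)
    for row in rows:
        panels[row[facet_key]].append(row)
    if len(panels) > MAX_FACET_GROUPS:
        raise ValueError(
            f"{facet_key!r} has {len(panels)} groups; at most {MAX_FACET_GROUPS} are allowed"
        )
    return dict(panels)


# Hypothetical rows with x, y, and a facet variable "region".
rows = [
    {"region": "East", "x": 1.0, "y": 2.0},
    {"region": "West", "x": 2.0, "y": 3.5},
    {"region": "East", "x": 3.0, "y": 4.1},
]

panels = facet_rows(rows, "region")
for name, panel in sorted(panels.items()):
    print(name, len(panel))
```

Each resulting panel can then be plotted or summarized on its own, in the same way as the full dataset.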