Just the Basic Stacks

Time for another episode of Hashtag TheEmailThatBecameABlogPost. Today, it’s some funky labelling of a stacked column chart in Deneb/Vega-lite.

The request was for a stacked column chart with grand totals per column, and data labels per section showing the percentage of the total of each, centre-aligned within the section.

A stacked column chart is pretty easy in Vega-lite (there’s an example in the doco after all). But for this chart, we’re going to have to use a few transforms to:

  • Calculate the percentages of each category and the overall total
  • Calculate some values to help centre-align our data labels

Usually, you’d encode a colour (or color) channel to automagically stack values (see here). We’re going to do this manually to allow us to calculate the data label values and positions.

Data Prep

The test data for the chart looks something like:

DateCategoryValue
2021-01-01A12
2021-02-01A5
2021-01-01B1
2021-02-01B3
2021-01-01C10
2021-02-01C8

First, we’ll need to aggregage our values into a single sum per Category per Date:

"transform": [
        {
            "aggregate": [
                {
                    "op": "sum",
                    "field": "Value",
                    "as": "sum_value"
                }
            ],
            "groupby": [
                "Category",
                "Date"
            ]
        },
...

We need the grand totals per Date for one set of labels, and also to calculate the percentage of each Category per Date. We’ll use a joinaggregate:

    {
        "joinaggregate": [
            {
                "op": "sum",
                "field": "sum_value",
                "as": "total_value"
            }
        ],
        "groupby": [
            "Date"
        ]
    },

Now we can calculate the values for the percentage labels:

    {
        "calculate": "datum.sum_value / datum.total_value",
        "as": "percent_value"
    },

To centre align these percentage label, we need to find the mid point of each group. We can stack by Date and sort by Category:

    {
        "stack": "sum_value",
        "as": [
            "mid_start",
            "mid_end"
        ],
        "groupby": [
            "Date"
        ],
        "sort": [
            {
                "field": "Category",
                "order": "descending"
            }
        ]
    },

Finally, we’ll calculate the centre of each section to position the percentage label nicely:

    {
        "calculate": "datum.mid_start + (datum.mid_end - datum.mid_start) / 2",
        "as": "mid"
    }

Layout

We’ll need three layers, one for the columns, one for the percent labels and one for the overall totals.

     "encoding": {
        "y": {
            "title": "Percent & Total"
        },
        "x": {
            "field": "Date",
            "type": "ordinal",
            "timeUnit": "yearmonth",
            "title": "Date"
        }
    },
    "layer": [
        {
            "mark": {
                "type": "bar"
            },
            "encoding": {
                "y": {
                    "field": "sum_value",
                    "type": "quantitative"
                },
                "color": {
                    "field": "Category"
                }
            }
        },
        {
            "mark": {
                "type": "text",
                "color": "white"
            },
            "encoding": {
                "y": {
                    "field": "mid",
                    "type": "quantitative"
                },
                "text": {
                    "field": "percent_value",
                    "format": ".1%"
                }
            }
        },
        {
            "mark": {
                "type": "text",
                "color": "black",
                "dy": -10
            },
            "encoding": {
                "y": {
                    "field": "total_value",
                    "type": "quantitative"
                },
                "text": {
                    "field": "total_value"
                }
            }
        }
    ]

Putting it all together gives:

{
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"name": "dataset"},
    "transform": [
        {
            "aggregate": [
                {
                    "op": "sum",
                    "field": "Value",
                    "as": "sum_value"
                }
            ],
            "groupby": [
                "Category",
                "Date"
            ]
        },
        {
            "joinaggregate": [
                {
                    "op": "sum",
                    "field": "sum_value",
                    "as": "total_value"
                }
            ],
            "groupby": [
                "Date"
            ]
        },
        {
            "calculate": "datum.sum_value / datum.total_value",
            "as": "percent_value"
        },
        {
            "stack": "sum_value",
            "as": [
                "mid_start",
                "mid_end"
            ],
            "groupby": [
                "Date"
            ],
            "sort": [
                {
                    "field": "Category",
                    "order": "descending"
                }
            ]
        },
        {
            "calculate": "datum.mid_start + (datum.mid_end - datum.mid_start) / 2",
            "as": "mid"
        }
    ],
    "encoding": {
        "y": {
            "title": "Percent & Total"
        },
        "x": {
            "field": "Date",
            "type": "ordinal",
            "timeUnit": "yearmonth",
            "title": "Date"
        }
    },
    "layer": [
        {
            "mark": {
                "type": "bar"
            },
            "encoding": {
                "y": {
                    "field": "sum_value",
                    "type": "quantitative"
                },
                "color": {
                    "field": "Category"
                }
            }
        },
        {
            "mark": {
                "type": "text",
                "color": "white"
            },
            "encoding": {
                "y": {
                    "field": "mid",
                    "type": "quantitative"
                },
                "text": {
                    "field": "percent_value",
                    "format": ".1%"
                }
            }
        },
        {
            "mark": {
                "type": "text",
                "color": "black",
                "dy": -10
            },
            "encoding": {
                "y": {
                    "field": "total_value",
                    "type": "quantitative"
                },
                "text": {
                    "field": "total_value"
                }
            }
        }
    ]
}

The final result: