Replacing numeric calculated columns with measures
Calculated columns and measures are different in the way they are stored and calculated within Power BI. When calculated columns are added to a data model, the values are calculated and stored for every row. This means that if there are 1,000 rows in a table and you add a calculated column for Margin (using a formula taking sales minus production costs), then for each row of the table, the data model would store a value for Margin. This data would be stored and updated whenever the model was refreshed so there are storage and performance implications when using many calculated columns.
Measures are not stored but instead calculated when used on the fly. Instead of creating a calculated column for Margin, you could instead create a measure that performs the same calculation, but the value would not be stored in the data model but instead be computed when the measure is used in a report visual.
For Power BI to use a measure to replicate what a calculated column does, the measure would have to be calculated row by row.
Normal Power BI statistical functions use a single table column as the only argument for their calculations, functions such as SUM, COUNT, and AVERAGE.
The X functions
Statistical functions also have a separate X function, such as SUMX, COUNTX, and AVERAGEX, and these take two arguments, a table and an expression. The expression is calculated in row context for the provided table.
We could recreate our gross revenue calculation using the SUMX function like so:
- Original:
Gross Revenue = Sales[Quantity] * Sales[UnitPrice]
- SUMX:
Gross Revenue X = SUMX(Sales, Sales[Quantity] * Sales[UnitPrice])
This gives us the following table:
Figure 7.10 – Both calculations return identical results
Note that there is a cost associated with the X functions, as we do not have the data already calculated, so Power BI will have to calculate it every time. But in many situations, the savings in model size, especially the model’s memory footprint, may be worth it.
So, when do we use a calculated column and when do we use a measure?
When to use calculated columns
Calculated columns are ideal for fixed values. For example, if you want the fixed value of the year number parsed out from an entire column storing the date, the value in the calculated column would always be the same based on the date. Parsing and calculating the year value 2022-06-01 would always return 2022.
Likewise, if you want to add any text values to a table, then it’s best to use a calculated column. For example, if you wanted to parse the name of the month from the date, then the calculated value would be the same based on the date. Parsing and calculating the text month value 2022-06-01 would always return July.
Calculated columns must be used if you plan to filter by a value stored. You can add a calculated column to a visual, page, or all pages filter in a Power BI report.