The data tables and charts feature allows for basic data visualisation in the Opus 2 Platform. The feature uses derived data from worksheet content configured in either tabular or graphical form. No data can be added directly to a table or chat, it must be added to a relevant worksheet before it can be used for data charting purposes.
TABLE OF CONTENTS
- Charting and visualisation principles in Opus 2 platform
- Configuring a data table
- Configuring a chart
- Displaying data tables and charts
Charting and visualisation principles in Opus 2 platform
The results that can be achieved with data tables and charts are sometimes similar to what might be done in a spreadsheet tool such as Excel, but the execution is different.
In a spreadsheet, the user starts with the data and performs manipulations on it such as sorting, filtering, adding rows or creating charts in order to generate the desired derived outputs. If they have separate data in a new sheet, then all the manipulations need to be repeated. The repetition can be avoided to some extent by defining functions and reusable macros that are then re-run when data changes.
By contrast, in Opus 2 platform, the process starts at the other end: rather than manipulating data directly, the user starts by configuring the operations that will be performed on the data. A preview is shown of what the steps would yield given the current contents of the source worksheets, but should this content ever change, the construction of the derived output tables and charts is automatic.
It can be useful to think of the tables and charts feature as allowing the user to define processing "pipelines". Once the pipeline has been defined, content can be pushed through it repeatedly without further intervention. Pipelines like this can also be stored in project templates for re-use across any project created from that template.
The starting point for any tables and charts pipeline is choosing the worksheet holding the source data in order to generate a data table. Each table is based on a single worksheet but it may bring in multiple different fields from that worksheet.
If there is a need to show data from multiple worksheets then this must be done at the worksheet configuration level by establishing connections between worksheets with remote record fields and bringing in the required data from remote worksheets with derived fields.
Once a worksheet is selected, the next choice is how the rows of the data table should be related to the records in the worksheet. An initial table or column needs to be defined with one of three principal display options as well as optional configurations based on data type and filters if needed.
The data table must be opened in edit mode in order to add configuration options display resulting data. The following options are available:
Display table by:
All rows - One row in the table for each record in the worksheet. This can be used to produce charts that show each record in the worksheet and allows numerical data to be brought into the table as-is where it can be manipulated with numerical operations. If the initial source column is configured to display table by all rows, any subsequent columns brought in from the source worksheet must be of type numeric or currency.
Grouped by data type - Multiple records in the worksheet 'grouped' into a single row in the table where the allocation of records to rows is determined by a selected property of the records. The following data types may be grouped in this way:
Short text
Number
Currency (single currency configuration only)
Choice
Date
Single remote record
Date added
Added by
Date updated
Updated by
Combine all rows - All rows in the worksheet aggregated into the same row in the data table. In this case the table has only one row. This is useful for showing totals or averages of numerical data in worksheets.
Options:
Numbers rounded to x decimal places. Optional. Can be set to 0, 1 or 2.
Ignore unused values. Applies only to data grouped by type choice or single remote record. If this is selected, an option that does not have any associated records will not be displayed in the table. If not selected, the option will be displayed and associated records will be set to '0'
Define number and date ranges. Applies to data grouped by type number, currency (single currency configuration only), date, date added, date updated. Allows numeric and date-related data to be aggregated into ranges. See 'Specifying rows aggregations' below.
Filter: Optional filtering rules can be applied to the source worksheet data to further refine the data displayed in the table.
In order to aggregate multiple worksheet records into a single table row, the records must share some property in common. The simplest case of this is where the aggregation is performed on a choice field. In that case, the table has one row for each possible value of the choice field in the worksheet.
Other cases work on numeric or data fields and require the data to be "bucketed" into disjoint ranges. Each such range then gives a single rows in the worksheet.
For bucketing on dates, only a bucket size is required. This can be measured in days, weeks, months or years. The boundaries of the buckets are then based on the natural boundaries such as the start of a year, the first day of the month, or the start of a week. Midnight between Saturday and Sunday is considered the start of a new week when a week-based range is defined.
For bucketing numerical values there are no such natural boundaries. Instead, the configuration allows setting either the total number of buckets or the bucket size. Where the total number is set, then the range from the smallest to largest values present in the data is divided into that many uniform intervals. Where the size is set, the bucket boundaries are at fixed intervals of this size from the smallest value in the data. Alternatively, the user can set the bucket boundaries manually.
As well as being sourced from worksheet data, columns can be defined that take their content from other columns in the table. Each cell in the new column takes its value from one or two other cells in the same row depending on the operation being applied. For a single source column the only operations supported are the two unary numerical operations negation and taking the absolute value. For two source columns the new column can be populated with the sum, difference, product or quotient of the sources values corresponding to the standard binary mathematical operators plus, minus, times, and divided by.
Where tables are defined to have fewer rows than the source worksheet such that multiple rows in the worksheet contribute to a single row of the table, it is often useful to subdivide the values according to some other property of the source records before aggregating. This means that a single source field in the worksheet can give rise to multiple columns in the table. Disaggregation like this is only supported on choice fields.
As an example to illustrate a typical use case, consider a worksheet containing the price of apples of different varieties in different supermarkets sampled at different times. Each rows contains one data point for one variety of apple in one supermarket at a particular time. Now consider a table derived from this worksheet by aggregating on variety. There would be one row or each variety of apple in the worksheet. Adding a column with the aggregation function set to "average" would then yield a table that showed the average price of each apple variety in the worksheet. But this average would lump samples from all the different supermarkets together. What if you want to split them out by supermarket? That is the purpose of the column splitting operations. The configuration requires a choice field to be selected from the original worksheet and then creates a separate column for each value of this choice field. Cells in these columns use the specified aggregation function but only include the rows from the worksheet where the choice field in the record takes the value corresponding to that column.
Once a data table has been configured, this can be used as the source for a visualisation displayed in a chart. Each chart takes its data from a single table and may show one or multiple columns from that table.
Depending on the chart type, colours may be taken from the definition of the worksheet, such as the colours assigned to choice values, or may use the colours allocated to table columns directly. If no colours are supplied, then successive elements from a standard palette are used.
The following types of charts are supported:
Pie and Donut charts show a circle or annulus split into radial segments where the segment sizes match the values in the rows of the table. If multiple colunns are selected then these are shown as successive concentric rings.
Bar charts show horizontal or vertical bars based on the cell values in the selected column. Where the source is a split-column, each bar is split into separate segments according to the individual column values. If multiple columns are selected then each column gives one bar for each row in the table.
Line charts: these are like bar charts except the values are represented as points joined by lines.
Once a data table or a chart has been configured, there are two spaces within a project where this may be displayed: on the internal project dashboard, and on a portal page.
On the project dashboard, data from a table and any chart that has been configured can be displayed in the Chart and Data tile widgets. For more information about how to configure project dashboards, please refer to the Project dashboards section of this document.
On a portal, charts may be displayed on any portal page via a Chart component. For more information about how to configure a portal page, please refer to the portal page editor section of this document