Step by step Analytic models creation and usage

Follow these simple steps to create your Self-Service BI 'Analytic Model' based on your data in minutes. Kyubit analytic models are designed to be created with minimum of user interaction, easy to comprehend and utilize.


Create Model from CSV File

Data stored in CSV file, delimited by semicolon (;), can be quickly uploaded to Kyubit application and immediately is ready for the step of 'Data Definition'.
Numeric and date values in CSV file should respect current Kyubit BI server regional settings format for decimal separator and date format.



Create Model from Query Results

If the data for analytic model are based on existing data from relational databases, create SQL query that will be used to retrieve data for new model. Any valid SQL query could be used to run against SQL server or ODBC data sources registered in Kyubit application.



Data Definition

When source for new analytic model is defined (CSV file or query), next step is to define role of each column in provided data set on the 'Data' tab. Possible definition for the columns are 'Measure', 'Dimension', 'Date' and 'Details'.

Measure

Measure is value from our data set that will be used for aggregations while analyzing analytic model. Typically, using Sum, Avg, Count and other aggregations. Measure has to be of numeric values in Kyubit Analytic Model.

Dimension

Dimension values are used to slice and analyze measure values, used in required context. Analytic results are based on measures and dimension members on categories and series.

Date

Date column could be used as filter in analysis. Because analyzing business data mostly include filtering based on certain period, this is extremely useful element for analysis. Date cannot be used on analysis categories and series.

Details

Some information from data set are not good analytic material to be used on analysis categories and series, but should only be available when asked for details of aggregated data. For example, information such are 'address' and 'phone number', we are hardly going to use as aggregating data, but still we like to use as 'Details' on aggregated results. Setting such columns as 'Details' ensures more compact analytic model, faster processing and smaller size on the disk.




Column Caption and Description

While defining data for analytic model, optionally click on the column name and set its caption (if should be different from source) and Column Description.

Format Values

Measure values should be provided as pure numeric value. To configure measure to be presented as formatted numeric value (Currency for example), click on the "Format values" below column name and choose one of the formatting options or write your own.

Organize Structures Additionally

When each column definition is defined, we can immediately proceed to Processing of Analytic Model, but on the third tab 'Structure' there are additional options to organize analytic model structures. Instead to leave all dimension levels in 'Default dimension', we can create new dimensions and organize levels appropriately to be more comprehensive for end-users.

New Dimension

Create new dimension based on some topic (for example, 'Customer' or 'Product') and assign appropriate analytic levels to this dimension. This way you group analytic levels to certain topic and makes analysis more comprehensive.

New Hierarchy

When analytic levels are related to each other in parent-child relation, it is practical to organize them inside 'Hierarchies'. For example, Year-Month-Day or Continent-Country-City. This way makes analysis easy to drill-down data from higher view to more detailed values for end-user.




Processing of Analytic Model

While preparing analytic model, user can save and open same analytic model many times, which is in 'Unprocessed' status. When all data and structures are prepared, click the 'Process Analytic Model' button on the 'Structure' tab to actually start processing of analytic model data and make it ready for analysis and visualizations. Process could take from few seconds to several minutes (or more) depending on number of rows and columns defined for analytic model. All columns not required to be 'Dimension' set to 'Details' structure type, which will speed processing and save space in the models database.

Analytic Model Status

There are 3 analytic model status, 'Unprocessed', 'Processing' and 'Processed'. While analytic model is processing, it cannot be used by any user. If processing fails for any reason, it reverts to 'Unprocessed' status.

Log

After processing of analytic model, details of processing could be inspected by clicking on the 'Log' button in 'General' tab. If process of analytic model has failed, this is good place to start troubleshooting for the possible cause of the problem.

Schedule model updates

Analytic model could be updated automatically with 'Scheduled Jobs'. Go to Schedule -> Jobs and create new Job 'Process Analytic Model' and time preference for updates to occur. If Analytic model is based on Query, same query will be run against defined data source to bring fresh data into Analytic model. If Analytic model is based on CSV file, schedule job can be created only if CSV file is uploaded from shared folder and path begin with "\\..." (For example, \\SomeMachine\FolderWithData).





Set Offline / Online

After Analytic model is processed, it is ready to be used. For any reason, user with permissions could take Analytic model to 'Offline' and 'Online' status and manage its availability to end-users.


Usage in Analysis and Dashboards

After Analytic model is processed, it is ready for analysis and visualizations. Create analysis and reports in grid/chart view by adding measures and dimensions on categories/series axis or slicer axis. Use ordering, aggregate, isolating and other analytic actions to prepare analytic report of your interest. Once analysis is prepared it could be added to Dashboard and be visualized by many charts and visualization widgets, while at the same time data could be further analyzed by the end-user working with the dashboard (Drill-down, drill-thourgh, drill-by, slicing and other useful analytic actions).


Processing Performance with larger sets

To describe relation between amount of data and processing time, we prepared series of test on various machines to get and aggregated impression of the relation between number of rows and average processing time. For test purposes we used Microsoft Contoso BI Demo Dataset for Retail Industry sample database available to download.

Processing was done every time with 5 measures and 15 dimensions.