Version 5.3.4.0: askiaanalyse

This post is the second in a series of two that detail the key features in version 5.3.4.0 of askiadesign & askiaanalyse.

Much development work has gone into 5.3.4.0, which brings a whole host of great new features. A detailed list of these can be found in the version roadmap. Below are five of these features.

Improvement of inverted data format

Inverted data is stored and read variable by variable rather than respondent by respondent. This generally speeds up processes such as downloading, exporting, importing and recalculating. It is an alternative data format which stores data in a .dat folder rather than in just a .qes file.
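As a rough illustration of the idea (not Askia's actual file format), the sketch below turns a small, made-up set of per-respondent records into one list per variable. The variable names and values are hypothetical.

```python
# Hypothetical survey data: one record per respondent (respondent-oriented).
interviews = [
    {"id": 1, "gender": 1, "age_band": 3, "region": 2},
    {"id": 2, "gender": 2, "age_band": 1, "region": 4},
    {"id": 3, "gender": 2, "age_band": 2, "region": 2},
]


def invert(records):
    """Turn a list of per-respondent records into one list per variable."""
    inverted = {}
    for record in records:
        for variable, value in record.items():
            inverted.setdefault(variable, []).append(value)
    return inverted


inverted = invert(interviews)

# Tabulating one variable now only needs that variable's list,
# instead of walking through every respondent record.
age_band_values = inverted["age_band"]
print(age_band_values)  # [3, 1, 2]
```

With this layout, recalculating or exporting a single question only touches the data for that question, which is the intuition behind the speed-up described above.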

The ‘Re-invert database’ option in Askia Analyse is used to invert the data and the ‘Open an inverted survey’ option is used to access this data format in Askia Tools.

In Analyse 5.3.4.0 a number of improvements have been made to this format:

We have decided to rename the old files and not use the number of responses as an extension – this was causing unnecessary problems when a question was recalculated or if the max number of responses was manually changed in Design. The extension of the inverted data files is always .dat.

The developed questions in a loop are no longer stored individually – the grayed questions always contained the information, so we thought the cost of reading the whole grayed data for that question was a small price to pay compared to storing all the data twice. This significantly decreases the number of files and lets you easily change your mind about the “Develop level” option in Design.

We store the system data – or peri-data – such as start time, end-time, IP-address, completion, …

The data is compressed, which can reduce the file size by up to 95% (versions prior to 5.3.4.0 stored a simple uncompressed copy of the memory).

We have backwards compatibility: inverted data produced with a version prior to 5.3.4.0 is read properly in Analyse. The new inverted files are readable by Vista but not by older versions of Analyse or Tools.
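To get a feel for why an uncompressed memory copy of survey answers shrinks so much, here is a minimal sketch. It assumes a made-up column of 5-point scale answers held as 4-byte integers and uses Python's zlib as a stand-in. The compression method Askia actually uses is not stated here, so the exact ratio will differ.

```python
import random
import struct
import zlib

random.seed(0)

# Hypothetical column: 100,000 answers to a 5-point scale question,
# held in memory as 4-byte integers (an assumption, not Askia's layout).
answers = [random.randint(1, 5) for _ in range(100_000)]
raw = struct.pack(f"<{len(answers)}i", *answers)

# zlib is only a stand-in for whatever compression Askia applies.
compressed = zlib.compress(raw, level=9)
reduction = 100 * (1 - len(compressed) / len(raw))
print(f"{len(raw)} bytes -> {len(compressed)} bytes ({reduction:.1f}% smaller)")

# The data must survive the round trip unchanged.
restored = list(struct.unpack(f"<{len(answers)}i", zlib.decompress(compressed)))
```

Most of each 4-byte value is zero and only a handful of distinct answers occur, which is why this kind of data compresses well.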

In most tests we have seen an 85-95% decrease in the size of the inverted database and no noticeable deterioration in reading speed. For example, in the last test, I reduced the size of a .dat folder from 4.2 GB to 45.0 MB. To convert from the old to the new format, you can use the following steps:

Open the old-format inverted data in Askia Analyse version 5.3.4.0 or higher.
File > Export a sub-population
Select the new target file > select ‘All interviews’ as the sub-population > check ‘Export as inverted database’. At this stage you can also select any calculated variables to export, or you can do that afterwards by copying / importing the variables from one file to another.

Nested Edges

Prior to Analyse 5.3.4.0, the variables you placed in your edge would be lined up side by side in your edge banner. So, for example, if I had three variables in my edge with 2, 3 and 4 categories, then I would have a single edge banner of 9 categories.

From 5.3.4.0 onwards you have the option to change this so that your edges build upwards and you can have two or more variables (edge banners) layered on top of each other. The result from the example above is that you have 2 x 3 x 4 = 24 crossed categories created.
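The difference between the two layouts can be sketched in a few lines of Python. The variable names and categories are hypothetical; only the 2, 3 and 4 category counts come from the example above.

```python
from itertools import product

# Hypothetical edge variables with 2, 3 and 4 categories.
edges = {
    "Gender": ["Male", "Female"],
    "Age": ["18-34", "35-54", "55+"],
    "Region": ["North", "South", "East", "West"],
}

# Side by side: each category of each variable is its own banner entry.
side_by_side = [(var, cat) for var, cats in edges.items() for cat in cats]

# Nested: every combination of one category from each variable.
nested = list(product(*edges.values()))

print(len(side_by_side))  # 9
print(len(nested))        # 24
print(nested[0])          # ('Male', '18-34', 'North')
```

Each nested entry is one crossed category, which is why the count grows multiplicatively as you layer more variables.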

The benefit of Nested edges is that they allow you to have many crossed variables / dimensions in your tables. Previously you would be limited to three before you had to start thinking about using scripts, crossed calculated variables or sub-populations.

Without setting:

With setting: – order of nested edges depends on order of variables in Edge

You can keep layering them up e.g. if we add another variable, ‘Agreement’, into our edges:

We get the following:

It’s worth noting that you cannot have a hybrid situation where you have some edges layered and some not. You will have to manage this using your columns as well as your edges.

Example of usage

Sig Test Options Stored In Variable

Up to now, it was only possible to store col sig options in the tab template. Therefore, the same tab template would need to be replicated for small tweaks in the testing. The purpose of this new feature is to enable more flexibility by storing the col sig options in the column profile and, by doing so, to reduce the number of tab templates required.

The closest relation we have to this new feature in versions prior to 5.3.4.0 is ‘Specify columns’.

The only difference with the latest option is that the Test cols test is not entered into the advanced options (above left) but instead into the column profile, by right-clicking in the columns and selecting ‘Col-sig testing’ (below).

Since these options can be stored in the columns, they will also be saved in a profile or a portfolio.

Example of usage

Excel Export Options

Prior to Analyse 5.3.4.0, we had the set of options (A) below. Since 5.3.4.0 we have the set of options (B), which allows more flexibility in organising your Excel tabulations.

The following is a description of what these additional options do:

New sheet for each portfolio chapter

When you are in the portfolio view you can right-click > Insert a chapter. This is useful for keeping large portfolios tidy by organising the tab definitions into sections. With the ‘New sheet for each portfolio chapter’ option set, you can now export each of these chapter sections to its own sheet in Excel. So the below will give you 2 sheets. Sheet 1 has 16 separate tables and sheet 2 has 10 separate tables.

New sheet for each tab definition

A tab definition is created in a portfolio every time you press the yellow folder with green arrow icon in the view below:

In the example below we have created 3 tab definitions. The shown settings will output 3 sheets in Excel. Sheet 1 has 8 separate tables, sheet 2 has 8 separate tables and sheet 3 has 10.

New sheet for each edge

We have 5 variables in the rows and 4 responses in the edge. Our settings dictate that we therefore have 5 x 4 = 20 tables.

With the old settings we could select:

‘One tab per sheet’ – Yes, which would give us 20 sheets in our Excel output or

‘One tab per sheet’ – No, which would give us 1 sheet in our Excel output.

Now with the following settings we can create an Excel output which has 4 sheets, one for each edge response. Each of the 4 sheets contains 5 separate tables, one for each variable in the rows.

It’s worth noting here that if you were to also set ‘One tab per sheet’ – Yes, and rerun your tables, the output would be 20 sheets. This setting overrides the others. Furthermore, if you use the three settings in conjunction with each other, the setting which produces the greater number of tables will take priority.
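The sheet counts for the 5 x 4 example can be summarised with a small sketch. The function and parameter names are hypothetical, and it only models the rows-by-edge case described above, reading the priority rule as "the setting that yields the most sheets wins".

```python
def excel_sheet_count(n_row_variables, n_edge_responses,
                      one_tab_per_sheet=False, new_sheet_per_edge=False):
    """Number of Excel sheets for a rows x edge run (illustrative only)."""
    n_tables = n_row_variables * n_edge_responses
    candidates = [1]  # default: every table on a single sheet
    if new_sheet_per_edge:
        candidates.append(n_edge_responses)  # one sheet per edge response
    if one_tab_per_sheet:
        candidates.append(n_tables)  # one sheet per table
    # Assumption: the setting that produces the most sheets takes priority.
    return max(candidates)


print(excel_sheet_count(5, 4))                           # 1
print(excel_sheet_count(5, 4, new_sheet_per_edge=True))  # 4
print(excel_sheet_count(5, 4, one_tab_per_sheet=True))   # 20
print(excel_sheet_count(5, 4, one_tab_per_sheet=True,
                        new_sheet_per_edge=True))        # 20
```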

Weighting Efficiency

The weighting efficiency is a measure of how much skewing we have had to do in order to get the weights to converge – the result is a percentage. The closer it is to 100% the less skewing.

From 5.3.4.0 onwards, when you run a weighted set of tables, a file called “Report of weights.txt” is created in the .dat folder that belongs to the .qes file you have opened in Analyse. If the .dat folder doesn’t already exist, it will be created.

An explanation of the weighting efficiency formula can be found here.

Reviewing the attached WeightingEfficiency.xlsx file should also help you understand how we arrive at the weighting efficiency figure.

This looks at the 24 interviews we have in Data.qes and, for each interview, lists the weighting factor in column E. The range L2:U36 works out, step by step, the numbers required to arrive at the weighting efficiency of 80.05722937%.
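The formula itself is not reproduced in this post. As a point of reference, a commonly used definition of weighting efficiency is Kish's, 100 x (sum of weights)² / (n x sum of squared weights). The sketch below assumes that definition; it is not confirmed to be the exact calculation Analyse or the spreadsheet performs, and the weights used are made up.

```python
def weighting_efficiency(weights):
    """Kish-style weighting efficiency as a percentage (assumed definition)."""
    n = len(weights)
    total = sum(weights)
    sum_of_squares = sum(w * w for w in weights)
    return 100 * total ** 2 / (n * sum_of_squares)


# Equal weights mean no skewing at all.
print(weighting_efficiency([1, 1, 1, 1]))  # 100.0

# Two interviews weighted 0.5 and 1.5: 100 * 2**2 / (2 * 2.5)
print(weighting_efficiency([0.5, 1.5]))    # 80.0
```

Under this definition, the more the weights vary from interview to interview, the further the figure drops below 100%.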

In WeightingEfficiency.rar there is a .qes and weighted portfolio. If you open these and run the results, you should see the following in the Report of weights.txt created in the .dat folder:

The weighting info on the first line relates to the weighting options you have set:

-Min weight- -Max weight- -Number of possible iterations- -Accuracy-

The maximum weight displayed in this file is different to what you’ll see in the weighting options because the base calculated by summing the weighting factors (24.82096438) is different to the unweighted base (24).

If you interview 100 people, and you set your base to be 1,000,000, a “normal” interview would have a weight of 10,000.

If you set the maximum weight to 5, it means the algorithm will ensure that no interview is allocated a weight more than 5 times that of a normal interview. In this particular example, it means it will keep all interviews’ weights under 50,000.
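The arithmetic behind this can be sketched as follows, taking a base of 1,000,000 so that the normal weight comes out at 10,000. The function name is hypothetical and this is only the calculation described above, not Askia's weighting algorithm.

```python
def weight_cap(n_interviews, weighted_base, max_weight):
    """Return the 'normal' weight and the largest weight the cap allows."""
    normal_weight = weighted_base / n_interviews
    return normal_weight, max_weight * normal_weight


normal, cap = weight_cap(n_interviews=100, weighted_base=1_000_000, max_weight=5)
print(normal)  # 10000.0
print(cap)     # 50000.0
```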