{
Responsible for this page:
Mikael Jern
Last updated: 2009-10-11

The choropleth map component provides a novel, fully Web-enabled implementation of a layered architecture for simultaneous transparent views of multiple map layers. Each class of spatial information is represented by its own layer, e.g. glyphs, shaded map and Google map. These layers can then be combined and individually displayed, hidden or made transparent, depending on the needs of the user.
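The layer idea can be illustrated with a minimal sketch. This is not the GAV Flash API: the class names `MapLayer` and `LayerStack`, the layer identifiers, and the bottom-to-top draw order are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class MapLayer:
    """One class of spatial information, e.g. glyphs or a shaded map."""
    name: str
    visible: bool = True
    opacity: float = 1.0  # 0.0 = fully transparent, 1.0 = fully opaque


class LayerStack:
    """Ordered list of layers; the first entry is drawn at the bottom."""

    def __init__(self, layers):
        self.layers = list(layers)

    def _find(self, name):
        for layer in self.layers:
            if layer.name == name:
                return layer
        raise KeyError(name)

    def hide(self, name):
        self._find(name).visible = False

    def show(self, name):
        self._find(name).visible = True

    def set_opacity(self, name, opacity):
        # Clamp so callers cannot push opacity outside [0, 1].
        self._find(name).opacity = max(0.0, min(1.0, opacity))

    def draw_order(self):
        """Return (name, opacity) for every visible layer, bottom to top."""
        return [(layer.name, layer.opacity) for layer in self.layers if layer.visible]


# Hypothetical stack with the four layer types named in the figure caption.
stack = LayerStack([
    MapLayer("google_map"),
    MapLayer("country"),
    MapLayer("shaded_map"),
    MapLayer("pie_glyphs"),
])
stack.set_opacity("shaded_map", 0.6)  # let the Google map show through
stack.hide("country")
print(stack.draw_order())
```

Keeping visibility and opacity as per-layer state means the user can change one layer without touching the others, which is the property the layered architecture relies on.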

Figure: GAV Flash Map Layer Component with four map layers: pie glyphs, shaded map, country and Google map.
You can test this Demonstrator at:
http://vitagate.itn.liu.se/GAV/eXplorer/OECDRegional/
eXplorer helps the analyst see patterns of events, relationships, and interactions over time within a geospatial context. The space-time-attribute data cube is used to conceptually explain the methodology behind eXplorer’s data handling. The data cube has three dimensions: geography (OECD regions), time (years), and attributes (indicators such as age groups, education, etc.). Each cell in the cube is defined by a specific spatial object (e.g. Liguria, Italy), a specific time (e.g. the year 1999), and a selected indicator (e.g. age group 65+). The value at that cell is the indicator value for that combination (Liguria, 1999, age 65+). Each region is represented by a horizontal slice in the data cube. Selected regions are represented as “time profiles” in the Time Graph above for a given indicator.
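As a rough illustration of the data cube, the sketch below stores cells in a dictionary keyed by (region, year, indicator) and extracts a time profile for one region. The storage layout, the function names and the indicator key `age_65_plus` are assumptions, and the numbers are placeholders, not OECD statistics.

```python
# Placeholder values only; they are NOT real OECD statistics.
cube = {
    ("Liguria", 1999, "age_65_plus"): 25.0,
    ("Liguria", 2000, "age_65_plus"): 25.5,
    ("Liguria", 1999, "education"): 10.0,
    ("Sicilia", 1999, "age_65_plus"): 16.0,
}


def cell(cube, region, year, indicator):
    """Look up one cell: a region, a year and an indicator."""
    return cube[(region, year, indicator)]


def time_profile(cube, region, indicator):
    """All (year, value) pairs for one region and one indicator, sorted by year."""
    return sorted(
        (year, value)
        for (r, year, ind), value in cube.items()
        if r == region and ind == indicator
    )


print(cell(cube, "Liguria", 1999, "age_65_plus"))
print(time_profile(cube, "Liguria", "age_65_plus"))
```

Fixing the region and the indicator while letting the year vary is what yields a "time profile" in the sense used above.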

Figure: All five views are dynamically time-linked and update immediately and smoothly. Liguria, a region in Italy with a high elderly population (65+), is highlighted in all views. See also the web site for a demonstration:
http://vitagate.itn.liu.se/GAV/eXplorer/OECDRegional/
This research demonstrates and reflects upon the potential synergy between a squarified treemap and a dynamically linked choropleth map for visualizing social science regional data at several hierarchical levels, with the aim of discovering statistical patterns that relate to significant characteristics of the regions under study.

Figure: A squarified treemap ordered by population size and NUTS1 regions, dynamically linked to a choropleth map, both coloured by population for age group 65+, and applied to a limited (Italy) OECD regional hierarchical dataset. The hierarchical data structure is based on 5 levels: continent, country, NUTS1, NUTS2 and NUTS3.
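A treemap sizes each parent cell by the sum of its children, so the hierarchy has to be rolled up before layout. The sketch below shows that roll-up on a nested dictionary. The tree layout, the region names such as `nuts3_a`, and the population figures are hypothetical placeholders, and the squarified layout step itself is not shown.

```python
def total_population(node):
    """Sum the leaf populations below a node; leaves carry a 'population' key."""
    children = node.get("children")
    if not children:
        return node["population"]
    return sum(total_population(child) for child in children)


# Hypothetical fragment of the five-level hierarchy with placeholder numbers.
europe = {
    "name": "Europe", "level": "continent", "children": [
        {"name": "Italy", "level": "country", "children": [
            {"name": "nuts1_x", "level": "NUTS1", "children": [
                {"name": "nuts2_x", "level": "NUTS2", "children": [
                    {"name": "nuts3_a", "level": "NUTS3", "population": 100},
                    {"name": "nuts3_b", "level": "NUTS3", "population": 250},
                ]},
                {"name": "nuts2_y", "level": "NUTS2", "children": [
                    {"name": "nuts3_c", "level": "NUTS3", "population": 50},
                ]},
            ]},
        ]},
    ],
}

print(total_population(europe))
```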
We are interested in finding out which methods and tasks are important when exploring hierarchical demographic data, such as general overview, trends over time, geographical patterns, indicator correlation, outliers, and the simultaneous mapping of two dependent indicators such as age group and total population. For example, a choropleth map always allocates screen space according to geographical area rather than an indicator of interest. Can the treemap compensate for this weakness, and are its known strengths and weaknesses applicable to the demographic data domain?

Figure: Linked treemap and choropleth map showing the ratio of children in the European OECD member countries. The colour of each region represents the percentage of the total population that falls within the 0-14 age group. Cell size in the treemap shows the total population.
Red areas have a high ratio of children and blue areas have a low ratio. Cell size in the treemap represents the total number of people in each region.
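The two encodings described above (colour for the ratio of children, cell size for total population) can be sketched as follows. The linear blue-to-red ramp, the function names and the sample regions are assumptions; the actual colour scheme used by the component is not documented here. The ramp endpoints of 9% and 50% are taken from the range mentioned in the text.

```python
def child_ratio(children, total):
    """Share of the population in the 0-14 age group."""
    return children / total


def ratio_to_rgb(ratio, low, high):
    """Linear blue-to-red ramp: low maps to (0, 0, 255), high to (255, 0, 0)."""
    t = (ratio - low) / (high - low)
    t = max(0.0, min(1.0, t))  # clamp values outside the ramp
    red = round(255 * t)
    return (red, 0, 255 - red)


def area_shares(populations):
    """Fraction of the treemap area each region receives."""
    total = sum(populations.values())
    return {name: pop / total for name, pop in populations.items()}


# Placeholder regions and numbers, for illustration only.
populations = {"region_a": 3_000_000, "region_b": 1_000_000}
children = {"region_a": 450_000, "region_b": 400_000}

shares = area_shares(populations)
colours = {
    name: ratio_to_rgb(child_ratio(children[name], populations[name]), 0.09, 0.50)
    for name in populations
}
print(shares)
print(colours)
```

Because size and colour are computed from different indicators, a small region with many children and a large region with few children remain distinguishable in the treemap, which the choropleth map alone cannot show.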
Four extreme clusters of regions immediately stand out in the choropleth map. South-eastern Turkey is one of them, with children in some parts making up almost half the population. On the opposite end of the distribution, with ratios down to nine percent, lie former East Germany, north-eastern Spain and northern Italy.
The additional information that the treemap provides and the choropleth map lacks comes from the introduction of a second indicator, in this case the size of each region’s population. For instance, it becomes apparent that although Sweden is the third largest of the included countries measured by physical area, it has a population of only nine million, less than Turkey’s Istanbul region alone. It is also possible to see that while Germany and Turkey currently have almost identical total populations, Turkey has far more children. Over the next few decades Turkey is therefore likely to overtake Germany as the European OECD member country with the largest population.
Hi-res image: Ratio of 0-14 yrs, Map + Treemap
Hi-res image: Population change in OECD countries
This research introduces a Visual Analytics framework for the support of collaborative explorative data analysis (EDA) based on task-relevant visualization components embedded in HTML documents. The goal of our research is to let the analyst visually explore the data and search for answers to various questions about it, while capturing and saving important discoveries, so that insights and knowledge can be shared with remote team members over the Internet. A team will benefit from an interactive collaborative instrument that can “coach” its members in understanding and testing hypotheses, leading to faster understanding and a higher level of confidence in the visual information. This approach builds on the publicly available GeoAnalytics visualization framework and class library (GAV). For more information and demo: SmartDoc

Figure: A split HTML document with an embedded Molecule Map application (right) and descriptive educational text with blue highlighted snapshots (left). High peaks (green) represent a high density of molecules. Red glyphs represent a cluster of molecules that evolves when the user zooms in; the closer the molecules are, the more similar they are. Top: overview 3D landscape map with the requested rectangular focus view. Bottom: focus 2D view with a marked cluster; the names and SMILES of the molecules that belong to this cluster are given.
The GAV framework and class library is the foundation for our Visual Analytics research agenda. GAV is designed to significantly shorten the time and effort needed to develop state-of-the-art VA and GeoAnalytics applications. Through an atomic layered component architecture containing several hundred C# classes, GAV offers a wide range of visual representations (from the simple scatter plot to volume visualization). A component also incorporates versatile interaction methods drawn from many data visualization research areas. We now extend previous research by introducing new means for a developer to extend and further customize some of the popular functional components by breaking them into lower-level “atomic” components.

Figure: GAV component-level architecture. Atomic layered parallel coordinates (PC) components (lines and axes, inquiry and filter, dynamic labels, background, focus & context, histogram, percentiles, etc.) are the building blocks for a higher-level, more advanced functional PC component with integrated statistical analysis. In the next stage, an application developer assembles the enhanced PC component together with scatter plot and scatter matrix components into a multiple-linked application for multivariate data visualization.
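The composition idea can be sketched in a few lines. GAV itself is a C# class library; the Python classes below (`AtomicComponent`, `FunctionalComponent`) and their `render` method are hypothetical stand-ins that only illustrate how atomic parts might be assembled into a functional component, not GAV's actual interfaces.

```python
class AtomicComponent:
    """Smallest building block; handles one aspect of a plot."""

    def __init__(self, name):
        self.name = name

    def render(self, data):
        # A real component would draw; here we only describe the call.
        return f"{self.name}({len(data)} rows)"


class FunctionalComponent:
    """Higher-level component assembled from atomic components."""

    def __init__(self, name, atoms):
        self.name = name
        self.atoms = list(atoms)

    def render(self, data):
        # Delegate to each atomic part in the order it was added.
        return [atom.render(data) for atom in self.atoms]


# Hypothetical parallel coordinates component built from three atomic parts.
pc = FunctionalComponent("parallel_coordinates", [
    AtomicComponent("axes"),
    AtomicComponent("lines"),
    AtomicComponent("histogram"),
])
rows = [{"x": 1, "y": 2}, {"x": 3, "y": 4}]
print(pc.render(rows))
```

Under this arrangement a developer can customize a functional component by swapping or adding atomic parts rather than rewriting the whole component.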
Complex official statistical data that contain geographic locations, time series and multivariate attributes are publicly available from national statistics institutes such as our partner Statistics Sweden (SCB) and from Eurostat. These data can be used for making policy decisions and to help governments, government departments, local authorities, businesses and the general public understand economic, social, demographic, environmental and other matters of interest. Moreover, survey data on specific topics, such as the labour force or household budgets, are regularly collected to keep information on economic and social phenomena up to date. The techniques for attaching socio-economic data to specific locations have improved markedly in recent years. For this case study, we have selected the environmental domain and the supply and use of multivariate energy data for controlling the emission of carbon dioxide among 290 Swedish municipalities during 1990-2005.


Figure: Above is the space-time-attribute 3D data model and corresponding Excel data set. Below is the GeoWizard Time application – Search for space-time-attribute patterns – See the whole through 4 complementary and coordinated regions.

Data sets containing a combination of categorical and continuous variables (mixed data sets) are difficult to analyse since no generalized similarity measure exists for categorical variables. Quantification of categorical variables makes it possible to represent and analyse this type of data using techniques designed for numerical data.
Within this research project a quantification process of categorical data in mixed data sets has been developed. The research aims to utilize the efficiency of statistical data analysis as well as making use of the domain knowledge of an expert user. The quantification process uses statistical analysis to find relationships within the categorical variables. Information on relationships among the continuous variables is incorporated into the process using clustering methods. The process is carried out in an interactive environment using parallel coordinates as a visual interface. Within this environment the user is able to control the quantification process, analyse the result and modify it according to his or her knowledge.
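To make the idea of quantification concrete, the sketch below replaces each category with the mean of a continuous variable over the records in that category. This is a deliberately simple stand-in, not the project's quantification process, which combines statistical analysis, clustering and interactive user control. The function name and the sample data are hypothetical.

```python
from collections import defaultdict


def quantify_by_mean(categories, values):
    """Map each category to the mean of a continuous variable within it."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, value in zip(categories, values):
        sums[category] += value
        counts[category] += 1
    return {category: sums[category] / counts[category] for category in sums}


# Placeholder mixed data: one categorical and one continuous variable.
fuel = ["oil", "wind", "oil", "hydro", "wind"]
emission = [8.0, 1.0, 6.0, 2.0, 3.0]

scale = quantify_by_mean(fuel, emission)
numeric_fuel = [scale[category] for category in fuel]
print(scale)
print(numeric_fuel)
```

Once every category has a numeric value, the variable can be placed on an ordinary axis, for example in parallel coordinates, and analysed with methods designed for numerical data.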
For more information contact Sara Johansson