About.

History

PoinProc is a project that aims to build a bridge between GIS and web mapping.

What is marker clustering

Web map customization relies heavily on what are called markers or pushpins: shapes that locate specific places. These shapes are drawn on top of a base map, and the purpose of these overlays is map customization. The set of shapes is called a vector layer. When the number of markers is high, zooming and panning can leave the map in a nearly useless state, a cloud of jammed markers. The purpose of marker clustering is to maintain map quality when navigating across zoom levels by providing an appropriate display appearance.

What is PoinProc

PoinProc is a computational service, a process launcher, primarily for marker clustering. Each process takes a set of markers from a map vector layer and transforms it into a more suitable zoom-dependent set of geometries. For low zoom levels, the process produces a set of polygons representing big clusters, with the appearance of a thematic map. Each cluster is given information about the markers it represents. For high zoom levels, the results contain fewer changes: clusters are smaller and the appearance is similar to that of the original marker map.

The process transforms a table of marker data, containing at least the geographic coordinates, into an .xml file which, together with a JavaScript file, is read by the web map APIs that actually display a map in a web page. Alternatively, the results can be stored in a text table to be used by web-based GIS services such as Google's Fusion Tables. You can obtain a customized zoom-dependent map without programming skills. To get more functionality, however, you will need some knowledge of these map services' APIs (Google, Bing, etc.).

What PoinProc is not

You are expected to move your results files to your website, or to any storage of your choice, within a few weeks. PoinProc is not intended as data storage.

PoinProc won't share users' data or results, but there are no guarantees of secrecy either. The primary purpose is internet mapping by means of JavaScript map APIs.

Technical details

In order to have a process running and obtain some results that may be understood, your data should meet some requirements. Browse the sample datasets if you haven't done it already.

Requirements.

  1. You’re using a map viewer that provides an API engine, such as Google Maps or Bing Maps.
  2. You’re overlaying vector geometries, such as markers or lines, onto the viewer base maps.
  3. Someone is responsible for creating or managing the geographical content being added.
  4. This service is limited to a range from 100 to 10,000 points.

How it works.

Generalization distances don't refer to actual geographic distances, but to distances as features are printed on a paper map or displayed on a screen. In this case, distances are measured in screen pixels, at least until high-density screens standardize. A value of 15 to 20 pixels is used.
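
To make the idea of screen distance concrete, here is a minimal Python sketch. It assumes the Web Mercator projection with 256-pixel tiles, which is what Google Maps and Bing Maps use; the function names and the 20-pixel default are illustrative and are not taken from PoinProc itself.

```python
import math

TILE_SIZE = 256  # assumption: 256-pixel tiles, as in Google Maps and Bing Maps


def to_pixels(lon, lat, zoom):
    """Project longitude/latitude (degrees) to world pixel coordinates (Web Mercator)."""
    scale = TILE_SIZE * 2 ** zoom
    x = (lon + 180.0) / 360.0 * scale
    sin_lat = math.sin(math.radians(lat))
    y = (0.5 - math.log((1 + sin_lat) / (1 - sin_lat)) / (4 * math.pi)) * scale
    return x, y


def pixel_distance(a, b, zoom):
    """Screen distance in pixels between two (lon, lat) pairs at a zoom level."""
    ax, ay = to_pixels(a[0], a[1], zoom)
    bx, by = to_pixels(b[0], b[1], zoom)
    return math.hypot(ax - bx, ay - by)


def too_close(a, b, zoom, threshold=20):
    """True when two markers would be drawn closer than the threshold (illustrative value)."""
    return pixel_distance(a, b, zoom) < threshold
```

Under these assumptions, two markers at (-56.0930, -15.6100) and (-47.8977, -15.7921) are roughly 6 pixels apart at zoom 0 and roughly 93 pixels apart at zoom 4, so they would be candidates for clustering only at the lowest zoom levels.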

Information needed.

Input data consists of a table with at least two fields containing each marker's latitude and longitude. Each row holds the information for one marker. The database table must be exported to .csv or a similar text file format to be properly parsed by the PoinProc service. The first row must contain the field names.

longitude   latitude   continent   country   class
-56.0930    -15.6100   America     Brazil    250k_500k
-47.8977    -15.7921   America     Brazil    1M_5M
-49.2550    -16.7270   America     Brazil    500k_1M
Database table example - cities

Once you upload your file (.csv, .tab or similar), if everything works out the website should recognise the names of the fields in the first line of your table. The field selectors are dropdown form controls. You must pick at least the latitude and longitude fields.
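
Before uploading, you may want to check locally that the first row of your file is read as field names and that the coordinate fields are numeric. The sketch below uses Python's standard csv module. It is not PoinProc's own parser, and the comma delimiter, the sample text and the function name are assumptions made for illustration.

```python
import csv
import io

# Hypothetical comma-separated version of the cities table shown earlier.
SAMPLE = """longitude,latitude,continent,country,class
-56.0930,-15.6100,America,Brazil,250k_500k
-47.8977,-15.7921,America,Brazil,1M_5M
-49.2550,-16.7270,America,Brazil,500k_1M
"""


def read_markers(text, lon_field, lat_field, delimiter=","):
    """Return (field_names, rows); the two coordinate fields are converted to float."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    fields = reader.fieldnames or []
    for required in (lon_field, lat_field):
        if required not in fields:
            raise ValueError(f"field {required!r} not found in header {fields}")
    rows = []
    for row in reader:
        row[lon_field] = float(row[lon_field])
        row[lat_field] = float(row[lat_field])
        rows.append(row)
    return fields, rows


fields, rows = read_markers(SAMPLE, "longitude", "latitude")
print(fields)
print(len(rows), rows[0]["country"])
```

For a tab-separated file, pass delimiter="\t" instead.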

Options: summary field(s).

If your table contains data fields in addition to the geographical coordinates, you may pick one as the summary field. The process will provide each marker cluster with a small summary of the values of the original markers it contains.

Cluster info example - cities

Summary field: (none)
Cluster marker's summary information: 343_points_included
Explanation: If no summary field is provided, the cluster marker's description is a message containing the count of markers it represents.

Summary field: continent
Cluster marker's summary information: 343_points_included_in_3 areas._94_points_included_in_Asia._229_points_included_in_Europe._20_points_included_in_Africa.
Explanation: If a summary field is used, for instance "continent", the description is a message containing the field values that have one or more markers represented, for example "Asia" or "Europe", and the count of markers for each value.

Summary field: country
Cluster marker's summary information: 103_points_included_in_13 areas._1_point_included_in_Suriname._15_points_included_in_Brazil._9_points_included_in_Chile. … … _1_point_included_in_Paraguay._1_point_included_in_Uruguay.  12_points_included_in_Brazil.
Explanation: Different summary fields can be chosen for different ranges of scales, i.e. small scales and large scales, for example "continent" and "country".
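
The way such a description can be assembled is sketched below in Python. The wording and punctuation are inferred from the examples above, the order of the values (first appearance) is an assumption, and the function name is hypothetical; PoinProc's actual implementation may differ.

```python
from collections import Counter


def cluster_description(markers, summary_field=None):
    """Build a description string in the style of the examples above.

    `markers` is a list of dicts, one per marker in the cluster.
    """
    total = len(markers)
    noun = "point" if total == 1 else "points"
    if summary_field is None:
        return f"{total}_{noun}_included"
    counts = Counter(m[summary_field] for m in markers)
    parts = [f"{total}_{noun}_included_in_{len(counts)} areas."]
    for value, n in counts.items():  # first-appearance order (an assumption)
        unit = "point" if n == 1 else "points"
        parts.append(f"_{n}_{unit}_included_in_{value}.")
    return "".join(parts)
```

For a cluster of two markers in Asia and one in Europe, this returns "3_points_included" without a summary field, and a per-continent breakdown when "continent" is passed.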

The results file contains the cluster markers, the standalone markers, their descriptions, and the other geometries (polygons and lines). Additionally, the service creates another results file which contains the lists of row indices or marker IDs related to each cluster. Refer to the User Guide for more advanced uses of these marker IDs.

Options: separator field(s).

If no separator field is provided, markers that look close to each other on the screen are clustered together. If a separator field is used, for example "country", markers having different values, e.g. "France" and "Germany", don't cluster together.
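
The effect of a separator field can be illustrated with a small Python sketch. It uses a simple single-linkage grouping on pixel coordinates, chosen here only for brevity; it is not PoinProc's actual algorithm, and it does not implement the override for clusters that end up very close to each other. The "x" and "y" keys, the threshold and the function name are assumptions.

```python
import math


def cluster(markers, threshold=20, separator_field=None):
    """Group markers whose pixel positions are within `threshold` of each other.

    Each marker is a dict with pixel coordinates "x" and "y" (hypothetical keys).
    Returns a list of clusters, each a list of marker indices.
    """
    parent = list(range(len(markers)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, a in enumerate(markers):
        for j in range(i + 1, len(markers)):
            b = markers[j]
            if separator_field and a[separator_field] != b[separator_field]:
                continue  # markers with different separator values are never joined
            if math.hypot(a["x"] - b["x"], a["y"] - b["y"]) <= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(markers)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

With three nearby markers, two in "France" and one in "Germany", calling cluster(markers) yields a single group, while cluster(markers, separator_field="country") keeps the German marker on its own.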

If there are many markers and they are widely scattered, the clusters may extend over polygons that are too large. This is a common case, so you should make sure from the beginning that your data has fields with geographical meaning, such as "country". The display appearance is then that of a thematic map.

If clusters look too close to each other on the screen, they are clustered regardless of the separator field's values. That leads to better results when markers are located near boundaries, for example when the markers are in the vicinity of mountain ranges, which are natural boundaries.

Options: combining summary and separator field(s).

To achieve the best results, the separator field should typically refer to larger geographical entities than those referred to by the summary field, at each zoom level. Results are still good enough if the same field is used as both separator and summary field. For the sample datasets, the choice depended on the availability of a first-order administrative division.

Summary and separator fields - examples
Available fields            Summary field   Summary field zoom range   Separator field   Separator field zoom range
continent, country, state   country         0 - 3                      continent         0 - 3
continent, country, state   state           4 - 24                     country           4 - 24
continent, country          country         0 - 24                     continent         0 - 3
continent, country          country         0 - 24                     country           4 - 24
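
The first example can be written down as a small lookup structure. This Python sketch is only an illustration of how zoom ranges map to fields; the tuple layout and the names are hypothetical and do not describe PoinProc's internal configuration.

```python
# Hypothetical configuration mirroring the first example: each entry is
# (min_zoom, max_zoom, field_name), with both zoom bounds inclusive.
SUMMARY_FIELDS = [(0, 3, "country"), (4, 24, "state")]
SEPARATOR_FIELDS = [(0, 3, "continent"), (4, 24, "country")]


def field_for_zoom(ranges, zoom):
    """Return the field configured for a zoom level, or None if no range matches."""
    for low, high, name in ranges:
        if low <= zoom <= high:
            return name
    return None


for zoom in (2, 10):
    print(zoom, field_for_zoom(SUMMARY_FIELDS, zoom), field_for_zoom(SEPARATOR_FIELDS, zoom))
```

At every zoom level in this configuration the separator refers to a larger entity than the summary field: continent over country at zooms 0 to 3, and country over state at zooms 4 to 24.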

Options: name field.

If a name field is provided (the marker's name), each marker that remains alone (unclustered) is given a description which is its name. If no name field is provided, each marker that remains alone (unclustered) is given a description which is the value of the summary field if it was provided. Otherwise it is the marker row's number.
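
The fallback order just described can be summarised in a few lines of Python. This is a sketch of the rule as stated, with a hypothetical function name and arguments, not code taken from the service.

```python
def standalone_description(row, row_number, name_field=None, summary_field=None):
    """Description for an unclustered marker: name, else summary value, else row number."""
    if name_field:
        return str(row[name_field])
    if summary_field:
        return str(row[summary_field])
    return str(row_number)


# Hypothetical row; the "name" field is not part of the sample table.
row = {"name": "Brasilia", "country": "Brazil"}
print(standalone_description(row, 2, name_field="name", summary_field="country"))
print(standalone_description(row, 2, summary_field="country"))
print(standalone_description(row, 2))
```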

About administrative boundaries - entity fields.

As you've seen, most of the sample datasets rely on entity fields such as continent, country and others. Usually you will have to check that the data has entity fields to achieve good results, or create them yourself. The professional app includes reverse geocoding commands to create the fields from the geographical coordinates, but this website does not.

Important: prepare your data.

You can use software such as a spreadsheet to enter, at least, latitude and longitude values for each marker. Coordinate values are expected to be in the geographical coordinate system (longitude between -180 and 180, latitude between -90 and 90). Then save the spreadsheet in CSV format, which is a text file format.

If you get the data from a third-party source, you will probably have to correct errors such as:

  1. Null values.
  2. Values with the wrong sign for longitude and/or latitude.
  3. No decimals or no decimal point.
  4. Plainly inaccurate coordinate values.

For text fields, you should watch out for:

  1. Typos.
  2. Entity names written in more than one language (bilingual countries).
  3. Irregular use of abbreviations or initialisms.
  4. Missing name values.

But don't worry, you can use PoinProc precisely as a tool to find these kinds of conflicts. You can then fix them and submit the data again to eventually attain appropriate results.
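
Some of the coordinate errors listed above can also be caught before uploading. The Python sketch below flags null values, non-numeric values, out-of-range values and values without a decimal point. It is an independent illustration with hypothetical names; it cannot detect sign changes or coordinates that are merely inaccurate.

```python
def check_coordinates(rows, lon_field="longitude", lat_field="latitude"):
    """Return a list of (row_number, problem) pairs for suspicious coordinates.

    `rows` is a list of dicts with string values, e.g. as read by csv.DictReader.
    """
    problems = []
    for number, row in enumerate(rows, start=1):
        for field, limit in ((lon_field, 180.0), (lat_field, 90.0)):
            raw = (row.get(field) or "").strip()
            if not raw:
                problems.append((number, f"{field}: null value"))
                continue
            try:
                value = float(raw)
            except ValueError:
                problems.append((number, f"{field}: not a number ({raw!r})"))
                continue
            if abs(value) > limit:
                problems.append((number, f"{field}: out of range ({raw})"))
            elif "." not in raw:
                problems.append((number, f"{field}: no decimal point ({raw})"))
    return problems
```

A value such as -492550 for longitude, which is typical of a lost decimal point, is reported as out of range, while -49 is reported as lacking a decimal point.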

Download the User Guide for additional information.

About the author

If you have questions about PoinProc, you can write to Xavier; you can find the e-mail address at gnzlz.es or at the bottom of this page.
