GIS

LATEST GIS CONFERENCE IN 2009

Click here: GIS Conference

ASIA GIS 2008 CONFERENCE

I wish I could go and present a paper here. But… it falls so close to Syawal (Eid)… Anyway, it would be so exciting to set foot there. Right now I’m writing a paper to submit before 25th July. Hehe… :)

ArcGIS9.2

My work concentrates on GIS and remote sensing (involving satellite images). But for now, I want to talk about GIS. I’m still a total layman, but I’m a fast learner. Oh really? :) GIS = Geographic Information System. In plain English, it means integrating data with graphics (especially maps).

I’m using ArcGIS software for my processing. Frankly speaking, this software will turn you into a turtle if you don’t have the hard disk space it needs. Just throw it in the trash bin if your PC is that slow. Haha.

Just leave a comment if you want this software. I have the ‘CRACK’ one.

God Bless!!!

A Story About GIS-INTRO

What is Shapefile?

The ESRI Shapefile, or simply a shapefile, is a popular geospatial vector data format for geographic information systems software. It is developed and regulated by ESRI as a (mostly) open specification for data interoperability among ESRI and other software products.[1] A “shapefile” commonly refers to a collection of files with “.shp”, “.shx”, “.dbf”, and other extensions on a common prefix name (e.g., “lakes.*”). The actual shapefile relates specifically to files with the “.shp” extension; however, this file alone is incomplete for distribution, as the other supporting files are required.

Shapefiles spatially describe geometries: points, polylines, and polygons. These, for example, could represent water wells, rivers, and lakes, respectively. Each item may also have attributes that describe the items, such as the name or temperature.

A shapefile is a digital vector format for storing geometric location and associated attribute information. This format lacks the capacity to store topological information. The shapefile format was introduced with ArcView GIS version 2 in the early 1990s. It is now possible to read and write shapefiles using a variety of free and non-free programs.

Shapefiles are simple because they store primitive geometrical data types of points, lines, and polygons. These primitives are of limited use without any attributes to specify what they represent. Therefore, a table of records stores properties/attributes for each primitive shape in the shapefile. Shapes (points/lines/polygons) together with data attributes can create infinitely many representations of geographical data. Such representations make powerful and accurate computations possible.

While the term “shapefile” is quite common, a “shapefile” is actually a set of several files. Three individual files are normally mandatory to store the core data that comprises a shapefile. There are a further eight optional files which store primarily index data to improve performance. Each individual file should conform to the MS-DOS 8.3 filename convention (8-character filename prefix, full stop, 3-character filename suffix, such as shapefil.shp) in order to be compatible with past applications that handle shapefiles. For this same reason, all files should be located in the same folder.

Mandatory files :

  • .shp — shape format; the feature geometry itself
  • .shx — shape index format; a positional index of the feature geometry to allow seeking forwards and backwards quickly
  • .dbf — attribute format; columnar attributes for each shape, in dBase III format

Optional files :

  • .prj — projection format; the coordinate system and projection information, a plain text file describing the projection using well-known text format
  • .sbn and .sbx — a spatial index of the features
  • .fbn and .fbx — a spatial index of the features for shapefiles that are read-only
  • .ain and .aih — an attribute index of the active fields in a table or a theme’s attribute table
  • .ixs — a geocoding index for read-write shapefiles
  • .mxs — a geocoding index for read-write shapefiles (ODB format)
  • .atx — an attribute index for the .dbf file in the form of shapefile.columnname.atx (ArcGIS 8 and later)
  • .shp.xml — metadata in XML format
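
As a rough illustration of the file lists above, the following Python sketch checks whether the three mandatory files exist on a common prefix. The function name missing_mandatory_files and the example name lakes.shp are only assumptions for illustration; they are not part of any GIS library.

```python
from pathlib import Path

# The three mandatory extensions listed above.
MANDATORY_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_mandatory_files(shp_path):
    """Return the mandatory extensions that have no file next to shp_path.

    Hypothetical helper: it only checks that the files exist on the same
    prefix in the same folder, not that their contents are valid.
    """
    base = Path(shp_path)
    return [ext for ext in MANDATORY_EXTENSIONS
            if not base.with_suffix(ext).exists()]


if __name__ == "__main__":
    # "lakes.shp" is a made-up path; an empty list means nothing is missing.
    print(missing_mandatory_files("lakes.shp"))
```

An empty list means the set is complete enough to distribute; the optional files are not checked.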

In each of the .shp, .shx, and .dbf files, the shapes in each file correspond to each other in sequence. That is, the first record in the .shp file corresponds to the first record in the .shx and .dbf files, and so on. The .shp and .shx files have various fields with different endianness, so as an implementor of the file formats you must be very careful to respect the endianness of each field and treat it properly.

Shapefiles deal with coordinates in terms of X and Y, although these often hold longitude and latitude, respectively. While working with the X and Y terms, be sure to respect the order of the terms (longitude is stored in X, latitude in Y).
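
Since many people say "latitude, longitude" in everyday speech, the order is easy to get backwards. A minimal Python sketch, assuming geographic coordinates in degrees (the helper name latlon_to_xy is made up for this example):

```python
def latlon_to_xy(lat, lon):
    """Convert a (latitude, longitude) pair into shapefile (X, Y) order.

    Assumes geographic coordinates in degrees; projected coordinates
    would need different range checks.
    """
    if not -90.0 <= lat <= 90.0:
        raise ValueError("latitude must be between -90 and 90 degrees")
    if not -180.0 <= lon <= 180.0:
        raise ValueError("longitude must be between -180 and 180 degrees")
    # X holds longitude, Y holds latitude.
    return (lon, lat)


if __name__ == "__main__":
    # An arbitrary example position: latitude 3.139, longitude 101.687.
    print(latlon_to_xy(3.139, 101.687))
```

The range check catches swapped arguments only when the longitude is outside the valid latitude range, so it is a safety net, not a guarantee.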

Shapefile shape format (.shp)

The main file (.shp) contains the primary geographic reference data in the shapefile. The file consists of a single fixed length header followed by one or more variable length records. Each of the variable length records includes a record header component and a record contents component. A detailed description of the file format is given in the ESRI Shapefile Technical Description.[1] This format should not be confused with the AutoCAD shape font source format, which shares the .shp extension.

The main file header is fixed at 100 bytes in length and contains 17 fields; nine 4-byte (32-bit unsigned integer or uint32) integer fields followed by eight 8-byte (double) floating point fields:

Bytes    Type     Endianness   Usage
0-3      uint32   big          File code (always hex value 0x0000270a)
4-23     uint32   big          Unused; five uint32
24-27    uint32   big          File length (in 16-bit words)
28-31    uint32   little       Version
32-35    uint32   little       Shape type (see reference below)
36-67    double   little       Minimum bounding rectangle (MBR) of all shapes contained within the shapefile; four doubles in the following order: min X, min Y, max X, max Y
68-83    double   little       Range of Z; two doubles in the following order: min Z, max Z
84-99    double   little       Range of M; two doubles in the following order: min M, max M
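
The header layout above can be decoded with Python's standard struct module. This is only a sketch following the table as written (integers read as unsigned, mixed endianness); the function name parse_shp_header and the dictionary keys are my own choices, not an official API.

```python
import struct


def parse_shp_header(header):
    """Decode the 100-byte main file header into a dict."""
    if len(header) != 100:
        raise ValueError("header must be exactly 100 bytes")
    # Bytes 0-27: seven big-endian integers
    # (file code, five unused values, file length in 16-bit words).
    file_code, *_unused, length_words = struct.unpack(">7I", header[:28])
    if file_code != 0x0000270A:
        raise ValueError("not a shapefile: unexpected file code")
    # Bytes 28-35: two little-endian integers (version, shape type).
    version, shape_type = struct.unpack("<2I", header[28:36])
    # Bytes 36-99: eight little-endian doubles (MBR, Z range, M range).
    xmin, ymin, xmax, ymax, zmin, zmax, mmin, mmax = struct.unpack(
        "<8d", header[36:100])
    return {
        "file_length_bytes": 2 * length_words,  # one word is 2 bytes
        "version": version,
        "shape_type": shape_type,
        "bbox": (xmin, ymin, xmax, ymax),
        "z_range": (zmin, zmax),
        "m_range": (mmin, mmax),
    }
```

To try it on a real file, read the first 100 bytes in binary mode, for example open(path, "rb").read(100), and pass them in.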

The file then contains any number of variable-length records. Each record is prefixed with a record-header of 8 bytes:

Bytes    Type     Endianness   Usage
0-3      uint32   big          Record number
4-7      uint32   big          Record length (in 16-bit words)

Following the record header is the actual record:

Bytes    Type     Endianness   Usage
0-3      uint32   little       Shape type (see reference below)
4-       -        -            Shape content
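
To show how the record header and the record contents fit together, here is a small Python sketch that walks through the records of a .shp file held in memory. The names iter_shp_records and decode_point are hypothetical. It relies on the record length counting only the contents, not the 8-byte record header, which is how the ESRI technical description defines it.

```python
import struct


def iter_shp_records(data):
    """Yield (record_number, shape_type, content) for each record in .shp bytes."""
    pos = 100  # skip the fixed 100-byte main file header
    while pos + 8 <= len(data):
        # Record header: two big-endian integers.
        record_number, length_words = struct.unpack(">2I", data[pos:pos + 8])
        start = pos + 8
        end = start + 2 * length_words  # length excludes the 8-byte header
        body = data[start:end]
        if len(body) != 2 * length_words or len(body) < 4:
            raise ValueError("truncated record %d" % record_number)
        # Record contents start with a little-endian shape type.
        (shape_type,) = struct.unpack("<I", body[:4])
        yield record_number, shape_type, body[4:]
        pos = end


def decode_point(content):
    """Decode the content of a Point record (shape type 1): X, then Y."""
    return struct.unpack("<2d", content[:16])
```

Only the Point type is decoded here; the other shape types need their own decoding according to the field lists in the shape type table.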

The variable length record contents depend on the shape type. The following are the possible shape types:

Value   Shape type    Fields
0       Null shape    None
1       Point         X, Y
3       Polyline      MBR, Number of parts, Number of points, Parts, Points
5       Polygon       MBR, Number of parts, Number of points, Parts, Points
8       MultiPoint    MBR, Number of points, Points
11      PointZ        X, Y, Z, M
13      PolylineZ     Mandatory: MBR, Number of parts, Number of points, Parts, Points, Z range, Z array
                      Optional: M range, M array
15      PolygonZ      Mandatory: MBR, Number of parts, Number of points, Parts, Points, Z range, Z array
                      Optional: M range, M array
18      MultiPointZ   Mandatory: MBR, Number of points, Points, Z range, Z array
                      Optional: M range, M array
21      PointM        X, Y, M
23      PolylineM     Mandatory: MBR, Number of parts, Number of points, Parts, Points
                      Optional: M range, M array
25      PolygonM      Mandatory: MBR, Number of parts, Number of points, Parts, Points
                      Optional: M range, M array
28      MultiPointM   Mandatory: MBR, Number of points, Points
                      Optional: M range, M array
31      MultiPatch    Mandatory: MBR, Number of parts, Number of points, Parts, Part types, Points, Z range, Z array
                      Optional: M range, M array

In common use, shapefiles containing Point, Polyline, and Polygon are extremely popular. The “Z” types are three-dimensional. The “M” types contain a user-defined measurement which coincides with the point being referenced. Three-dimensional shapefiles are rather uncommon, and the measurement functionality has been largely superseded by more robust databases used in conjunction with the shapefile data.
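
The shape type codes can be kept in a simple lookup table. The Python sketch below copies the codes from the table of shape types; the function describe_shape_type is a made-up helper that also reports whether a type carries Z values and whether it can carry M values.

```python
# Shape type codes as listed in the table of shape types.
SHAPE_TYPES = {
    0: "Null shape", 1: "Point", 3: "Polyline", 5: "Polygon",
    8: "MultiPoint", 11: "PointZ", 13: "PolylineZ", 15: "PolygonZ",
    18: "MultiPointZ", 21: "PointM", 23: "PolylineM", 25: "PolygonM",
    28: "MultiPointM", 31: "MultiPatch",
}


def describe_shape_type(code):
    """Return (name, has_z, can_have_m) for a shape type code."""
    name = SHAPE_TYPES.get(code)
    if name is None:
        raise ValueError("unknown shape type code: %r" % (code,))
    # The Z types and MultiPatch are three-dimensional.
    has_z = name.endswith("Z") or name == "MultiPatch"
    # The M types carry measures; the Z types list M as well.
    can_have_m = name.endswith("M") or has_z
    return name, has_z, can_have_m


if __name__ == "__main__":
    for code in sorted(SHAPE_TYPES):
        print(code, describe_shape_type(code))
```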

Shapefile shape index format (.shx)

The shapefile index contains the same 100-byte header as the .shp file, followed by any number of 8-byte fixed-length records which consist of the following two fields:

Bytes    Type     Endianness   Usage
0-3      uint32   big          Record offset (in 16-bit words)
4-7      uint32   big          Record length (in 16-bit words)

Using this index, it is possible to seek backwards in the shapefile by seeking backwards first in the shape index (which is possible because it uses fixed-length records), reading the record offset, and using that to seek to the correct position in the .shp file. It is also possible to seek forwards an arbitrary number of records by using the same method.
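
The seeking described above can be sketched in a few lines of Python. The helper shx_entry is hypothetical; it takes the whole .shx file as bytes and a zero-based record index, and converts the 16-bit word counts into bytes.

```python
import struct


def shx_entry(shx_data, index):
    """Return (byte_offset, byte_length) for the zero-based record `index`.

    byte_offset is where the record header starts in the .shp file;
    byte_length is the size of the record contents.
    """
    pos = 100 + 8 * index  # 100-byte header, then 8 bytes per record
    if index < 0 or pos + 8 > len(shx_data):
        raise IndexError("record index out of range")
    # Both fields are big-endian and counted in 16-bit words.
    offset_words, length_words = struct.unpack(">2I", shx_data[pos:pos + 8])
    return 2 * offset_words, 2 * length_words
```

With the byte offset in hand, a reader can call seek(byte_offset) on the open .shp file and read the 8-byte record header plus byte_length bytes of contents, without scanning the earlier records.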

Shapefile attribute format (.dbf)

Attributes for each shape are stored in the xBase (dBase) format, which has an open specification.

Shapefile projection format (.prj)

The projection information contained in the .prj file is critical in order to understand the data contained in the .shp file correctly. Although it is technically optional, it is most often provided, as it is not necessarily possible to guess the projection of any given points. A .prj file typically names the geographic coordinate system, the datum, the spheroid, the prime meridian and the units used, and, for projected data, the map projection and its parameters.

Shapefile spatial index format (.sbn)

This is a binary spatial index file, which is used only by ESRI software. The format is not documented, and is not implemented by other vendors. The .sbn file is not strictly necessary, since the .shp file contains all of the information necessary to successfully parse the spatial data.

Limitations

Topology and shapefiles

Shapefiles do not have the ability to store topological information. ArcInfo coverages and Personal/File/Enterprise Geodatabases do have the ability to store feature topology.

Spatial representation

The edges of a polyline or polygon are defined using points, which can give it a jagged edge at higher resolutions. Additional points are required to give smooth shapes, which requires storing quite a lot of data compared to, for example, Bézier curves, which can capture complexity using smooth curves, without using as many points. Currently, none of the shapefile types support Bézier curves.

Data storage

Unlike most databases, the attribute format is based on an older xBase standard that is incapable of storing null values in its fields. This limitation can make the storage of data in the attributes less flexible. In ArcGIS products, values that should be null are instead replaced with a 0 (without warning), which can make the data misleading. This problem is addressed in ArcGIS products by using ESRI’s Personal Geodatabase offerings, one of which is based on Microsoft Access.

Mixing shape types

Each shapefile can technically store a mix of different shape types, as the shape type precedes each record, but common use of the specification dictates that only shapes of a single type can be in a single file. For example, a shapefile cannot contain both Polyline and Polygon data. Thus, well (point), river (polyline) and lake (polygon) data must be kept in three separate files.
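
A quick way to check this rule is to compare the shape type of every record with the shape type in the file header. A minimal Python sketch follows; the function name mixed_shape_types is made up, and it assumes that null shapes (type 0) are allowed alongside the file's own type, as the ESRI technical description permits.

```python
NULL_SHAPE = 0  # null shapes may appear in a file of any shape type


def mixed_shape_types(file_shape_type, record_shape_types):
    """Return the set of record shape types that break the single-type rule."""
    allowed = (file_shape_type, NULL_SHAPE)
    return {t for t in record_shape_types if t not in allowed}


if __name__ == "__main__":
    # A Polyline file (type 3) that wrongly contains a Polygon record (type 5).
    print(mixed_shape_types(3, [3, 3, 0, 5]))
```

An empty set means every record agrees with the header.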

GIS Vector Data (Mandatory to understand)

Point

Each point is stored by its location (X, Y) together with the table attributes of that point.

For example, each of the 4 points below has its coordinate location in (X, Y), and each point has attributes of depth and amount of water contamination.

Line

Each line is stored as the sequence of its first and last points together with the associated table attributes of that line. For example, the three lines below (a, b and c) have a first and a last node to fix their location, and each line has attributes of flow and capacity of the sewerage pipe. Notice that each node has a coordinate (X, Y) that is stored in another table.

Because the coordinates of the first and last nodes of each line are known, the length of a line or polyline (a sequence of lines) can be easily computed.

In most GIS software nowadays, you can use an arc instead of a line. The representation is actually the same, with the addition of the tangent and the tangent length at the two ends.
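
As a small illustration of that computation, here is a Python sketch with a node table and a line table kept separately, as described above. The coordinates, the node numbers and the line names are invented for the example.

```python
import math

# Hypothetical node table: node number -> (X, Y).
NODES = {1: (0.0, 0.0), 2: (3.0, 4.0), 3: (3.0, 10.0)}

# Hypothetical line table: line name -> (first node, last node).
LINES = {"a": (1, 2), "b": (2, 3)}


def line_length(name):
    """Straight-line distance between the first and last node of a line."""
    first, last = LINES[name]
    (x1, y1), (x2, y2) = NODES[first], NODES[last]
    return math.hypot(x2 - x1, y2 - y1)


def polyline_length(names):
    """Length of a polyline given as a sequence of line names."""
    return sum(line_length(name) for name in names)


if __name__ == "__main__":
    print(line_length("a"), polyline_length(["a", "b"]))
```

The result is in the same unit as the coordinates, so this only gives a sensible length for projected data, not for raw longitude and latitude.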

Polygon

A polygon is represented by a closed sequence of lines. Unlike a line or polyline (a sequence of lines), a polygon is always closed; that is, the first point is equal to the last point. A polygon can therefore be represented by a sequence of nodes in which the last node is equal to the first node. For example, polygon A below has node number 1 as both its first and last node to fix its location. Aside from its location, the polygon has associated attributes of area and bacterial population. Notice that each node has a coordinate (X, Y) that is stored in another table.

Using a polygon, several geometric attributes such as area and perimeter can be derived easily.
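
For instance, area and perimeter can be derived from the node sequence alone. The Python sketch below uses the shoelace formula for the area, which is one common way to do it (GIS software may use other methods). It assumes a simple polygon without holes, in planar coordinates, with the ring closed so that the last node repeats the first.

```python
import math


def ring_area(ring):
    """Area of a closed ring of (X, Y) nodes using the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def ring_perimeter(ring):
    """Sum of the edge lengths of a closed ring of (X, Y) nodes."""
    return sum(math.hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(ring, ring[1:]))


if __name__ == "__main__":
    # A made-up 4 x 3 rectangle; the first node is repeated at the end.
    rectangle = [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]
    print(ring_area(rectangle), ring_perimeter(rectangle))
```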

What things do we represent as Point, Line or Polygon?

Data representation depends on

  1. Map scale, and
  2. Functions you want to perform in your later analysis.

The things represented as a line (or polyline) may be easy to guess: roads, pipelines, water lines, rivers, bus routes and so on, which have a basic shape similar to a line or a combination of lines.

What things do we represent as a point or a polygon? At a city map scale of around 1:25,000 or 1:10,000, you may represent buildings, post offices, bus stops, hospitals, police stations, wells, and so on as points. If you need a more detailed map, however, say at a scale of 1:1000, the infrastructure listed above may be better represented as polygons rather than points.

A point is simpler to input and analyze. A polygon needs more points to input, but you get the area, perimeter and other geometric attributes computed by the GIS software instead of entering them manually. If you are sure that you do not need these geometric attributes in your later analysis, input your data as points rather than polygons.

Responses

  1. A simple concept of GIS. The more you get into it, the more love and passion you have for it. GIS is not just a tool. It’s a part of my soul.

  2. After reading the article, I just feel that I need more info. Could you suggest some more resources?

  3. Would it be possible for you to post a sample of a shapefile generated in this format? Thanks.

  4. Yes. Can you give me your email? I’ll send it to your email address.

  5. mhornish@gmail.com

  6. thanks

  7. I think you must try ArcObjects… more fun…

  8. Yup, GIS is not just a tool, there is more to it… like PowerPoint or Photoshop. You can do all of that in GIS, but you have to be keen to explore… right, sifu? (“,)

  9. mis@: I’m no sifu. Just ordinary.

  10. If I wanted to ask for a GIS journal article or paper that you have published, would that be OK?

  11. evold: Nothing published yet. I’m not that good… but I do have a few papers… ones that haven’t even been sent to a conference. They need a lot of polishing… still many weaknesses… heheh… I’m still learning too.

  12. Salam… hello Kak Dila, hehe… it would be great if you get to go to Busan!! So, the paperwork is safely done, right? Wow… you’re amazing, Kak!! A big salute to Kak Dila!!!
    Wishing you every success… hee~~

  13. ita: The paperwork is done… as usual… I’m just the writer… the one going to present is our new dean… heheheheh…

