
The GDAL Library

The GDAL/OGR Library

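GDAL (Geospatial Data Abstraction Library) is an open-source translator library for raster geospatial data, released under an X/MIT-style license. It uses an abstract data model to represent all the file formats it supports, and it comes with a set of command-line utilities for data translation and processing.

OGR is a branch of the GDAL project. Its functionality is similar to GDAL's, except that it provides support for vector data.

Many well-known GIS products use the GDAL/OGR library, including ESRI's ArcGIS 9.2, Google Earth, and the cross-platform GRASS GIS.

With the GDAL/OGR library, a Linux-based geospatial data management system can support both vector and raster file data.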

1. GDAL

GDAL supports many raster data formats, including Arc/Info ASCII Grid (asc), GeoTIFF (tiff), Erdas Imagine Images (img), and ASCII DEM (dem).

GDAL uses an abstract data model to interpret the data formats it supports. The abstract data model consists of the dataset, the coordinate system, the affine geotransform, ground control points (GCPs), metadata, the raster band, the color table, the SUBDATASETS domain, the IMAGE_STRUCTURE domain, and xml: domains.

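GDALMajorObject class: an object that carries metadata.

GDALDataset class: a set of related raster bands, usually taken from one raster file, together with the metadata for those bands. GDALDataset is also responsible for the georeferencing transform and the coordinate system definition of all its raster bands.

GDALDriver class: the file format driver class. GDAL creates one instance of this class for each supported file format to manage that format.

GDALDriverManager class: the file format driver manager class, used to manage the GDALDriver objects.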

2. OGR

OGR provides read and write support for vector data formats. The supported file formats include ESRI Shapefiles, S-57, SDTS, PostGIS, Oracle Spatial, MapInfo mid/mif, and MapInfo TAB.

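1) OGR architecture

OGR consists of the following parts:

Geometry: the geometry classes (OGRGeometry and others) encapsulate the OpenGIS vector data model and provide some geometry operations, conversion between the WKB (Well Known Binary) and WKT (Well Known Text) formats, and a spatial reference system (projection).

Spatial Reference: the OGRSpatialReference class encapsulates the definition of projections and datums.

Feature: the OGRFeature class encapsulates the definition of a complete feature. A complete feature consists of a geometry and a set of attributes of that geometry.

Feature Definition: the OGRFeatureDefn class encapsulates a feature's attributes, their types and names, the default spatial reference system, and so on. An OGRFeatureDefn object normally corresponds to one layer.

Layer: the OGRLayer class is an abstract base class that represents one layer of features within a data source (OGRDataSource).

Data Source: the OGRDataSource class is an abstract base class that represents a file or a database containing OGRLayer objects.

Drivers: there is one OGRSFDriver for each supported vector file format. OGRSFDriver objects are registered and managed by the OGRSFDriverRegistrar class.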

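Figure: class diagram of the OGR Geometry model

Figure: the OpenGIS Simple Features data model

Comparing the two figures shows clearly that the OGR Geometry model follows the OpenGIS Simple Features specification closely. The OGR Geometry model matches the OpenGIS Simple Features data model not only in its inheritance hierarchy but also in its function interfaces. This applies both to the basic methods for querying a Geometry object, such as Dimension(), GeometryType(), SRID(), Envelope(), AsText() and Boundary(), and to the methods for testing spatial relationships, such as Equals(anotherGeometry:Geometry), Disjoint(anotherGeometry:Geometry), Intersects(anotherGeometry:Geometry) and Touches(anotherGeometry:Geometry).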

3) OGR API usage example: the following C++ sample code shows how to read vector data with the OGR API.

#include <cstdio>
#include "ogrsf_frmts.h"

int main()
{
    // Register all file format drivers
    OGRRegisterAll();

    // Open point.shp read-only
    OGRDataSource *poDS = OGRSFDriverRegistrar::Open( "point.shp", FALSE );
    if( poDS == NULL )
    {
        printf( "Open failed.\n" );
        return 1;
    }

    // Get the point layer
    OGRLayer *poLayer = poDS->GetLayerByName( "point" );
    if( poLayer == NULL )
    {
        printf( "Layer not found.\n" );
        OGRDataSource::DestroyDataSource( poDS );
        return 1;
    }

    OGRFeature *poFeature;

    // Reset the layer so that reading starts at its first feature
    poLayer->ResetReading();
    while( (poFeature = poLayer->GetNextFeature()) != NULL )
    {
        // Get the layer's field definitions
        OGRFeatureDefn *poFDefn = poLayer->GetLayerDefn();
        for( int iField = 0; iField < poFDefn->GetFieldCount(); iField++ )
        {
            // Get the definition of one field; print only integer fields
            OGRFieldDefn *poFieldDefn = poFDefn->GetFieldDefn( iField );
            if( poFieldDefn->GetType() == OFTInteger )
                printf( "%d,", poFeature->GetFieldAsInteger( iField ) );
        }

        // Get the feature's geometry (still owned by the feature)
        OGRGeometry *poGeometry = poFeature->GetGeometryRef();

        // The wkbFlatten macro maps wkbPoint25D to wkbPoint
        if( poGeometry != NULL
            && wkbFlatten(poGeometry->getGeometryType()) == wkbPoint )
        {
            OGRPoint *poPoint = (OGRPoint *) poGeometry;
            printf( "%.3f,%.3f\n", poPoint->getX(), poPoint->getY() );
        }
        else
            printf( "no point geometry\n" );

        // Destroy the feature
        OGRFeature::DestroyFeature( poFeature );
    }

    // Destroy the data source, which closes the vector file
    OGRDataSource::DestroyDataSource( poDS );
    return 0;
}

2 GDAL Raster Formats

system/file system capabilities as well.

2 ERDAS Imagine has a different file format for large files, where 32-bit pointers cannot be used.


3 GDAL Data Model

This document attempts to describe the GDAL data model, that is, the types of information that a GDAL data store can contain, and their semantics.

Dataset

A dataset (represented by the GDALDataset class) is an assembly of related raster bands and some information common to them all. In particular the dataset has a concept of the raster size (in pixels and lines) that applies to all the bands. The dataset is also responsible for the georeferencing transform and coordinate system definition of all bands. The dataset itself can also have associated metadata, a list of name/value pairs in string form.

Note that the GDAL dataset, and raster band data model is loosely based on the OpenGIS Grid Coverages specification.

Coordinate System

Dataset coordinate systems are represented as OpenGIS Well Known Text strings. This can contain:

• An overall coordinate system name.
• A geographic coordinate system name.
• A datum identifier.
• An ellipsoid name, semi-major axis, and inverse flattening.
• A prime meridian name and offset from Greenwich.
• A projection method type (e.g. Transverse Mercator).
• A list of projection parameters (e.g. central_meridian).
• A units name, and conversion factor to meters or radians.
• Names and ordering for the axes.
• Codes for most of the above in terms of predefined coordinate systems from authorities such as EPSG.

For more information on OpenGIS WKT coordinate system definitions, and mechanisms to manipulate them, refer to the osr_tutorial document and/or the OGRSpatialReference class documentation.

The coordinate system returned by GDALDataset::GetProjectionRef() describes the georeferenced coordinates implied by the affine georeferencing transform returned by GDALDataset::GetGeoTransform(). The coordinate system returned by GDALDataset::GetGCPProjection() describes the georeferenced coordinates of the GCPs returned by GDALDataset::GetGCPs().

Note that a returned coordinate system string of "" indicates that nothing is known about the georeferencing coordinate system.

Affine GeoTransform

GDAL datasets have two ways of describing the relationship between raster positions (in pixel/line coordinates) and georeferenced coordinates. The first, and most commonly used is the affine transform (the other is GCPs).

The affine transform consists of six coefficients returned by GDALDataset::GetGeoTransform() which map pixel/line coordinates into georeferenced space using the following relationship:

Xgeo = GT(0) + Xpixel*GT(1) + Yline*GT(2)

Ygeo = GT(3) + Xpixel*GT(4) + Yline*GT(5)

In case of north up images, the GT(2) and GT(4) coefficients are zero, and the GT(1) is pixel width, and GT(5) is pixel height. The (GT(0),GT(3)) position is the top left corner of the top left pixel of the raster.

Note that the pixel/line coordinates in the above are from (0.0,0.0) at the top left corner of the top left pixel to (width_in_pixels,height_in_pixels) at the bottom right corner of the bottom right pixel. The pixel/line location of the center of the top left pixel would therefore be (0.5,0.5).
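The relationship above can be sketched in a few lines of standalone C++. This is only an illustration that does not call GDAL: the names GeoTransform and PixelToGeo are hypothetical, and the array is assumed to hold the six coefficients in GT(0)..GT(5) order.

```cpp
#include <array>
#include <utility>

// Six affine coefficients, assumed to be in the GT(0)..GT(5) order
// returned by GDALDataset::GetGeoTransform(). (Hypothetical alias.)
using GeoTransform = std::array<double, 6>;

// Map a pixel/line position to georeferenced (X, Y):
//   Xgeo = GT(0) + Xpixel*GT(1) + Yline*GT(2)
//   Ygeo = GT(3) + Xpixel*GT(4) + Yline*GT(5)
std::pair<double, double> PixelToGeo(const GeoTransform& gt,
                                     double pixel, double line) {
    const double x = gt[0] + pixel * gt[1] + line * gt[2];
    const double y = gt[3] + pixel * gt[4] + line * gt[5];
    return {x, y};
}
```

For a north-up image whose top left corner is at (100, 500) with 2-unit pixels (made-up numbers), the coefficients would be {100, 2, 0, 500, 0, -2}. Passing (0.5, 0.5) then gives the georeferenced center of the top left pixel.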

GCPs

A dataset can have a set of control points relating one or more positions on the raster to georeferenced coordinates. All GCPs share a georeferencing coordinate system (returned by GDALDataset::GetGCPProjection()). Each GCP (represented as the GDAL_GCP class) contains the following:

typedef struct
{
    char *pszId;
    char *pszInfo;
    double dfGCPPixel;
    double dfGCPLine;
    double dfGCPX;
    double dfGCPY;
    double dfGCPZ;
} GDAL_GCP;

The pszId string is intended to be a unique (and often, but not always numerical) identifier for the GCP within the set of GCPs on this dataset. The pszInfo is usually an empty string, but can contain any user defined text associated with the GCP. Potentially this can also contain machine parsable information on GCP status though that isn't done at this time.

The (Pixel,Line) position is the GCP location on the raster. The (X,Y,Z) position is the associated georeferenced location with the Z often being zero.

The GDAL data model does not imply a transformation mechanism that must be generated from the GCPs ... this is left to the application. However 1st to 5th order polynomials are common.

Normally a dataset will contain either an affine geotransform, GCPs or neither. It is uncommon to have both, and it is undefined which is authoritative.

Metadata

GDAL metadata is auxiliary format and application specific textual data kept as a list of name/value pairs. The names are required to be well behaved tokens (no spaces, or odd characters). The values can be of any length, and contain anything except an embedded null (ASCII zero).

The metadata handling system is not well tuned to handling very large bodies of metadata. Handling of more than 100K of metadata for a dataset is likely to lead to performance degradation.

Some formats will support generic (user defined) metadata, while other format drivers will map specific format fields to metadata names. For instance the TIFF driver returns a few information tags as metadata including the date/time field which is returned as:

TIFFTAG_DATETIME=1999:05:11 11:29:56

Metadata is split into named groups called domains, with the default domain having no name (NULL or ""). Some specific domains exist for special purposes. Note that currently there is no way to enumerate all the domains available for a given object, but applications can "test" for any domains they know how to interpret.

The following metadata items have well defined semantics in the default domain:

• AREA_OR_POINT: May be either "Area" (the default) or "Point". Indicates whether a pixel value should be assumed to represent a sampling over the region of the pixel or a point sample at the center of the pixel. This is not intended to influence interpretation of georeferencing which remains area oriented.
• NODATA_VALUES: The value is a list of space separated pixel values matching the number of bands in the dataset that can be collectively used to identify pixels that are nodata in the dataset. With this style of nodata a pixel is considered nodata in all bands if and only if all bands match the corresponding value in the NODATA_VALUES tuple. This metadata is not widely honoured by GDAL drivers, algorithms or utilities at this time.
• MATRIX_REPRESENTATION: This value, used for Polarimetric SAR datasets, contains the matrix representation that this data is provided in. The following are acceptable values:
o SCATTERING
o SYMMETRIZED_SCATTERING
o COVARIANCE
o SYMMETRIZED_COVARIANCE
o COHERENCY
o SYMMETRIZED_COHERENCY
o KENNAUGH
o SYMMETRIZED_KENNAUGH
• POLARIMETRIC_INTERP: This metadata item is defined for Raster Bands for polarimetric SAR data. This indicates which entry in the specified matrix representation of the data this band represents. For a dataset provided as a scattering matrix, for example, acceptable values for this metadata item are HH, HV, VH, VV. When the dataset is a covariance matrix, for example, this metadata item will be one of Covariance_11, Covariance_22, Covariance_33, Covariance_12, Covariance_13, Covariance_23 (since the matrix itself is a hermitian matrix, that is all the data that is required to describe the matrix).

SUBDATASETS Domain

The SUBDATASETS domain holds a list of child datasets. Normally this is used to provide pointers to a list of images stored within a single multi image file (such as HDF or NITF). For instance, an NITF with five images might have the following subdataset list.

SUBDATASET_1_NAME=NITF_IM:0:multi_1b.ntf

SUBDATASET_1_DESC=Image 1 of multi_1b.ntf

SUBDATASET_2_NAME=NITF_IM:1:multi_1b.ntf

SUBDATASET_2_DESC=Image 2 of multi_1b.ntf

SUBDATASET_3_NAME=NITF_IM:2:multi_1b.ntf

SUBDATASET_3_DESC=Image 3 of multi_1b.ntf

SUBDATASET_4_NAME=NITF_IM:3:multi_1b.ntf

SUBDATASET_4_DESC=Image 4 of multi_1b.ntf

SUBDATASET_5_NAME=NITF_IM:4:multi_1b.ntf

SUBDATASET_5_DESC=Image 5 of multi_1b.ntf

The value of the _NAME is the string that can be passed to GDALOpen() to access the file. The _DESC value is intended to be a more user friendly string that can be displayed to the user in a selector.
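As an illustration of how such a list might be consumed, the following standalone sketch picks the _NAME values out of a list of "NAME=VALUE" strings. It does not call GDAL. The function name SubdatasetNames is hypothetical, and the input is assumed to be the contents of the SUBDATASETS domain already copied into a std::vector.

```cpp
#include <string>
#include <vector>

// Collect the values of SUBDATASET_<n>_NAME entries from a list of
// "NAME=VALUE" strings, in the order in which they appear.
std::vector<std::string> SubdatasetNames(
        const std::vector<std::string>& metadata) {
    const std::string prefix = "SUBDATASET_";
    const std::string suffix = "_NAME";
    std::vector<std::string> names;
    for (const std::string& item : metadata) {
        const std::string::size_type eq = item.find('=');
        if (eq == std::string::npos) continue;  // not a NAME=VALUE pair
        const std::string key = item.substr(0, eq);
        const bool has_prefix =
            key.compare(0, prefix.size(), prefix) == 0;
        const bool has_suffix =
            key.size() >= prefix.size() + suffix.size() &&
            key.compare(key.size() - suffix.size(), suffix.size(),
                        suffix) == 0;
        if (has_prefix && has_suffix) {
            names.push_back(item.substr(eq + 1));  // the part after '='
        }
    }
    return names;
}
```

Each returned string is what would be passed to GDALOpen() to open the corresponding child dataset. The matching _DESC values could be collected in the same way for display in a selector.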

IMAGE_STRUCTURE Domain

Metadata in the default domain is intended to be related to the image, and not particularly related to the way the image is stored on disk. That is, it is suitable for copying with the dataset when it is copied to a new format. Some information of interest is closely tied to a particular file format and storage mechanism. In order to prevent this getting copied along with datasets it is placed in a special domain called IMAGE_STRUCTURE that should not normally be copied to new formats.

Currently the following items are defined by RFC 14 as having specific semantics in the IMAGE_STRUCTURE domain.

• COMPRESSION: The compression type used for this dataset or band. There is no fixed catalog of compression type names, but where a given format includes a COMPRESSION creation option, the same list of values should be used here as there.
• NBITS: The actual number of bits used for this band, or the bands of this dataset. Normally only present when the number of bits is non-standard for the datatype, such as when a 1 bit TIFF is represented through GDAL as GDT_Byte.
• INTERLEAVE: This only applies on datasets, and the value should be one of PIXEL, LINE or BAND. It can be used as a data access hint.
• PIXELTYPE: This may appear on a GDT_Byte band (or the corresponding dataset) and have the value SIGNEDBYTE to indicate the unsigned byte values between 128 and 255 should be interpreted as being values between -128 and -1 for applications that recognise the SIGNEDBYTE type.
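The PIXELTYPE=SIGNEDBYTE reinterpretation described above amounts to a two's-complement reading of each byte. A minimal sketch, with the hypothetical helper name ToSignedByte and no GDAL dependency:

```cpp
#include <cstdint>

// Reinterpret a GDT_Byte sample as signed when PIXELTYPE=SIGNEDBYTE:
// 0..127 stay unchanged, 128..255 map to -128..-1.
int ToSignedByte(std::uint8_t raw) {
    return raw < 128 ? static_cast<int>(raw)
                     : static_cast<int>(raw) - 256;
}
```

An application would apply this only to bands (or datasets) whose IMAGE_STRUCTURE domain actually carries PIXELTYPE=SIGNEDBYTE.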

RPC Domain

The RPC metadata domain holds metadata describing the Rational Polynomial Coefficient geometry model for the image if present. This geometry model can be used to transform between pixel/line and georeferenced locations. The items defining the model are:

• ERR_BIAS: Error - Bias. The RMS bias error in meters per horizontal axis of all points in the image (-1.0 if unknown)
• ERR_RAND: Error - Random. RMS random error in meters per horizontal axis of each point in the image (-1.0 if unknown)
• LINE_OFF: Line Offset
• SAMP_OFF: Sample Offset
• LAT_OFF: Geodetic Latitude Offset
• LONG_OFF: Geodetic Longitude Offset
• HEIGHT_OFF: Geodetic Height Offset
• LINE_SCALE: Line Scale
• SAMP_SCALE: Sample Scale
• LAT_SCALE: Geodetic Latitude Scale
• LONG_SCALE: Geodetic Longitude Scale
• HEIGHT_SCALE: Geodetic Height Scale
• LINE_NUM_COEFF (1-20): Line Numerator Coefficients. Twenty coefficients for the polynomial in the Numerator of the rn equation. (space separated)
• LINE_DEN_COEFF (1-20): Line Denominator Coefficients. Twenty coefficients for the polynomial in the Denominator of the rn equation. (space separated)
• SAMP_NUM_COEFF (1-20): Sample Numerator Coefficients. Twenty coefficients for the polynomial in the Numerator of the cn equation. (space separated)
• SAMP_DEN_COEFF (1-20): Sample Denominator Coefficients. Twenty coefficients for the polynomial in the Denominator of the cn equation. (space separated)

These fields are directly derived from the prospective GeoTIFF RPC document (https://www.wendangku.net/doc/a89250976.html,/rpc_prop.html), which in turn is closely modelled on the NITF RPC00B definition.
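For orientation, the rn and cn equations that the items above refer to have the following general shape in the RPC00B model. This is a sketch from the RPC00B definition rather than from this document. The symbols P, L, H and the monomials \rho_i are notation introduced here, where \rho_i stands for the twenty cubic terms in P, L, H in the order fixed by RPC00B (not reproduced here).

```latex
\begin{aligned}
P &= \frac{\mathrm{Latitude} - \mathrm{LAT\_OFF}}{\mathrm{LAT\_SCALE}}, \quad
L = \frac{\mathrm{Longitude} - \mathrm{LONG\_OFF}}{\mathrm{LONG\_SCALE}}, \quad
H = \frac{\mathrm{Height} - \mathrm{HEIGHT\_OFF}}{\mathrm{HEIGHT\_SCALE}} \\[4pt]
r_n &= \frac{\sum_{i=1}^{20} \mathrm{LINE\_NUM\_COEFF}_i \, \rho_i(P, L, H)}
            {\sum_{i=1}^{20} \mathrm{LINE\_DEN\_COEFF}_i \, \rho_i(P, L, H)}, \qquad
c_n = \frac{\sum_{i=1}^{20} \mathrm{SAMP\_NUM\_COEFF}_i \, \rho_i(P, L, H)}
           {\sum_{i=1}^{20} \mathrm{SAMP\_DEN\_COEFF}_i \, \rho_i(P, L, H)} \\[4pt]
\mathrm{Line} &= r_n \cdot \mathrm{LINE\_SCALE} + \mathrm{LINE\_OFF}, \qquad
\mathrm{Sample} = c_n \cdot \mathrm{SAMP\_SCALE} + \mathrm{SAMP\_OFF}
\end{aligned}
```

In words: the ground coordinates are first normalized with the offset and scale items, the two rational polynomials yield normalized line and sample values, and these are then de-normalized to pixel/line positions.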

xml: Domains

Any domain name prefixed with "xml:" is not normal name/value metadata. It is a single XML document stored in one big string.

Raster Band

A raster band is represented in GDAL with the GDALRasterBand class. It represents a single raster band/channel/layer. It does not necessarily represent a whole image. For instance, a 24bit RGB image would normally be represented as a dataset with three bands, one for red, one for green and one for blue.

A raster band has the following properties:

• A width and height in pixels and lines. This is the same as that defined for the dataset, if this is a full resolution band.
• A datatype (GDALDataType). One of Byte, UInt16, Int16, UInt32, Int32, Float32, Float64, and the complex types CInt16, CInt32, CFloat32, and CFloat64.
• A block size. This is a preferred (efficient) access chunk size. For tiled images this will be one tile. For scanline oriented images this will normally be one scanline.
• A list of name/value pair metadata in the same format as the dataset, but of information that is potentially specific to this band.
• An optional description string.
• An optional single nodata pixel value (see also NODATA_VALUES metadata on the dataset for multi-band style nodata values).
• An optional nodata mask band marking pixels as nodata or in some cases transparency as discussed in RFC 15: Band Masks.
• An optional list of category names (effectively class names in a thematic image).
• An optional minimum and maximum value.
• An optional offset and scale for transforming raster values into meaningful values (e.g. translating height to meters).
• An optional raster unit name. For instance, this might indicate linear units for elevation data.
• A color interpretation for the band. This is one of:
o GCI_Undefined: the default, nothing is known.
o GCI_GrayIndex: this is an independent grayscale image
o GCI_PaletteIndex: this raster acts as an index into a color table
o GCI_RedBand: this raster is the red portion of an RGB or RGBA image
o GCI_GreenBand: this raster is the green portion of an RGB or RGBA image
o GCI_BlueBand: this raster is the blue portion of an RGB or RGBA image
o GCI_AlphaBand: this raster is the alpha portion of an RGBA image
o GCI_HueBand: this raster is the hue of an HLS image
o GCI_SaturationBand: this raster is the saturation of an HLS image
o GCI_LightnessBand: this raster is the lightness of an HLS image
o GCI_CyanBand: this band is the cyan portion of a CMY or CMYK image
o GCI_MagentaBand: this band is the magenta portion of a CMY or CMYK image
o GCI_YellowBand: this band is the yellow portion of a CMY or CMYK image
o GCI_BlackBand: this band is the black portion of a CMYK image.
• A color table, described in more detail later.
• Knowledge of reduced resolution overviews (pyramids) if available.
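The optional offset, scale and nodata properties in the list above are typically combined as in the sketch below. It assumes the usual convention that the meaningful value is raw * scale + offset. The names PhysicalValue and ToPhysicalValue are hypothetical, and no GDAL call is involved.

```cpp
// Result of converting one raw sample.
struct PhysicalValue {
    bool valid;    // false when the raw sample equals the nodata value
    double value;  // raw * scale + offset when valid
};

// Apply a band's scale and offset to a raw pixel value, treating the
// single nodata value (if the band has one) as "no measurement".
PhysicalValue ToPhysicalValue(double raw, double scale, double offset,
                              bool has_nodata, double nodata) {
    if (has_nodata && raw == nodata) {
        return {false, 0.0};
    }
    return {true, raw * scale + offset};
}
```

With a scale of 0.5 and an offset of 10 (made-up numbers), a raw value of 150 would correspond to 85 in the band's unit, while a raw value equal to the nodata value is reported as invalid instead of being scaled.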

Color Table

A color table consists of zero or more color entries described in C by the following structure:

typedef struct
{
    /* gray, red, cyan or hue */
    short c1;
    /* green, magenta, or lightness */
    short c2;
    /* blue, yellow, or saturation */
    short c3;
    /* alpha or blackband */
    short c4;
} GDALColorEntry;

The color table also has a palette interpretation value (GDALPaletteInterp) which is one of the following values, and indicates how the c1/c2/c3/c4 values of a color entry should be interpreted.

• GPI_Gray: Use c1 as grayscale value.
• GPI_RGB: Use c1 as red, c2 as green, c3 as blue and c4 as alpha.
• GPI_CMYK: Use c1 as cyan, c2 as magenta, c3 as yellow and c4 as black.
• GPI_HLS: Use c1 as hue, c2 as lightness, and c3 as saturation.

To associate a color with a raster pixel, the pixel value is used as a subscript into the color table. That means that the colors are always applied starting at zero and ascending. There is no provision for indicating a prescaling mechanism before looking up in the color table.
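The lookup rule just described (the pixel value is the subscript, starting at zero, with no prescaling) can be sketched as follows. ColorEntry is a stand-in with the same four fields as GDALColorEntry, and LookupColor is a hypothetical helper, not a GDAL function.

```cpp
#include <cstddef>
#include <vector>

// Stand-in with the same four fields as GDALColorEntry.
struct ColorEntry {
    short c1, c2, c3, c4;
};

// Use the pixel value directly as an index into the color table.
// Returns nullptr when the value has no entry in the table.
const ColorEntry* LookupColor(const std::vector<ColorEntry>& table,
                              int pixel_value) {
    if (pixel_value < 0 ||
        static_cast<std::size_t>(pixel_value) >= table.size()) {
        return nullptr;
    }
    return &table[static_cast<std::size_t>(pixel_value)];
}
```

How c1..c4 of the returned entry are read (gray, RGB, CMYK or HLS) depends on the table's palette interpretation value.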

Overviews

A band may have zero or more overviews. Each overview is represented as a "free standing" GDALRasterBand. The size (in pixels and lines) of the overview will be different than the underlying raster, but the geographic region covered by overviews is the same as the full resolution band.

The overviews are used to display reduced resolution overviews more quickly than could be done by reading all the full resolution data and down sampling.

Bands also have a HasArbitraryOverviews property which is TRUE if the raster can be read at any resolution efficiently but with no distinct overview levels. This applies to some FFT encoded images, or images pulled through gateways (like OGDI) where down sampling can be done efficiently at the remote point.

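Framework Design and Development of Remote Sensing Image Processing Software Based on the GDAL Library

A Study of Framework Design Methods for Remote Sensing Image Processing Software Based on the GDAL Library and OpenGL. Wang Shunzhi (College of Information Science and Engineering, Ocean University of China, Qingdao, Shandong 266100).

Abstract: This paper introduces the functions and characteristics of the GDAL library and the OpenGL graphics interface, and the help and advantages that both offer for developing remote sensing software. On that basis it presents a design method for a remote sensing image processing software framework that lets the software read remote sensing files in many formats correctly, perform image processing operations, and draw the results in a window, as a reference for developing this kind of software.

Keywords: GDAL; OpenGL; block-wise reading; application framework; class/object relationships

1 Introduction

Satellite remote sensing has developed rapidly since the 1980s. As NASA, the European Space Agency (ESA) and other countries such as Canada, Japan and China built their own remote sensing systems, researchers gained access to more and more valuable data and images of the Earth observed from space. How to process remote sensing data quickly and accurately has therefore become a new topic in satellite remote sensing. Advances in computer hardware and software provide important technical means for digital processing of remote sensing images. Because a remote sensing image carries more information than an ordinary digital image (the size, shape and characteristic attributes of targets, the distinction and classification of different targets, and so on), the acquisition of information from remote sensing images has to evolve into computer-supported intelligent recognition and, ultimately, into remote sensing image understanding.

As remote sensing plays an increasingly important role in many areas of society, researchers' demand for powerful and easy-to-use remote sensing data processing software keeps growing. The most popular remote sensing packages internationally are PCI Geomatica, developed by PCI of Canada, ERDAS Imagine, developed by ERDAS LLC of the United States, and ENVI, developed by Research System INC of the United States. These packages are powerful and produce good results through simple menu operations, but they cannot record the processing workflow. Many researchers in the field want their results not only published as papers but also applied in practice as systems and software, and they want an extensible development platform on which to build later theoretical work. This requires us to develop practical remote sensing software tailored to our own needs.

Developing such software involves the following difficulties: remote sensing data formats are diverse and hard to read in a uniform way; remote sensing images are very large, so reading or displaying a whole image is time-consuming; various kinds of geographic and projection information must be read; and display and output after complex image processing algorithms are inefficient. This paper presents a framework design method for remote sensing image processing software on the Visual C++ platform, based on the GDAL library and using the OpenGL graphics interface. On top of this framework, developers can add their own remote sensing data processing algorithms or functional modules and eventually turn them into a product.

2 Functions and applications of the GDAL library

The first job of any image processing software is to read data correctly, and remote sensing image processing software is no exception. Remote sensing data are energy signatures: electromagnetic waves such as solar radiation, infrared and microwaves pass through the atmosphere, are reflected by objects on the ground, pass through the atmosphere again, are received by a remote sensing sensor, and are then transmitted back to the ground by the sensor. Because satellites carry many different sensors that record different bands (or channels), the returned data come in a great variety of formats. This makes the data-reading module of remote sensing software considerably harder to develop, yet the GDAL library handles the problem with ease.

GDAL (Geospatial Data Abstraction Library) is an open-source translator library for raster geospatial data, released under an X/MIT-style license. It uses an abstract data model to represent all the file formats it supports, and it comes with a set of command-line utilities for converting and processing remote sensing data. Many well-known GIS products use the GDAL/OGR library, including ESRI's ArcGIS 9.2, Google Earth, and the cross-platform GRASS GIS. GDAL supports almost all current remote sensing data formats. The table below lists a few common ones (for the full list of supported formats see https://www.wendangku.net/doc/a89250976.html,/formats_list.html):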

GDAL API Tutorial

Before opening a GDAL supported raster datastore it is necessary to register drivers. There is a driver for each supported format. Normally this is accomplished with the GDALAllRegister() function which attempts to register all known drivers, including those auto-loaded from .so files using GDALDriverManager::AutoLoadDrivers(). If for some applications it is necessary to limit the set of drivers it may be helpful to review the code from gdalallregister.cpp. Python automatically calls GDALAllRegister() when the gdal module is imported.

Once the drivers are registered, the application should call the free standing GDALOpen() function to open a dataset, passing the name of the dataset and the access desired (GA_ReadOnly or GA_Update).

Note that if GDALOpen() returns NULL it means the open failed, and that an error message will already have been emitted via CPLError(). If you want to control how errors are reported to the user, review the CPLError() documentation.

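The gdalwarp Command Explained

gdalwarp (gdalwarp.exe) performs reprojection and projection assignment, and it can also mosaic images. The program can reproject to any supported projection, and it can also apply GCPs stored with the image if the image is "raw" with control information.

Usage

gdalwarp [--help-general] [--formats] [-s_srs srs_def] [-t_srs srs_def] [-order n] [-tps] [-et err_threshold] [-te xmin ymin xmax ymax] [-tr xres yres] [-ts width height] [-wo "NAME=VALUE"] [-ot Byte/Int16/...] [-wt Byte/Int16] [-srcnodata "value [value...]"] [-dstnodata "value [value...]"] [-dstalpha] [-rn] [-rb] [-rc] [-rcs] [-wm memory_in_mb] [-multi] [-q] [-of format] [-co "NAME=VALUE"]* srcfile* dstfile

Options

• -s_srs srs_def: the source spatial reference set. Any coordinate system that can be passed to OGRSpatialReference.SetFromUserInput() may be used, including EPSG PCS and GCSes (e.g. EPSG:4296), PROJ.4 declarations, or the name of a .prf file containing well known text. A PROJ.4 declaration is recommended.
• -t_srs srs_def: the target spatial reference set (see the explanation above).
• -order n: the order of the polynomial used for warping. The default is to select a polynomial order based on the number of GCPs.
• -tps: use the thin plate spline transformer based on the available GCPs. This can be used instead of the -order option.
• -et err_threshold: the error threshold for the transformation approximation (in pixel units, defaults to 0.125).
• -te xmin ymin xmax ymax: set the georeferenced extents of the output file to be created.
• -tr xres yres: set the output file resolution (in target georeferenced units).
• -ts width height: set the output file size in pixels and lines.
• -wo "NAME=VALUE": set a warp option. See GDALWarpOptions::papszWarpOptions for the available options. Multiple -wo options may be given.
• -ot type: set the data type of the output bands.
• -wt type: the working pixel data type, that is, the data type of pixels in the source image and destination image buffers.
• -rn: use nearest neighbour resampling (the default; fastest, but with the worst interpolation quality).
• -rb: use bilinear resampling.
• -rc: use cubic resampling.
• -rcs: use cubic spline resampling (the slowest method).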

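Installing GDAL

https://www.wendangku.net/doc/a89250976.html,/chimneyqin/blog/item/6c785aeac77ef0dfd439c9ef.html 2010-05-19 15:12

I have recently been learning how to call the GDAL library from VC to process remote sensing imagery. Here is a summary.

1. Installing GDAL

(1) Download the GDAL package and extract it to a directory such as C:\gdal. Here we assume that VC6 is installed in the default directory C:\Program Files\Microsoft Visual Studio8.

(2) Start cmd to open a console window. Change to the VC6 installation directory, for example cd C:\Program Files\Microsoft Visual Studio8\VC\bin\. This directory contains the file VCVARS32.BAT; run it. Then go back to C:\gdal and run nmake /f makefile.vc. When the build has finished, open C:\gdal\nmake.opt in Notepad and edit the line GDAL_HOME = to suit your setup. This is the final GDAL installation directory. If we install to C:\GDAL, for example, the line becomes GDAL_HOME = "C:\GDAL". In C:\gdalsrc run nmake /f makefile.vc install, followed by nmake /f makefile.vc devinstall. Everything we need is then installed under C:\GDAL.

2. Using GDAL

(1) In VS2005, create a new Win32 console application named testGDALconsole and copy gdal14.dll into testGDALconsole/debug. (Otherwise the program will report at run time that gdal14.dll cannot be found.)

(2) Tools/Options: under Library files and Include files, add GDAL's LIB directory and INCLUDE directory respectively. (Alternatively, copy gdal_priv.h directly into the directory that contains testGDALconsole.cpp.)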

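Building GDAL as a DLL Usable from C#

Goal: build GDAL as a DLL that can be used from C#.

Environment: GDAL-1.10.1, .NET Framework 4.0, VS 2010, swigwin-2.0.11.

Code errors encountered

1. Wrong function names

In the files Band.cs, Dataset.cs and Driver.cs in the folder ....\GDAL\gdal-1.10.1\swig\csharp\gdal, the function names BandUpcast, DatasetUpcast and DriverUpcast should be changed to Band_SWIGUpcast, Dataset_SWIGUpcast and Driver_SWIGUpcast respectively.

2. Duplicate definitions

In OgrPINVOK.cs and OsrPINVOK.cs in the folder ....\GDAL\gdal-1.10.1\swig\csharp\ogr, line 188 contains functions named static OgrPINVOKE() { } and static OsrPINVOKE() {}. These functions are defined twice; delete the duplicate definitions. Fix OsrPINVOK.cs in the folder ....\GDAL\gdal-1.10.1\swig\csharp\osr in the same way.

3. Security-transparent code problem, presumably caused by the .NET Framework version being too new

The folder ....\GDAL\gdal-1.10.1\swig\csharp\gdal contains many cs files. In each cs file that you need to use, add

using System.Security;
[SecuritySafeCritical]

Note that the Dispose() function on line 52 of Dataset.cs is overridden, so [SecuritySafeCritical] must also be written above this function, that is, on line 51. After that the code can be built.

Building

Open the VS command prompt (2010), change to ....\GDAL\gdal-1.10.1\ and run, in order:

nmake /f makefile.vc
nmake /f makefile.vc install
nmake /f makefile.vc devinstall

Then change to ....\GDAL\gdal-1.10.1\swig\csharp\ and run:

nmake /f makefile.vc
nmake /f makefile.vc install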


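GDAL Developer Documentation

All related pages are listed here:

• GDAL Data Model
• GDAL Driver Implementation Guide
• GDAL API Tutorial
• OGR API Usage Guide

Translator: 柴树杉 (chaishushan@https://www.wendangku.net/doc/a89250976.html,). Source: opencv-extension-library.

GDAL Data Model

Translator: 柴树杉 (chaishushan@https://www.wendangku.net/doc/a89250976.html,). Original: https://www.wendangku.net/doc/a89250976.html,/gdal_datamodel.html

This document briefly describes the GDAL data model, that is, the kinds of information the model can hold.

Dataset

A dataset (corresponding to the GDALDataset class) is a collection of raster data and the information related to it. In particular, the dataset holds the raster size (in pixels and lines). The dataset also specifies the coordinate system for its raster data. The dataset itself can carry metadata as well, organized as key/value pairs.

The GDAL dataset is based on the OpenGIS Grid Coverages specification.

Coordinate System

A dataset's coordinate system is defined by an OpenGIS WKT string, which contains:

• An overall coordinate system name.
• A geographic coordinate system name.
• A datum identifier.
• An ellipsoid name, semi-major axis and inverse flattening.
• A prime meridian name and its offset from the Greenwich meridian.
• A projection method type (e.g. Transverse Mercator).
• A list of projection parameters (e.g. the central meridian).
• A unit name and its conversion factor to meters or radians.
• Names and ordering of the axes.
• Codes in predefined authority coordinate systems (such as EPSG).

For more information, see the OpenGIS WKT coordinate system definitions, the osr tutorial, and the documentation of the OGRSpatialReference class.

In GDAL, the coordinate system is returned by GDALDataset::GetProjectionRef(). It describes the georeferenced coordinates implied by the affine georeferencing transform, which is returned by GDALDataset::GetGeoTransform(). The coordinate system of the georeferenced coordinates of the GCPs is returned by GDALDataset::GetGCPProjection().

Note that a returned coordinate system string of "" means that the georeferencing coordinate system is unknown.

Affine GeoTransform

A GDAL dataset has two ways of describing the relationship between raster positions (in pixel/line coordinates) and georeferenced coordinates. The first and more common one is the affine transform; the other is GCPs.

The affine transform consists of six coefficients, returned by GDALDataset::GetGeoTransform(), which map pixel/line coordinates to georeferenced coordinates using the following relationship:

Xgeo = GT(0) + Xpixel*GT(1) + Yline*GT(2)
Ygeo = GT(3) + Xpixel*GT(4) + Yline*GT(5)

For a north-up image, GT(2) and GT(4) are zero, GT(1) is the pixel width and GT(5) is the pixel height. The position (GT(0), GT(3)) is the top left corner of the image.

Note that the pixel/line coordinates above run from (0,0) at the top left corner to the bottom right corner, that is, the axes increase from left to right and from top to bottom (the rows and columns of the image are counted from the top left corner). The center of the top left pixel is at (0.5,0.5).

GCPs

A dataset can define its spatial referencing through a set of control points. All GCPs share one georeferencing coordinate system, returned by GDALDataset::GetGCPProjection(). Each GCP (corresponding to the GDAL_GCP class) contains the following:

typedef struct
{
    char *pszId;
    char *pszInfo;
    double dfGCPPixel;
    double dfGCPLine;
    double dfGCPX;
    double dfGCPY;
    double dfGCPZ;
} GDAL_GCP;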

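Using GDAL from VC and C#

Raster image operations with GDAL

GDAL is an open-source library for working with many raster and vector (through the ogr library) geospatial data formats. It covers reading, writing, converting and processing these formats (some formats do not support certain operations, such as writing). Even if you are not doing applied research in geography or remote sensing, GDAL is a very useful library, because it supports a large number of common image formats such as jpg and gif. The complete list of formats can be found at https://www.wendangku.net/doc/a89250976.html,/formats_list.html. Many software packages, Google Earth among them, already use GDAL as their back-end library. This article uses VC as the development platform to show how to operate on raster data with GDAL.

The Include directory holds the header files needed for development, and the lib directory holds the required lib files. In VC8, add these directories to the directory lists: choose "Tools - Options - Projects and Solutions - VC++ Directories" and add the two directories under "Include files" and "Library files" respectively. In the project's property pages, choose "Configuration Properties - Linker - Input" and add gdal_i-vc8.lib and gdal_id-vc8.lib, the two library files needed to use GDAL, to "Additional Dependencies". Alternatively, add the following two lines to the program:

#pragma comment(lib, "gdal_i-vc8.lib")
#pragma comment(lib, "gdal_id-vc8.lib")

The dynamic link libraries in the Bin directory must be placed where the program can find them, for example in windows\system32. The header file to include in the program is gdal_priv.h.

Now let us operate on an image file in C++. Before opening a file, the required drivers must be registered. Normally we can simply register all supported format drivers with GDALAllRegister(). The next step is opening the file. This is where the concept of a dataset comes in. The dataset is the core of the data in GDAL. Put simply, a dataset can be thought of as an image file: a jpeg file, for example, is one dataset. With some other file formats a dataset may comprise more than one file, for instance additional information files alongside the image data file. The most important component of a dataset is the band. The number of bands varies: an RGB true-color image has three bands, for red, green and blue; a grayscale image may have only one; and many remote sensing images have more than three. Besides the bands, a dataset also holds the coordinate system and projection information of the image, metadata, and other data.

A file is opened with GDALOpen( const char * pszFilename, GDALAccess eAccess ), where pszFilename is the file path and eAccess is the access mode, either GA_ReadOnly for read-only access or GA_Update to modify the file. For example, to open a tif file in read-only mode:

GDALDataset *poDataset;   // pointer to the dataset object
GDALAllRegister();        // register the drivers
poDataset = (GDALDataset *) GDALOpen( "c:\\terra335h_EV_250_Aggr500_RefSB_b0.tif", GA_ReadOnly );
if( poDataset != NULL /* check that the file was opened successfully */ )
{
    // do something
}
delete poDataset;         // release resources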


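GDAL API Tutorial

Opening the file

Before opening raster data supported by GDAL, the drivers must be registered. There is a driver for each data format that GDAL supports. Normally all known drivers are registered by calling GDALAllRegister(), which also covers drivers auto-loaded from .so files with GDALDriverManager::AutoLoadDrivers(). If a program needs to restrict the set of drivers, the code in gdalallregister.cpp can serve as a reference.

Once the drivers are registered, a dataset can be opened with the GDALOpen() function. The access mode can be GA_ReadOnly or GA_Update.

In C++:

#include "gdal_priv.h"

int main()
{
    GDALDataset *poDataset;

    GDALAllRegister();

    poDataset = (GDALDataset *) GDALOpen( pszFilename, GA_ReadOnly );
    if( poDataset == NULL )
    {
        ...;
    }

In C:

#include "gdal.h"

int main()
{
    GDALDatasetH hDataset;

    GDALAllRegister();

    hDataset = GDALOpen( pszFilename, GA_ReadOnly );
    if( hDataset == NULL )
    {
        ...;
    }

If GDALOpen() returns NULL, the open failed, and a corresponding error message has been emitted through CPLError(). If you need to handle errors yourself, see the CPLError() documentation. In general, all GDAL functions report errors through CPLError(). Note also that pszFilename does not have to be the name of an actual file (although it can be). How it is interpreted is up to the driver. It may be a URL, or a file name followed by a number of parameters that control how the dataset is opened. It is generally advisable not to restrict file types too much in a file-open dialog.

Getting dataset information

As described in the GDAL Data Model section, a GDALDataset contains a set of raster bands. It also contains metadata, a coordinate system, the projection type, the raster size and much other information.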

GDAL Library Study Notes

ZION GDAL Library Study Notes
Author: lilin
Source: https://www.wendangku.net/doc/a89250976.html,/
If you find problems in what I have written, or have any comments on it, please be sure to email me. Email ( linux_23@https://www.wendangku.net/doc/a89250976.html, )

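GDAL Library Study Notes (1): Introduction to the GDAL Library

1. Introduction

Maybe you don't do GIS and have no idea what this library is for, or what it has to do with Python. But if you work with GIS or RS, you ought to know its value. And even if you don't do GIS, I think this library should still be irresistible to you. Why? Read on.

First, the English introduction from the GDAL home page:

GDAL is a translator library for raster geospatial data formats that is released under an X/MIT style Open Source license by the Open Source Geospatial Foundation. As a library, it presents a single abstract data model to the calling application for all supported formats. It also comes with a variety of useful commandline utilities for data translation and processing.

Put simply, GDAL is a library for working with all kinds of raster geospatial data formats: reading, writing, converting and processing them (some formats do not support certain operations, such as writing). With one single abstract data model it supports most raster data (the power of GIS to abstract raster, vector and 3D data models really is impressive). Besides raster operations, the library also bundles another well-known library for vector data, ogr (covered separately), so it handles both raster and vector data. Buy one, get one free: why pass up such a bargain? Most important of all, the library is cross-platform and open source.

Its support for data formats has by now grown astonishingly strong. If you doubt that, look at the long list of formats GDAL supports. Scary, isn't it? Then look at the list, at the bottom of its home page, of software that uses it for low-level data handling. You may not know GRASS, and you may not know Quantum GIS (QGIS), but surely you know Google Earth. You don't? Go and download it and have a play: "ascend the summit, and all other mountains look small."

Some will say: but I don't do GIS. True, but the library is still very useful. First, what other library supports this many raster (image) formats, and can be used, in the same way, from C, C++, Python, Ruby, VB, Java and C# (not fully supported yet)? Second, 3S software need not be used only for 3S work (a lot of medical imagery is processed with PCI software). And even if your life has nothing to do with 3S at all, raster data is not used only in GIS. You can perfectly well use this library to read jpg, gif, tif, xpm and other formats. Its format support is unusually good, and a large share of non-standard files are handled very well too. I once used jai in Java together with a series of jai extension libraries; some images that many image viewers read correctly (some of them were not even non-standard) simply could not be read with jai.

The Python version of the library works well together with other Python libraries. The most direct and obvious support is that data is read and manipulated through the Numeric library, so every kind of matrix trick can be used to the full (an image is, after all, a matrix). In my view Python has a clear advantage over other languages for matrix operations: the code is much shorter and much nicer to read. And because Python is dynamically typed, code that handles raster data types is several times shorter than in statically typed languages (no need to handle double, byte, short and so on separately, which is a built-in advantage). That is why I like to use Python for image processing, and why even ESRI, the Microsoft of the GIS world, uses Python directly in ArcGIS 9 for importing and exporting raster data. In a word, it is just too convenient.

2. Installation

2.1. Installation on Windows

See the official installation documentation. Below are the steps I followed myself.

First download a release from https://www.wendangku.net/doc/a89250976.html,/dl/ and unpack it. Open a console and enter:

    "D:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\vcvars32.bat"

to set up the VC build environment. Open nmake.opt in the gdal folder and edit GDAL_HOME = "C:\warmerda\bld" so that the path points to the location where gdal should be installed.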

GDAL_CSharp Environment Configuration

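I. Downloading the GDAL C# DLLs

https://www.wendangku.net/doc/a89250976.html,/sdk/
https://www.wendangku.net/doc/a89250976.html,/sdk/PackageList.aspx?file=release-1400-gdal-1-10-1-mapserver-6-4-1.zip

Location inside the archive: the bin\gdal\csharp\... directory. During development, add the DLLs whose names end in _csharp.dll to the project references and copy the remaining ones into the debug directory.

II. Exception when calling Gdal.AllRegister(): "The type initializer for 'OSGeo.GDAL.GdalPINVOKE' threw an exception"

Cause: when gdal initializes, some of the DLLs it depends on are missing, which raises the exception. The Dependency Walker tool can be used to inspect the dependencies. Copying only the nine DLLs into the debug directory does not solve the problem.

Solution: use the GDAL initialization approach from SharpMap, which needs two files:

1. GdalConfiguration.cs
2. gdal_data_config.rar

Step 1: add GdalConfiguration.cs to the project, then unpack gdal_data_config.rar into the debug directory, into a folder named gdal.

Step 2: before initializing with Gdal.AllRegister(), call the following two statements to set up the required configuration:

    SharpMap.GdalConfiguration.ConfigureGdal();
    SharpMap.GdalConfiguration.ConfigureOgr();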

Attachment 1: GdalConfiguration.cs

    /******************************************************************************
     *
     * Name:     GdalConfiguration.cs.pp
     * Project:  GDAL CSharp Interface
     * Purpose:  A static configuration utility class to enable GDAL/OGR.
     * Author:   Felix Obermaier
     *
     *****************************************************************************/

    using System;
    using System.IO;
    using System.Reflection;
    using Gdal = OSGeo.GDAL.Gdal;
    using Ogr = OSGeo.OGR.Ogr;

    namespace SharpMap
    {
        public static partial class GdalConfiguration
        {
            private static bool _configuredOgr;
            private static bool _configuredGdal;

            /// <summary>
            /// Function to determine which platform we're on
            /// </summary>
            private static string GetPlatform()
            {
                return IntPtr.Size == 4 ? "x86" : "x64";
            }

            /// <summary>
            /// Construction of Gdal/Ogr
            /// </summary>
            static GdalConfiguration()
            {
                var executingAssemblyFile = new Uri(Assembly.GetExecutingAssembly().GetName().CodeBase).LocalPath;
                var executingDirectory = Path.GetDirectoryName(executingAssemblyFile);

                if (string.IsNullOrEmpty(executingDirectory))
                    throw new InvalidOperationException("cannot get executing directory");

                var gdalPath = Path.Combine(executingDirectory, "gdal");
                var nativePath = Path.Combine(gdalPath, GetPlatform());

                // Prepend native path to environment path, to ensure the
                // right libs are being used.
                var path = Environment.GetEnvironmentVariable("PATH");
                path = nativePath + ";" + Path.Combine(nativePath, "plugins") + ";" + path;
                Environment.SetEnvironmentVariable("PATH", path);

                // Set the additional GDAL environment variables.
                var gdalData = Path.Combine(gdalPath, "data");
                Environment.SetEnvironmentVariable("GDAL_DATA", gdalData);
                Gdal.SetConfigOption("GDAL_DATA", gdalData);

                var driverPath = Path.Combine(nativePath, "plugins");
                Environment.SetEnvironmentVariable("GDAL_DRIVER_PATH", driverPath);
                Gdal.SetConfigOption("GDAL_DRIVER_PATH", driverPath);

                Environment.SetEnvironmentVariable("GEOTIFF_CSV", gdalData);
                Gdal.SetConfigOption("GEOTIFF_CSV", gdalData);
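The path logic of the static constructor above can be restated compactly. The following Python sketch mirrors only the directory layout and environment variables that appear in the C# excerpt; it is not part of SharpMap, the names `platform_folder` and `gdal_environment` are invented for the illustration, and it uses the platform's own path-list separator where the C# code hard-codes ";".

```python
import os
import struct


def platform_folder():
    """Return "x86" for a 32-bit process and "x64" otherwise."""
    # struct.calcsize("P") is the pointer size in bytes, like IntPtr.Size.
    return "x86" if struct.calcsize("P") == 4 else "x64"


def gdal_environment(base_dir, current_path=""):
    """Compute the variables the C# constructor sets, without applying them."""
    gdal_path = os.path.join(base_dir, "gdal")
    native_path = os.path.join(gdal_path, platform_folder())
    plugins_path = os.path.join(native_path, "plugins")
    data_path = os.path.join(gdal_path, "data")
    return {
        # Native folders come first so that their libraries win the search.
        "PATH": os.pathsep.join([native_path, plugins_path, current_path]),
        "GDAL_DATA": data_path,
        "GDAL_DRIVER_PATH": plugins_path,
        "GEOTIFF_CSV": data_path,
    }


# Example with a hypothetical application directory.
for name, value in gdal_environment("app", os.environ.get("PATH", "")).items():
    print(name, "=", value)
```

Returning a dictionary instead of changing the environment directly keeps the computation easy to inspect; a caller that wants the same effect as the C# code could pass the result to `os.environ.update`.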

Installing the GDAL Library

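Brief introduction: OGR is a library for reading and processing GIS vector data. It can read and process many popular vector data formats. OGR is part of GDAL (https://www.wendangku.net/doc/a89250976.html,/), so once the GDAL library is installed you already have the OGR library.

I. Installation

1. First download a GDAL release (C++) (https://www.wendangku.net/doc/a89250976.html,/gdal/wiki/BuildHints).

2. Open a console (DOS) window, locate .....\Microsoft Visual Studio .NET 2010\VC\bin\vcvars32.bat and run it to set up the VC build environment.

Figure - Go to the VS installation directory and run the VCVARS32 file

3. Put the GDAL sources in a directory, for example C:\gdal-1.9.1\gdal-1.9.1. Open nmake.opt in that folder (for example with VS) and edit GDAL_HOME = "C:\GDAL" so that the path points to where gdal should be installed. Then cd into the directory where the sources were unpacked, C:\gdal-1.9.1\gdal-1.9.1.

Figure - Unpack gdal to drive C

Figure - Edit nmake.opt

Figure - cd into the unpacked directory

4. Then enter the following commands in the DOS window, in order:

    nmake /f makefile.vc
    nmake /f makefile.vc install
    nmake /f makefile.vc devinstall

Wait while the build runs. When it finishes, the files we need are copied to the installation directory set earlier, here C:\GDAL.

II. Usage

1. In the new project, set:

    Properties -> C/C++ -> General -> Additional Include Directories: "C:\GDAL\include"
    Properties -> Linker -> General -> Additional Library Directories: "C:\GDAL\lib"
    Properties -> Linker -> Input -> Additional Dependencies: gdal_i.lib

2. Copy C:\GDAL\bin\gdal14.dll into the debug folder of the new VS project (otherwise the program reports at run time that gdal14.dll cannot be found). The DLL name contains the GDAL version number, so it differs between releases. After that, add the header files you need and the library is ready to use.

OGR reference (https://www.wendangku.net/doc/a89250976.html,/ogr/index.html)

III. Example

Below is a GDAL example that reads and writes data, quoted from https://www.wendangku.net/doc/a89250976.html,/tangnf/archive/2008/10/26/3152538.aspx

    //
    #include "stdafx.h"
    #include "fangshibo.h"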

Introduction to the State Language Commission Modern Chinese Corpus - cssn

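Introduction to the State Language Commission Modern Chinese Corpus
Xiao Hang
Institute of Applied Linguistics, Ministry of Education
2012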

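Corpus Construction

• Construction of the State Language Commission corpus
  • December 1991: the State Language Commission proposed the project
  • April 1992: expert review meeting on the text-selection principles for the Modern Chinese Corpus
  • January 1993: the "Text-Selection Principles for the Modern Chinese Corpus" were drawn up
  • September 1993: expert approval meeting on text selection for the Modern Chinese Corpus
  • End of 1998: a raw corpus of 70 million characters was completed
  • To date, 100 million characters of raw text and 50 million characters of annotated text have been completed
  • Corpus construction and processing are still in progress
• Listed as a major research project of the State Language Commission under the Ninth and Tenth Five-Year Plans
• Supported by several projects under the "863" and "973" programs of the Ministry of Science and Technology
  • "Intelligent Chinese Information Processing Platform"
  • "Image, Speech and Natural Language Understanding"
  • "Basic Research on Applications of Chinese Information Processing"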

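Main Contents of the Corpus

• Raw corpus, without annotation
• Annotated corpus
  • Word segmentation
  • Part-of-speech tagging
• Syntactic treebank
  • Internal structure
  • External function
• Segmentation word list
  • 88,000 entries
  • Part-of-speech tags
  • Frequency information
• Corpus processing and annotation specifications
• Corpus software tools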

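Main Uses of the Corpus

• Main uses
  • Information processing of spoken and written language
  • Drafting of language norms and standards
  • Academic research on language
  • Language education
  • Applications of language in society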

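Sources of the Texts

• Texts from before 1993
  • Mainly keyed in by hand from printed editions
  • About 70 million characters
• Texts from 1993 to 2002
  • Partly keyed in by hand from printed editions
    • About 15 million characters
  • Partly taken from electronic texts on the web
    • About 15 million characters
• Texts from after 2002
  • Mainly electronic texts from the web
  • About 10 million characters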

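Classification of the Texts

• Three main categories
  • Humanities and social sciences
    • Includes texts on politics and law, history, society, economics, literature, the arts and similar subjects
  • Natural sciences
    • Texts from the natural sciences (including agriculture, industry, medicine, electronics, engineering technology and so on), covering every field of scientific and technological development
  • General
    • Practical writing
    • Texts that are hard to classify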