Glossary

Open Data Glossary

API – An API (Application Programming Interface) is a defined set of rules that lets one software product or service request data or functionality from another.

CSV – A CSV (Comma-Separated Values) file is a plain text file that stores spreadsheet or other basic tabular information in a simple format: each line is one row, and commas separate the values in that row.
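
As a rough illustration of the CSV format, the sketch below reads a small CSV text with Python's standard csv module. The column names and values are invented for this example and do not come from any real dataset.

```python
import csv
import io

# Hypothetical sample data, invented for illustration only.
raw_text = "name,city,zip\nAda,Boston,02108\nLin,Austin,78701\n"

# csv.DictReader treats the first line as the header row and
# returns every value as a string, so the leading zero in "02108" is kept.
rows = list(csv.DictReader(io.StringIO(raw_text)))

for row in rows:
    print(row["name"], row["city"], row["zip"])
```

Reading values as text is a deliberate choice here: spreadsheet tools often convert ZIP codes to numbers and drop leading zeros.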

Data – Facts and statistics collected together for reference or analysis.

Dataset – A dataset is an organized collection of data. The most basic representation of a dataset is a table. Each column of the table represents a particular variable. Each row corresponds to a single record and holds that record’s value for each column’s variable.

Data automation – An automatic process for storing, transmitting, and presenting data.

ETL – ETL is short for “Extract, Transform, Load,” three steps that are often combined into one tool to pull data out of one database, reshape it, and place it into another.
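
The three ETL steps can be sketched in a few lines of Python using the standard sqlite3 module. This is a minimal sketch, not a production pipeline: the table names (permits_raw, permits_clean), the columns, and the sample rows are all hypothetical, and two in-memory databases stand in for the source and target systems.

```python
import sqlite3

def extract(conn):
    # Extract: pull the raw rows out of the source database.
    return conn.execute("SELECT name, city FROM permits_raw").fetchall()

def transform(records):
    # Transform: trim stray whitespace and standardize capitalization.
    return [(name.strip().title(), city.strip().upper()) for name, city in records]

def load(conn, records):
    # Load: write the cleaned rows into the target database.
    conn.execute("CREATE TABLE IF NOT EXISTS permits_clean (name TEXT, city TEXT)")
    conn.executemany("INSERT INTO permits_clean VALUES (?, ?)", records)
    conn.commit()

# Hypothetical source database with messy sample rows.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE permits_raw (name TEXT, city TEXT)")
source.executemany(
    "INSERT INTO permits_raw VALUES (?, ?)",
    [("  ada lovelace ", "boston"), ("lin chen", " austin ")],
)

# Hypothetical target database.
target = sqlite3.connect(":memory:")
load(target, transform(extract(source)))

loaded = target.execute("SELECT name, city FROM permits_clean ORDER BY name").fetchall()
print(loaded)
```

Keeping each step in its own function makes it easier to swap out one stage, for example reading from a CSV file instead of a database, without touching the others.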

Filter – A way to narrow a search using specified conditions.
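
A filter can be illustrated with a short Python sketch that keeps only the records matching two conditions. The records and field names below are made up for the example.

```python
# Hypothetical records, invented for illustration only.
records = [
    {"permit_id": 1, "city": "Boston", "year": 2021},
    {"permit_id": 2, "city": "Austin", "year": 2022},
    {"permit_id": 3, "city": "Boston", "year": 2023},
]

# Keep only Boston permits issued in 2022 or later.
matches = [r for r in records if r["city"] == "Boston" and r["year"] >= 2022]

print(matches)
```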

Flat file – A file of data that does not contain links to other files.

Geospatial – Indicates that data has a geographic component. This means that the records in a dataset have locational information such as coordinates, address, city, or ZIP code.

GIS – Short for Geographic Information System. A GIS stores data in layers of spatial information that can be created, manipulated, analyzed, and mapped.

Metadata – Metadata is data that describes data. Metadata may describe how data is represented, ranges of acceptable values, and its relationship to other data. Metadata also may provide other relevant information, such as the person responsible for it, associated laws and regulations, and the access management policy.

Shapefile – A digital vector storage format for storing geographic information. Shapefiles can support point, line, and area features.

Source System – The Source System, or System of Record, is the information storage system that is the authoritative data source for a given data element or piece of information.

SQL – Structured Query Language (SQL) is a special-purpose programming language used for querying and editing information stored in a relational database management system.
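
To show what querying and editing look like in practice, the sketch below runs a few SQL statements against an in-memory SQLite database through Python's standard sqlite3 module. The libraries table and its rows are hypothetical and exist only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table and sample rows, invented for illustration only.
conn.execute("CREATE TABLE libraries (name TEXT, zip TEXT, visitors INTEGER)")
conn.executemany(
    "INSERT INTO libraries VALUES (?, ?, ?)",
    [("Central", "02108", 1200), ("Eastside", "02128", 450), ("Northgate", "02129", 800)],
)

# Querying: select the libraries with more than 500 visitors.
busy = conn.execute(
    "SELECT name FROM libraries WHERE visitors > 500 ORDER BY name"
).fetchall()

# Editing: update one row, then read the new value back.
conn.execute("UPDATE libraries SET visitors = 500 WHERE name = 'Eastside'")
updated = conn.execute(
    "SELECT visitors FROM libraries WHERE name = 'Eastside'"
).fetchone()[0]

print(busy, updated)
```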