pycsodata Documentation

pycsodata is an unofficial Python package for reading datasets published by the Central Statistics Office (CSO) of Ireland through its PxStat RESTful API. Much of its functionality is based on the CSO’s existing csodata R package, and it additionally merges datasets with spatial data automatically where available.

Installation

Installation is via pip:

pip install pycsodata

Tutorial

A walkthrough of package usage can be found in the tutorial.

Reference

For further information on package usage, see the documentation.

Notes

  • By default, the PxStat API metadata links CSO datasets to generalised versions of the spatial GeoJSON files rather than to files containing the most precise ungeneralised geometries. This reduces the size of downloads, and the generalised geometries should be adequate for most purposes (such as creating visualisations). In cases where more detailed spatial analysis is required, the ungeneralised spatial data can be downloaded from GeoHive.
  • There are a few CSO datasets which clearly have a spatial dimension (such as county, area of residence, or similar), but whose metadata does not include a link to a spatial data file. In these cases pycsodata will not be able to produce a GeoDataFrame and will raise an error when .gdf() is called. In most such cases the (generalised or ungeneralised) spatial data can be downloaded from GeoHive and manually merged with the DataFrame produced by pycsodata.
  • The default coordinate reference system (CRS) of the spatial data is the World Geodetic System 1984 (WGS 84, EPSG:4326), a geographic CRS whose coordinates are in degrees. The data should be reprojected to a projected CRS such as Irish Transverse Mercator (EPSG:2157), whose coordinates are in metres, before doing any distance or area calculations. For a geopandas GeoDataFrame, this is achieved by calling gdf.to_crs(epsg=2157).
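To illustrate why reprojection matters for the last note, the following minimal sketch uses only the Python standard library (it does not use pycsodata or geopandas) to compare the ground distance covered by one degree of latitude and one degree of longitude near Ireland. The haversine_km helper, the spherical Earth radius, and the sample point at 53°N, 8°W are assumptions chosen for illustration only.

```python
import math

# Mean Earth radius in km (spherical approximation, assumed for illustration)
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Illustrative point roughly in the middle of Ireland
lat, lon = 53.0, -8.0

# Ground distance covered by moving one degree north vs. one degree east
km_per_degree_lat = haversine_km(lat, lon, lat + 1.0, lon)
km_per_degree_lon = haversine_km(lat, lon, lat, lon + 1.0)

print(f"1 degree of latitude  ~ {km_per_degree_lat:.1f} km")
print(f"1 degree of longitude ~ {km_per_degree_lon:.1f} km")
```

At this latitude a degree of latitude spans about 111 km while a degree of longitude spans only about 67 km, so lengths and areas computed directly from EPSG:4326 coordinates are distorted. Reprojecting to a metre-based CRS such as EPSG:2157 avoids this.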

Code Provenance and AI Disclosure

The initial implementation of this package was written by the author, as were the README, the tutorial, and all of this page. AI assistance from Claude Opus 4.5, Claude Sonnet 4.5, and GPT-5.2 was used for refactoring, adding functions for caching, searching and sanitising, creating unit tests, and writing comprehensive docstrings. All code was manually reviewed and tested by the author.

Much of the functionality of pycsodata is based on the CSO’s official csodata R package. pycsodata acts as a Python wrapper for accessing the CSO’s PxStat RESTful API, and makes use of the pyjstat library. This site was created using Quarto, and the documentation was generated using quartodoc.