OCDS Kingfisher Colab 0.3.0

PyPI Version Build Status Coverage Status Python Version

A set of utility functions for Google Colaboratory notebooks using OCDS data.

If you are viewing this on GitHub, open the full documentation for additional details.

Troubleshooting

If you are using Kingfisher Colab in a Jupyter Notebook (not on Google Colaboratory), you need to:

  1. Install the google-colab package:

    pip install google-colab
    
  2. Upgrade the ipykernel package:

    pip install --upgrade ipykernel
    

API

ocdskingfishercolab.authenticate_gspread()[source]

Authenticates the current user and gives the notebook permission to connect to Google Spreadsheets.

Returns:a Google Sheets Client instance
Return type:gspread.Client
ocdskingfishercolab.authenticate_pydrive()[source]

Authenticates the current user and gives the notebook permission to connect to Google Drive.

Returns:a GoogleDrive instance
Return type:pydrive.drive.GoogleDrive
ocdskingfishercolab.set_spreadsheet_name(name)[source]

Sets the name of the spreadsheet to which to save.

Used by ocdskingfishercolab.save_dataframe_to_sheet().

Parameters:name (str) – a spreadsheet name
ocdskingfishercolab.list_source_ids(pattern='')[source]

Returns, as a ResultSet or DataFrame, a list of source IDs matching the given pattern.

Parameters:pattern (str) – a substring, like “paraguay”
Returns:the results as a pandas DataFrame or an ipython-sql ResultSet, depending on whether %config SqlMagic.autopandas is True or False respectively. This is the same behaviour as ipython-sql’s %sql magic.
Return type:pandas.DataFrame or sql.run.ResultSet
ocdskingfishercolab.list_collections(source_id)[source]

Returns, as a ResultSet or DataFrame, a list of collections with the given source ID.

Parameters:source_id (str) – a source ID
Returns:the results as a pandas DataFrame or an ipython-sql ResultSet, depending on whether %config SqlMagic.autopandas is True or False respectively. This is the same behaviour as ipython-sql’s %sql magic.
Return type:pandas.DataFrame or sql.run.ResultSet
ocdskingfishercolab.set_search_path(schema_name)[source]

Sets the search_path to the given schema, followed by the public schema.

Parameters:schema_name (str) – a schema name
ocdskingfishercolab.save_dataframe_to_sheet(dataframe, sheetname, prompt=True)[source]

Saves a data frame to a worksheet in Google Sheets, after asking the user for confirmation.

Use ocdskingfishercolab.set_spreadsheet_name() to set the spreadsheet name.

Parameters:
  • dataframe (pandas.DataFrame) – a data frame
  • sheetname (str) – a sheet name
  • prompt (bool) – whether to prompt the user
ocdskingfishercolab.save_dataframe_to_spreadsheet(dataframe, name)[source]

Dumps the release_package column of a data frame to a JSON file, converts the JSON file to an Excel file, and uploads the Excel file to Google Drive.

Parameters:
  • dataframe (pandas.DataFrame) – a data frame
  • name (str) – the basename of the Excel file to write
ocdskingfishercolab.download_dataframe_as_csv(dataframe, filename)[source]

Converts the data frame to a CSV file, and invokes a browser download of the CSV file to your local computer.

Parameters:
  • dataframe (pandas.DataFrame) – a data frame
  • filename (str) – a file name
ocdskingfishercolab.download_data_as_json(data, filename)[source]

Dumps the data to a JSON file, and invokes a browser download of the CSV file to your local computer.

Parameters:
  • data – JSON-serializable data
  • filename (str) – a file name
ocdskingfishercolab.get_ipython_sql_resultset_from_query(sql)[source]

Executes a SQL statement and returns a ResultSet.

Parameters are taken from the scope this function is called from (same behaviour as ipython-sql’s %sql magic).

Parameters:sql (str) – a SQL statement
Returns:the results as a ResultSet
Return type:sql.run.ResultSet
ocdskingfishercolab.download_package_from_query(sql, package_type=None)[source]

Executes a SQL statement that SELECTs only the data column of the data table, and invokes a browser download of the packaged data to your local computer.

Parameters:
  • sql (str) – a SQL statement
  • package_type (str) – “record” or “release”
Raises:

UnknownPackageTypeError – when the provided package type is unknown

ocdskingfishercolab.download_package_from_ocid(collection_id, ocid, package_type)[source]

Selects all releases with the given ocid from the given collection, and invokes a browser download of the packaged releases to your local computer.

Parameters:
  • collection_id (int) – a collection’s ID
  • ocid (str) – an OCID
  • package_type (str) – “record” or “release”
Raises:

UnknownPackageTypeError – when the provided package type is unknown

ocdskingfishercolab.write_data_as_json(data, filename)[source]

Dumps the data to a JSON file.

Parameters:
  • data – JSON-serializable data
  • filename (str) – a file name
exception ocdskingfishercolab.OCDSKingfisherColabError[source]

Base class for exceptions from within this package

exception ocdskingfishercolab.UnknownPackageTypeError[source]

Raised when the provided package type is unknown