OCDS Kingfisher Colab 0.3.0
A set of utility functions for Google Colaboratory notebooks using OCDS data.
If you are viewing this on GitHub, open the full documentation for additional details.
Troubleshooting
If you are using Kingfisher Colab in a Jupyter Notebook (not on Google Colaboratory), you need to:
- Install the google-colab package: pip install google-colab
- Upgrade the ipykernel package: pip install --upgrade ipykernel
API
ocdskingfishercolab.authenticate_gspread()
Authenticates the current user and gives the notebook permission to connect to Google Sheets.
Returns: a Google Sheets Client instance
Return type: gspread.Client
ocdskingfishercolab.authenticate_pydrive()
Authenticates the current user and gives the notebook permission to connect to Google Drive.
Returns: a GoogleDrive instance
Return type: pydrive.drive.GoogleDrive
ocdskingfishercolab.set_spreadsheet_name(name)
Sets the name of the spreadsheet to which to save. Used by ocdskingfishercolab.save_dataframe_to_sheet().
Parameters: name (str) – a spreadsheet name
ocdskingfishercolab.list_source_ids(pattern='')
Returns, as a ResultSet or DataFrame, a list of source IDs matching the given pattern.
Parameters: pattern (str) – a substring, like “paraguay”
Returns: the results as a pandas DataFrame or an ipython-sql ResultSet, depending on whether %config SqlMagic.autopandas is True or False respectively. This is the same behaviour as ipython-sql’s %sql magic.
Return type: pandas.DataFrame or sql.run.ResultSet
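The substring matching described for list_source_ids() can be illustrated with a small, self-contained sketch. It is not the library's implementation: it uses an in-memory SQLite database in place of the Kingfisher database, and the collection table layout, the helper name list_matching_source_ids, and the sample source IDs are all assumptions made for illustration.

```python
import sqlite3


def list_matching_source_ids(connection, pattern=""):
    """Return distinct source IDs containing the given substring (hypothetical helper)."""
    sql = (
        "SELECT DISTINCT source_id FROM collection "
        "WHERE source_id LIKE :pattern ORDER BY source_id"
    )
    # Wrapping the pattern in % wildcards turns LIKE into a substring match.
    rows = connection.execute(sql, {"pattern": f"%{pattern}%"}).fetchall()
    return [row[0] for row in rows]


# Assumed, simplified table layout with made-up sample data.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE collection (id INTEGER PRIMARY KEY, source_id TEXT)")
connection.executemany(
    "INSERT INTO collection (source_id) VALUES (?)",
    [
        ("paraguay_dncp_records",),
        ("paraguay_dncp_releases",),
        ("scotland",),
        ("paraguay_dncp_records",),
    ],
)

print(list_matching_source_ids(connection, "paraguay"))
```

With the default empty pattern, every source ID in the sample table is returned, which mirrors the pattern='' default in the signature above.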
ocdskingfishercolab.list_collections(source_id)
Returns, as a ResultSet or DataFrame, a list of collections with the given source ID.
Parameters: source_id (str) – a source ID
Returns: the results as a pandas DataFrame or an ipython-sql ResultSet, depending on whether %config SqlMagic.autopandas is True or False respectively. This is the same behaviour as ipython-sql’s %sql magic.
Return type: pandas.DataFrame or sql.run.ResultSet
ocdskingfishercolab.set_search_path(schema_name)
Sets the search_path to the given schema, followed by the public schema.
Parameters: schema_name (str) – a schema name
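As a rough illustration of what setting the search_path to a schema followed by public means in SQL terms, the sketch below builds the corresponding PostgreSQL statement. The helper name build_search_path_statement and the identifier check are assumptions for this example; the statement the library actually issues may differ.

```python
import re


def build_search_path_statement(schema_name):
    """Return a SET search_path statement for the schema, then public (hypothetical helper)."""
    # Accept only plain identifiers, since the name is interpolated into SQL.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", schema_name):
        raise ValueError(f"not a plain schema name: {schema_name!r}")
    return f"SET search_path = {schema_name}, public"


print(build_search_path_statement("my_schema"))
```

Listing the given schema first means unqualified table names resolve there before falling back to public.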
ocdskingfishercolab.save_dataframe_to_sheet(dataframe, sheetname, prompt=True)
Saves a data frame to a worksheet in Google Sheets. If prompt is True, asks the user for confirmation first.
Use ocdskingfishercolab.set_spreadsheet_name() to set the spreadsheet name.
Parameters:
- dataframe (pandas.DataFrame) – a data frame
- sheetname (str) – a sheet name
- prompt (bool) – whether to prompt the user
ocdskingfishercolab.save_dataframe_to_spreadsheet(dataframe, name)
Dumps the release_package column of a data frame to a JSON file, converts the JSON file to an Excel file, and uploads the Excel file to Google Drive.
Parameters:
- dataframe (pandas.DataFrame) – a data frame
- name (str) – the basename of the Excel file to write
ocdskingfishercolab.download_dataframe_as_csv(dataframe, filename)
Converts the data frame to a CSV file, and invokes a browser download of the CSV file to your local computer.
Parameters:
- dataframe (pandas.DataFrame) – a data frame
- filename (str) – a file name
ocdskingfishercolab.download_data_as_json(data, filename)
Dumps the data to a JSON file, and invokes a browser download of the JSON file to your local computer.
Parameters:
- data – JSON-serializable data
- filename (str) – a file name
ocdskingfishercolab.get_ipython_sql_resultset_from_query(sql)
Executes a SQL statement and returns a ResultSet.
Parameters are taken from the scope this function is called from (same behaviour as ipython-sql’s %sql magic).
Parameters: sql (str) – a SQL statement
Returns: the results as a ResultSet
Return type: sql.run.ResultSet
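To make "parameters are taken from the scope this function is called from" more concrete, here is a minimal sketch of the general idea using only the standard library. It is not how ipython-sql or this library is implemented: the helper name query_with_caller_scope, the sample_releases table, and the sample OCIDs are hypothetical, and SQLite stands in for the real database.

```python
import inspect
import re
import sqlite3


def query_with_caller_scope(connection, sql):
    """Bind :name placeholders from the caller's variables (hypothetical helper)."""
    frame = inspect.currentframe().f_back
    # Locals take precedence over globals, as in normal name lookup.
    scope = {**frame.f_globals, **frame.f_locals}
    # Naive placeholder scan; good enough for a sketch, not for arbitrary SQL.
    names = re.findall(r":(\w+)", sql)
    params = {name: scope[name] for name in names}
    return connection.execute(sql, params).fetchall()


# Made-up sample data in an in-memory database.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE sample_releases (ocid TEXT, collection_id INTEGER)")
connection.executemany(
    "INSERT INTO sample_releases VALUES (?, ?)",
    [("ocds-213czf-1", 1), ("ocds-213czf-2", 2)],
)

collection_id = 2  # picked up by name from the calling scope
rows = query_with_caller_scope(
    connection,
    "SELECT ocid FROM sample_releases WHERE collection_id = :collection_id",
)
print(rows)
```

The caller never passes collection_id explicitly; the placeholder name alone determines which variable is bound.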
ocdskingfishercolab.download_package_from_query(sql, package_type=None)
Executes a SQL statement that SELECTs only the data column of the data table, and invokes a browser download of the packaged data to your local computer.
Parameters:
- sql (str) – a SQL statement
- package_type (str) – “record” or “release”
Raises: UnknownPackageTypeError – when the provided package type is unknown
ocdskingfishercolab.download_package_from_ocid(collection_id, ocid, package_type)
Selects all releases with the given ocid from the given collection, and invokes a browser download of the packaged releases to your local computer.
Parameters:
- collection_id (int) – a collection’s ID
- ocid (str) – an OCID
- package_type (str) – “record” or “release”
Raises: UnknownPackageTypeError – when the provided package type is unknown
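The package_type check can be pictured with the following sketch. Everything in it is an assumption for illustration: the exception class is a local stand-in that only shares its name with the library's, wrap_releases is a hypothetical helper, and the packages omit the metadata fields a real OCDS package carries. How the library actually builds its packages is not shown in this reference.

```python
class UnknownPackageTypeError(Exception):
    """Local stand-in for the library's exception of the same name."""


def wrap_releases(releases, ocid, package_type):
    """Wrap releases in a bare-bones release or record package (hypothetical helper)."""
    if package_type == "release":
        return {"releases": releases}
    if package_type == "record":
        # A record groups the releases that share one OCID.
        return {"records": [{"ocid": ocid, "releases": releases}]}
    raise UnknownPackageTypeError(
        f"package_type must be 'record' or 'release', not {package_type!r}"
    )


sample = [{"ocid": "ocds-213czf-1", "id": "1"}]  # made-up release
print(wrap_releases(sample, "ocds-213czf-1", "release"))

try:
    wrap_releases(sample, "ocds-213czf-1", "bundle")
except UnknownPackageTypeError as error:
    print(error)
```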
ocdskingfishercolab.write_data_as_json(data, filename)
Dumps the data to a JSON file.
Parameters:
- data – JSON-serializable data
- filename (str) – a file name
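Dumping JSON-serializable data to a file can be done with the standard library alone, as in the sketch below. The helper name write_json_sketch and the sample data are assumptions; this shows the general technique rather than the library's own code.

```python
import json
import os
import tempfile


def write_json_sketch(data, filename):
    """Serialize JSON-serializable data to the named file (hypothetical helper)."""
    with open(filename, "w", encoding="utf-8") as f:
        json.dump(data, f)


# Write into a temporary directory so the example leaves the working directory untouched.
path = os.path.join(tempfile.mkdtemp(), "example.json")
write_json_sketch({"releases": []}, path)

with open(path, encoding="utf-8") as f:
    print(f.read())
```

Data that is not JSON-serializable (for example, a set) makes json.dump raise a TypeError, which is why the parameter is documented as JSON-serializable data.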