External Tools

External tools can provide additional features that are not part of the Dataverse Software itself, such as data file previews, visualization, and curation.

Inventory of External Tools

Tool

Type

Scope

Description

Data Explorer

explore

file

A GUI that lists the variables in a tabular data file and supports searching, charting, and cross-tabulation analysis. See the README.md file at https://github.com/scholarsportal/dataverse-data-explorer-v2 for instructions on adding Data Explorer to your Dataverse installation.

Whole Tale

explore

dataset

A platform for the creation of reproducible research packages that allows users to launch containerized interactive analysis environments based on popular tools such as Jupyter and RStudio. Using this integration, Dataverse users can launch Jupyter and RStudio environments to analyze published datasets. For more information, see the Whole Tale User Guide.

Binder

explore

dataset

Binder allows you to spin up custom computing environments in the cloud (including Jupyter notebooks) with the files from your dataset. See https://github.com/IQSS/dataverse-binder-redirect for installation instructions.

File Previewers

explore

file

A set of tools that display the content of files (including audio, HTML, Hypothes.is annotations, images, PDF, Markdown, text, video, tabular data, spreadsheets, GeoJSON, zip, and NcML files) so that they can be viewed without downloading the file. The previewers can be run directly from github.io, so the only required step is using the Dataverse API to register the ones you want to use. Documentation, including how to optionally brand the previewers, and an invitation to contribute through GitHub are in the README.md file at https://github.com/gdcc/dataverse-previewers. Initial development was led by the Qualitative Data Repository, and the spreadsheet previewer was added by the Social Sciences and Humanities Open Cloud (SSHOC) project.

Data Curation Tool

configure

file

A GUI for curating data by adding labels, groups, weights and other details to assist with informed reuse. See the README.md file at https://github.com/scholarsportal/Dataverse-Data-Curation-Tool for the installation instructions.

Ask the Data

query

file

Ask the Data is an experimental tool that allows you to ask natural language questions about the data contained in Dataverse tables (tabular data). See the README.md file at https://github.com/IQSS/askdataverse/tree/main/askthedata for instructions on adding Ask the Data to your Dataverse installation.

TurboCurator by ICPSR

configure

dataset

TurboCurator generates metadata improvements for title, description, and keywords. It relies on OpenAI's ChatGPT and ICPSR best practices. See the TurboCurator Dataverse Administrator page for more details on how it works and on adding TurboCurator to your Dataverse installation.

JupyterHub

explore

file

The Dataverse-to-JupyterHub Data Transfer Connector simplifies the transfer of data between Dataverse repositories and the cloud-based platform JupyterHub. It is designed for researchers, scientists, and data analysts who collaborate on projects and need to move datasets and files between the two. The tool is a lightweight client-side web application built with React that relies on the Dataverse External Tool feature, which makes it easy to deploy on modern integration systems. It is currently optimized for small to medium-sized files; future plans include support for larger files and signed Dataverse endpoints. For more details, see the external tool manifest: https://forgemia.inra.fr/dipso/eosc-pillar/dataverse-jupyterhub-connector/-/blob/master/externalTools.json

3DViewer by openforestdata.pl

explore

file

The 3DViewer by openforestdata.pl can be used to explore 3D files (e.g. STL format). It was presented by Kamil Guryn during the 2020 community meeting (slide deck, video) and can be found at https://github.com/OpenForestData/open-forest-data-previewers

Managing External Tools

Adding External Tools to a Dataverse Installation

To add an external tool to your Dataverse installation you must first download a JSON file for that tool, which we refer to as a “manifest”. It should look something like this:

{
  "displayName": "Fabulous File Tool",
  "description": "A non-existent tool that is fabulous fun for files!",
  "toolName": "fabulous",
  "scope": "file",
  "types": [
    "explore",
    "preview"
  ],
  "toolUrl": "https://fabulousfiletool.com",
  "contentType": "text/tab-separated-values",
  "httpMethod": "GET",
  "toolParameters": {
    "queryParameters": [
      {
        "fileid": "{fileId}"
      },
      {
        "datasetPid": "{datasetPid}"
      },
      {
        "locale": "{localeCode}"
      }
    ]
  },
  "allowedApiCalls": [
    {
      "name": "retrieveDataFile",
      "httpMethod": "GET",
      "urlTemplate": "/api/v1/access/datafile/{fileId}",
      "timeOut": 270
    }
  ]
}
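
The placeholders in curly braces under "queryParameters" are filled in by Dataverse when a user launches the tool. The following Python sketch only illustrates the idea; it is not Dataverse's own code, and the function name build_launch_url and the sample values are assumptions made for this example.

```python
import json
from urllib.parse import urlencode

# Trimmed copy of the example manifest (only the fields used here).
MANIFEST = json.loads("""
{
  "toolUrl": "https://fabulousfiletool.com",
  "toolParameters": {
    "queryParameters": [
      {"fileid": "{fileId}"},
      {"datasetPid": "{datasetPid}"},
      {"locale": "{localeCode}"}
    ]
  }
}
""")


def build_launch_url(manifest: dict, values: dict) -> str:
    """Fill each {placeholder} and append the result to toolUrl as a query string."""
    pairs = []
    for parameter in manifest["toolParameters"]["queryParameters"]:
        for name, template in parameter.items():
            # str.format replaces "{fileId}" with values["fileId"], and so on.
            pairs.append((name, template.format(**values)))
    # urlencode percent-encodes characters such as ":" and "/" in the values.
    return manifest["toolUrl"] + "?" + urlencode(pairs)


# Hypothetical values for one file; a real installation supplies its own.
example_values = {
    "fileId": "42",
    "datasetPid": "doi:10.5072/FK2/EXAMPLE",
    "localeCode": "en",
}
print(build_launch_url(MANIFEST, example_values))
```

The sketch keeps the parameter names on the left ("fileid", "locale") exactly as the manifest spells them, since those are what the tool itself receives.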

Go to Inventory of External Tools and download a JSON manifest for one of the tools by following the links in its description to the installation instructions.

Configure the tool with the curl command below, making sure to replace the fabulousFileTool.json placeholder with the name of the JSON manifest file you downloaded.

curl -X POST -H 'Content-type: application/json' http://localhost:8080/api/admin/externalTools --upload-file fabulousFileTool.json
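
If you would rather script this step than run curl by hand, the same request can be assembled with Python's standard library. This is a minimal sketch under stated assumptions: the helper name build_register_request is made up for this example, the stand-in manifest is not a complete manifest, and the line that actually sends the request is left commented out so that nothing is changed on your installation by accident.

```python
import json
import urllib.request


def build_register_request(base_url: str, manifest_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) the POST request that registers a tool manifest."""
    return urllib.request.Request(
        url=f"{base_url}/api/admin/externalTools",
        data=manifest_bytes,
        headers={"Content-type": "application/json"},
        method="POST",
    )


# Stand-in payload for illustration only. In practice, read the manifest you
# downloaded, e.g. pathlib.Path("fabulousFileTool.json").read_bytes().
manifest_bytes = json.dumps({"displayName": "Fabulous File Tool"}).encode("utf-8")

request = build_register_request("http://localhost:8080", manifest_bytes)
print(request.get_method(), request.full_url)

# To send the request, uncomment the following lines:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```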

Listing All External Tools in a Dataverse Installation

To list all the external tools that are available in a Dataverse installation:

curl http://localhost:8080/api/admin/externalTools
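
The listing is a convenient place to look up a tool's database id. The sketch below maps display names to ids. Note that the response shape used here (a "data" array of objects with "id" and "displayName" fields) is an assumption for illustration, as is the helper name tool_ids_by_name; compare it with the actual output of your installation before relying on it.

```python
import json

# ASSUMED response shape, for illustration only; check it against real output.
SAMPLE_RESPONSE = """
{
  "status": "OK",
  "data": [
    {"id": 1, "displayName": "Fabulous File Tool", "scope": "file"},
    {"id": 2, "displayName": "Another Tool", "scope": "dataset"}
  ]
}
"""


def tool_ids_by_name(response_text: str) -> dict:
    """Map each tool's display name to its database id."""
    payload = json.loads(response_text)
    return {tool["displayName"]: tool["id"] for tool in payload.get("data", [])}


print(tool_ids_by_name(SAMPLE_RESPONSE))
```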

Showing an External Tool in a Dataverse Installation

To show one of the external tools that are available in a Dataverse installation, pass its database id:

export TOOL_ID=1
curl http://localhost:8080/api/admin/externalTools/$TOOL_ID

Removing an External Tool From a Dataverse Installation

Assuming the external tool database id is “1”, remove it with the following command:

export TOOL_ID=1
curl -X DELETE http://localhost:8080/api/admin/externalTools/$TOOL_ID
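
Showing and removing a tool use the same URL and differ only in the HTTP method. The following sketch builds either request with Python's standard library. The helper name build_tool_request is an assumption for this example, and the request is deliberately not sent, because a DELETE cannot be undone from the client side.

```python
import urllib.request


def build_tool_request(base_url: str, tool_id: int, method: str = "GET") -> urllib.request.Request:
    """Build (but do not send) a request that shows or removes one external tool."""
    if method not in ("GET", "DELETE"):
        raise ValueError(f"unsupported method: {method}")
    # int() rejects ids such as "1; rm" before they reach the URL.
    url = f"{base_url}/api/admin/externalTools/{int(tool_id)}"
    return urllib.request.Request(url=url, method=method)


show_request = build_tool_request("http://localhost:8080", 1)
delete_request = build_tool_request("http://localhost:8080", 1, method="DELETE")
print(show_request.get_method(), show_request.full_url)
print(delete_request.get_method(), delete_request.full_url)

# To send one of them, pass it to urllib.request.urlopen(...).
```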

Testing External Tools

Once you have added an external tool to your Dataverse installation, you will probably want to test it to make sure it is functioning properly.

File Level vs. Dataset Level

File level tools are specific to the file type (content type or MIME type). For example, a tool may work with PDFs, which have a content type of “application/pdf”.

In contrast, dataset level tools are always available no matter what file types are within the dataset.
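
When testing, it can help to work out in advance which tools you expect to see for a given file. The sketch below models the rule described above in a simplified way: file level tools are matched on content type, while dataset level tools are always offered. It is not Dataverse's actual selection logic, and apart from "fabulous" the tool entries are hypothetical.

```python
# Simplified tool records; only the manifest fields relevant to matching.
TOOLS = [
    {"toolName": "fabulous", "scope": "file", "contentType": "text/tab-separated-values"},
    {"toolName": "pdfviewer", "scope": "file", "contentType": "application/pdf"},  # hypothetical
    {"toolName": "notebooks", "scope": "dataset"},  # hypothetical
]


def file_tools(tools: list, content_type: str) -> list:
    """Names of file level tools registered for the given content type."""
    return [
        tool["toolName"]
        for tool in tools
        if tool["scope"] == "file" and tool.get("contentType") == content_type
    ]


def dataset_tools(tools: list) -> list:
    """Names of dataset level tools, which do not depend on file types."""
    return [tool["toolName"] for tool in tools if tool["scope"] == "dataset"]


print(file_tools(TOOLS, "application/pdf"))
print(file_tools(TOOLS, "image/png"))
print(dataset_tools(TOOLS))
```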

File Level Explore Tools

File level explore tools provide a variety of features from data visualization to statistical analysis.

For each supported file type, file level explore tools appear in the file listing of the dataset page as well as under the “Access” button on each file page.

File Level Preview Tools

File level preview tools allow the user to see a preview of the file contents without having to download it.

When a file has a preview available, a preview icon appears next to that file in the file listing on the dataset page. On the file page itself, the preview appears in a Preview tab (renamed File Tools if multiple tools are available). It is shown immediately unless the file has a guestbook or terms, in which case it is shown once the guestbook has been filled in and the terms have been agreed to.

File Level Query Tools

File level query tools allow the user to ask questions (e.g. natural language queries) of a data table’s contents without having to download it.

When a file has a query tool available, a query icon appears next to that file in the file listing on the dataset page. On the file page itself, the query tool appears in a Query tab (renamed File Tools if multiple tools are available). It is shown immediately unless the file has a guestbook or terms, in which case it is shown once the guestbook has been filled in and the terms have been agreed to.

File Level Configure Tools

File level configure tools are available only when you are logged in and have write access to the file. The file type determines whether a configure tool is available. For example, a configure tool may be available only for tabular files.

Dataset Level Explore Tools

Dataset level explore tools allow the user to explore all the files in a dataset.

Dataset Level Configure Tools

Dataset level configure tools can be launched by users who have edit access to the dataset. These tools are found under the “Edit Dataset” menu.

Writing Your Own External Tool

If you plan to write an external tool, see the Building External Tools section of the API Guide.

If you have an idea for an external tool, please let the Dataverse Project community know by posting about it on the dataverse-community mailing list: https://groups.google.com/forum/#!forum/dataverse-community