External tools can provide additional features that are not part of Dataverse itself, such as data exploration. Thank you for your interest in building an external tool for Dataverse!
You can think of an external tool as a glorified hyperlink that opens a browser window in a new tab on some other website. The term “external” is used to indicate that the user has left the Dataverse web interface. For example, perhaps the user is looking at a dataset on https://demo.dataverse.org . They click “Explore” and are brought to https://fabulousfiletool.com?fileId=42&siteUrl=http://demo.dataverse.org
The “other website” (fabulousfiletool.com in the example above) is probably part of the same ecosystem of scholarly publishing that Dataverse itself participates in. Sometimes the other website runs entirely in the browser. Sometimes the other website is a full-blown server-side web application like Dataverse itself.
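From the tool's side, the "glorified hyperlink" is simply an incoming URL whose query string carries the context. The following is a minimal sketch of reading that query string with Python's standard library. The function name `read_launch_parameters` is hypothetical, and a real tool would do this inside whatever web framework or browser code it already uses.

```python
from urllib.parse import urlsplit, parse_qs


def read_launch_parameters(launch_url: str) -> dict:
    """Return the query parameters of a launch URL as a flat dict.

    Hypothetical helper for illustration only; it is not part of Dataverse.
    """
    query = urlsplit(launch_url).query
    # parse_qs returns a list of values per key; keep the first value of each.
    return {key: values[0] for key, values in parse_qs(query).items()}


# The example URL from the paragraph above.
params = read_launch_parameters(
    "https://fabulousfiletool.com?fileId=42&siteUrl=http://demo.dataverse.org"
)
print(params)
```

With the example URL, the tool learns which file was selected (`fileId`) and which Dataverse installation sent the user (`siteUrl`).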
The possibilities for external tools are endless. Let’s look at some examples to get your creative juices flowing.
Note: This is the same list that appears in the External Tools section of the Admin Guide.
Tool | Type | Scope | Description |
---|---|---|---|
TwoRavens | explore | file | A system of interlocking statistical tools for data exploration, analysis, and meta-analysis: http://2ra.vn. See the TwoRavens: Tabular Data Exploration section of the User Guide for more information on TwoRavens from the user perspective and the TwoRavens section of the Installation Guide. |
Data Explorer | explore | file | A GUI which lists the variables in a tabular data file allowing searching, charting and cross tabulation analysis. See the README.md file at https://github.com/scholarsportal/Dataverse-Data-Explorer for the instructions on adding Data Explorer to your Dataverse; and the Prerequisites section of the Installation Guide for the instructions on how to set up basic R configuration required (specifically, Dataverse uses R to generate .prep metadata files that are needed to run Data Explorer). |
Whole Tale | explore | dataset | A platform for the creation of reproducible research packages that allows users to launch containerized interactive analysis environments based on popular tools such as Jupyter and RStudio. Using this integration, Dataverse users can launch Jupyter and RStudio environments to analyze published datasets. For more information, see the Whole Tale User Guide. |
File Previewers | explore | file | A set of tools that display the content of files - including audio, html, Hypothes.is annotations, images, PDF, text, video, tabular data, and spreadsheets - allowing them to be viewed without downloading. The previewers can be run directly from github.io, so the only required step is using the Dataverse API to register the ones you want to use. Documentation, including how to optionally brand the previewers, and an invitation to contribute through GitHub are in the README.md file. Initial development was led by the Qualitative Data Repository and the spreadsheet previewer was added by the Social Sciences and Humanities Open Cloud (SSHOC) project. https://github.com/GlobalDataverseCommunityConsortium/dataverse-previewers |
Data Curation Tool | configure | file | A GUI for curating data by adding labels, groups, weights and other details to assist with informed reuse. See the README.md file at https://github.com/scholarsportal/Dataverse-Data-Curation-Tool for the installation instructions. |
An external tool can appear in Dataverse in one of three ways:
See also the Testing External Tools section of the Admin Guide for some perspective on how installations of Dataverse will expect to test your tool before announcing it to their users.
External tools must be expressed in an external tool manifest file, a specific JSON format Dataverse requires. As the author of an external tool, you are expected to provide this JSON file and installation instructions on a web page for your tool.
Let’s look at two examples of external tool manifests (one at the file level and one at the dataset level) before we dive into how they work.
fabulousFileTool.json is a file level explore tool that operates on tabular files:
{
  "displayName": "Fabulous File Tool",
  "description": "Fabulous Fun for Files!",
  "scope": "file",
  "type": "explore",
  "toolUrl": "https://fabulousfiletool.com",
  "contentType": "text/tab-separated-values",
  "toolParameters": {
    "queryParameters": [
      {
        "fileid": "{fileId}"
      },
      {
        "key": "{apiToken}"
      }
    ]
  }
}
dynamicDatasetTool.json is a dataset level explore tool:
{
  "displayName": "Dynamic Dataset Tool",
  "description": "Dazzles! Dizzying!",
  "scope": "dataset",
  "type": "explore",
  "toolUrl": "https://dynamicdatasettool.com/v2",
  "toolParameters": {
    "queryParameters": [
      {
        "PID": "{datasetPid}"
      },
      {
        "apiToken": "{apiToken}"
      }
    ]
  }
}
Term | Definition |
---|---|
external tool manifest | A JSON file that defines the URL constructed by Dataverse when users click “Explore” or “Configure” buttons. External tool makers are asked to host this JSON file on a website (no app store yet, sorry) and explain how to install and use the tool. Examples include fabulousFileTool.json and dynamicDatasetTool.json as well as the real world examples above such as Data Explorer. |
displayName | The name of the tool in the Dataverse web interface. For example, “Data Explorer”. |
description | The description of the tool, which appears in a popup (for configure tools only) so the user who clicked the tool can learn about the tool before being redirected to the tool in a new tab in their browser. HTML is supported. |
scope | Whether the external tool appears and operates at the file level or the dataset level. Note that a file level tool must also specify the type of file it operates on (see “contentType” below). |
type | Whether the external tool is an explore tool or a configure tool. Configure tools require an API token because they make changes to data files (files within datasets). Configure tools are currently not supported at the dataset level (no “Configure” button appears in the GUI for datasets). |
toolUrl | The base URL of the tool before query parameters are added. |
hasPreviewMode | A boolean that indicates whether the tool has a preview mode which can be embedded in the File Page. Since this view is designed for embedding within Dataverse, the preview mode for a tool will typically be a view without headers or other options that may be included with a tool that is designed to be launched in a new window. Sometimes, a tool will exist solely to preview files in Dataverse and the preview mode will be the same as the regular view. |
contentType | File level tools operate on a specific file type (content type or MIME type such as “application/pdf”) and this must be specified. Dataset level tools do not use contentType. |
toolParameters | Query parameters are supported and described below. |
queryParameters | Key/value combinations that can be appended to the toolUrl. For example, once substitution takes place (described below) the user may be redirected to https://fabulousfiletool.com?fileId=42&siteUrl=http://demo.dataverse.org . |
query parameter keys | An arbitrary string to associate with a value that is populated with a reserved word (described below). As the author of the tool, you have control over what “key” you would like to be passed to your tool. For example, if you want to have your tool receive and operate on the query parameter “dataverseFileId=42” instead of just “fileId=42”, that’s fine. |
query parameter values | A mechanism for substituting reserved words with dynamic content. For example, in your manifest file, you can use a reserved word (described below) such as {fileId} to pass a file’s database id to your tool in a query parameter. Your tool might receive this query parameter as “fileId=42”. |
reserved words | A set of strings surrounded by curly braces such as {fileId} or {datasetId} that will be inserted into query parameters. See the table below for a complete list. |
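To make the relationship between `toolUrl`, `queryParameters`, and reserved words concrete, here is a rough sketch of the substitution step. It is not Dataverse's actual implementation. The function `build_launch_url` and the `values` dictionary are hypothetical, and the use of `urlencode` (which percent-encodes values) is an assumption about encoding that this page does not specify.

```python
from urllib.parse import urlencode


def build_launch_url(manifest: dict, values: dict) -> str:
    """Fill reserved words in queryParameters and append them to toolUrl.

    Illustrative only: `values` maps a reserved word without its braces
    (for example "fileId") to the runtime value Dataverse would supply.
    """
    pairs = []
    for entry in manifest.get("toolParameters", {}).get("queryParameters", []):
        for key, reserved_word in entry.items():
            # "{fileId}" -> "fileId", then look up the runtime value.
            name = reserved_word.strip("{}")
            pairs.append((key, values[name]))
    query = urlencode(pairs)
    return f"{manifest['toolUrl']}?{query}" if query else manifest["toolUrl"]


# The relevant parts of fabulousFileTool.json from above.
manifest = {
    "toolUrl": "https://fabulousfiletool.com",
    "toolParameters": {
        "queryParameters": [{"fileid": "{fileId}"}, {"key": "{apiToken}"}]
    },
}
url = build_launch_url(
    manifest,
    {"fileId": "42", "apiToken": "f3465b0c-f830-4bc7-879f-06c0745a5a5c"},
)
print(url)
```

Note that the keys (`fileid`, `key`) come straight from the manifest, so the tool author decides what names arrive in the query string.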
Reserved word | Status | Description |
---|---|---|
{siteUrl} | optional | The URL of the Dataverse installation from which the tool was launched. For example, https://demo.dataverse.org . |
{fileId} | depends | The database ID of a file the user clicks “Explore” or “Configure” on. For example, 42. This reserved word is required for file level tools unless you use {filePid} instead. |
{filePid} | depends | The Persistent ID (DOI or Handle) of a file the user clicks “Explore” or “Configure” on. For example, doi:10.7910/DVN/TJCLKP/3VSTKY. Note that not all installations of Dataverse have Persistent IDs (PIDs) enabled at the file level. This reserved word is required for file level tools unless you use {fileId} instead. |
{apiToken} | optional | The Dataverse API token of the user launching the external tool, if available. Please note that API tokens should be treated with the same care as a password. For example, f3465b0c-f830-4bc7-879f-06c0745a5a5c. |
{datasetId} | depends | The database ID of the dataset. For example, 42. This reserved word is required for dataset level tools unless you use {datasetPid} instead. |
{datasetPid} | depends | The Persistent ID (DOI or Handle) of the dataset. For example, doi:10.7910/DVN/TJCLKP. This reserved word is required for dataset level tools unless you use {datasetId} instead. |
{datasetVersion} | optional | The friendly version number (or :draft) of the dataset version the file level tool is being launched from. For example, 1.0 or :draft. |
{localeCode} | optional | The code for the language (“en” for English, “fr” for French, etc.) that the user has selected from the language toggle in Dataverse. See also Internationalization. |
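Once a tool has received values for `{siteUrl}`, `{fileId}`, and optionally `{apiToken}`, it will usually call back into Dataverse to fetch the file. The sketch below only builds the request and does not send it. The endpoint path `/api/access/datafile/{id}` and the `X-Dataverse-key` header are assumptions taken from Dataverse's API Guide rather than from this page, so please verify them against the API Guide for the version you target. The function name `build_file_request` is hypothetical.

```python
from typing import Optional
from urllib.request import Request


def build_file_request(site_url: str, file_id: str,
                       api_token: Optional[str] = None) -> Request:
    """Prepare (but do not send) a request for a file's content.

    Assumes the Data Access API path and token header described in the
    Dataverse API Guide; both are assumptions in this sketch.
    """
    url = f"{site_url.rstrip('/')}/api/access/datafile/{file_id}"
    headers = {}
    if api_token:
        # Treat the token like a password: send it in a header, never log it.
        headers["X-Dataverse-key"] = api_token
    return Request(url, headers=headers)


request = build_file_request(
    "https://demo.dataverse.org", "42", "f3465b0c-f830-4bc7-879f-06c0745a5a5c"
)
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the download. The token is optional here because `{apiToken}` is only supplied when available.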
Again, you can use fabulousFileTool.json or dynamicDatasetTool.json as a starting point for your own manifest file.
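Before sharing a manifest, it can help to check it against the rules stated in the tables above. The following is an informal sketch of such a check, not an official validator. The function `check_manifest` is hypothetical, it covers only the rules spelled out on this page, and Dataverse itself may accept or reject manifests for other reasons.

```python
def check_manifest(manifest: dict) -> list:
    """Return a list of problems found in an external tool manifest."""
    problems = []
    scope = manifest.get("scope")
    tool_type = manifest.get("type")

    if scope not in ("file", "dataset"):
        problems.append("scope must be 'file' or 'dataset'")
    if tool_type not in ("explore", "configure"):
        problems.append("type must be 'explore' or 'configure'")
    if scope == "file" and "contentType" not in manifest:
        problems.append("file level tools must specify contentType")
    if scope == "dataset" and tool_type == "configure":
        problems.append("configure tools are not supported at the dataset level")

    # Collect every reserved word used as a query parameter value.
    used = set()
    for entry in manifest.get("toolParameters", {}).get("queryParameters", []):
        used.update(entry.values())

    if scope == "file" and not used & {"{fileId}", "{filePid}"}:
        problems.append("file level tools need {fileId} or {filePid}")
    if scope == "dataset" and not used & {"{datasetId}", "{datasetPid}"}:
        problems.append("dataset level tools need {datasetId} or {datasetPid}")
    if tool_type == "configure" and "{apiToken}" not in used:
        problems.append("configure tools require {apiToken}")
    return problems


fabulous = {
    "displayName": "Fabulous File Tool",
    "description": "Fabulous Fun for Files!",
    "scope": "file",
    "type": "explore",
    "toolUrl": "https://fabulousfiletool.com",
    "contentType": "text/tab-separated-values",
    "toolParameters": {
        "queryParameters": [{"fileid": "{fileId}"}, {"key": "{apiToken}"}]
    },
}
print(check_manifest(fabulous))
```

An empty list means none of the rules checked here were violated. A manifest loaded from disk with `json.load` can be passed to the same function.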
As the author of an external tool, you are not expected to learn how to install and operate Dataverse. There’s a very good chance your tool can be added to a server Dataverse developers use for testing if you reach out on any of the channels listed under Getting Help in the Developer Guide.
By all means, if you’d like to install Dataverse yourself, a number of developer-centric options are available. For example, there’s a script to spin up Dataverse on EC2 at https://github.com/IQSS/dataverse-sample-data . The process for using curl to add your external tool to a Dataverse installation is documented under Managing External Tools in the Admin Guide.
Once you’ve gotten your tool working, please make a pull request to update the list of tools above! You are also welcome to download dataverse-external-tools.tsv, add your tool to the TSV file, create an issue at https://github.com/IQSS/dataverse/issues , and then upload your TSV file there.
Unless your tool runs entirely in a browser, you may have integrated server-side software with Dataverse. If so, please double check that your software is listed in the Integrations section of the Admin Guide and if not, please open an issue or pull request to add it. Thanks!
If you’ve thought to yourself that there ought to be an app store for Dataverse external tools, you’re not alone. Please see https://github.com/IQSS/dataverse/issues/5688 :)
https://demo.dataverse.org is the place to play around with Dataverse and your tool can be included. Please email support@dataverse.org to start the conversation about adding your tool. Additionally, you are welcome to open an issue at https://github.com/IQSS/dataverse-ansible which already includes a number of the tools listed above.
You are welcome to announce your external tool at https://groups.google.com/forum/#!forum/dataverse-community
If you’re too shy, we’ll do it for you. We’ll probably tweet about it too. Thank you for your contribution to Dataverse!