Developer Guide
Contents:
Introduction
Intended Audience
Getting Help
Core Technologies
Roadmap
Kanban Board
Issue Tracker
Related Guides
Related Projects
Development Environment
Quick Start
Set Up Dependencies
Supported Operating Systems
Install Java
Install Netbeans or Maven
Install Homebrew (Mac Only)
Clone the Dataverse Software Git Repo
Build the Dataverse Software War File
Install jq
Install Payara
Install PostgreSQL
Install Solr
Run the Dataverse Software Installer Script
Verify the Dataverse Software is Running
Configure Your Development Environment for Publishing
Next Steps
Windows Development
Running the Dataverse Software in Vagrant
Install Vagrant
Install VirtualBox
Reboot
Install Git
Configure Git to use Unix Line Endings
Clone Git Repo
vagrant up
Improving Windows Support
Windows Subsystem for Linux
Discussion and Feedback
Tips
Iterating on Code and Redeploying
Undeploy the war File from the Dataverse Software Installation Script
Add Payara as a Server in Netbeans
Ensure that the Dataverse Software Will Be Deployed to Payara
Make a Small Change to the Code
Confirm the Change Was Deployed
Netbeans Connector Chrome Extension
Database Schema Exploration
pgAdmin
SchemaSpy
Deploying With asadmin
Running the Dataverse Software Installation Script in Non-Interactive Mode
Preventing Payara from Phoning Home
Solr
Git
Set Up SSH Keys
Git on Mac
Automation of Custom Build Number on Webpage
Sample Data
Switching from Glassfish to Payara
Troubleshooting
context-root in glassfish-web.xml Munged by Netbeans
Configuring / Troubleshooting Mail Host
Rebuilding Your Dev Environment
DataCite
Version Control
Where to Find the Dataverse Software Code
Branching Strategy
Goals
Branches
The “master” Branch
The “develop” Branch
Feature Branches
How to Make a Pull Request
Find or Create a GitHub Issue
Create a New Branch off the develop Branch
Commit Your Change to Your New Branch
Push Your Branch to GitHub
Make a Pull Request
Make Sure Your Pull Request Has Been Advanced to Code Review
Summary of Git commands
How to Resolve Conflicts in Your Pull Request
Adding Commits to a Pull Request from a Fork
SQL Upgrade Scripts
Location of SQL Upgrade Scripts
How to Determine if You Need to Create a SQL Upgrade Script
How to Create a SQL Upgrade Script
Troubleshooting
Renaming SQL Upgrade Scripts
Testing
The Health of a Codebase
Testing in Depth
Unit Tests
Unit Test Automation Overview
Writing Unit Tests with JUnit
Refactoring Code to Make It Unit-Testable
Parameterized Tests and JUnit Theories
Observing Changes to Code Coverage
Testing Commands
Running Non-Essential (Excluded) Unit Tests
Integration Tests
Running the Full API Test Suite Using EC2
Running the full API test suite using Docker
Getting Set Up to Run REST Assured Tests
The Burrito Key
Root Dataverse Collection Permissions
Publish Root Dataverse Collection
dataverse.siteUrl
Identifier Generation
Writing API Tests with REST Assured
Writing and Using a Testcontainers Test
Measuring Coverage of Integration Tests
Add jacocoagent.jar to Payara
Add jacococli.jar to the WAR File
Deploy the Instrumented WAR File
Run Integration Tests
Create Code Coverage Report
Read Code Coverage Report
Load/Performance Testing
Locust
download-files.sh script
Continuous Integration
Enhance build time by caching dependencies
The Phoenix Server
How the Phoenix Tests Work
How to Run the Phoenix Tests
List of Tests Run Against the Phoenix Server
Accessibility Testing
Accessibility Policy
Accessibility Tools
Future Work
Future Work on Unit Tests
Future Work on Integration Tests
Browser-Based Testing
Installation Testing
Future Work on Load/Performance Testing
Future Work on Accessibility Testing
Writing Documentation
Quick Fix
Other Changes (Sphinx)
Installing Sphinx
Using Sphinx
Table of Contents
Images
GraphViz based images
Versions
Dependency Management
Terms
Direct dependencies
Transitive dependencies
Managing transitive dependencies in pom.xml
Helpful tools
Repositories
Debugging
Logging
Coding Style
Java
Formatting Code
Tabs vs. Spaces
Braces Placement
Format Code You Changed with Netbeans
Checking Your Formatting With Checkstyle
Logging
Avoid Hard-Coding Strings (Use Constants)
Avoid Hard-Coding User-Facing Messaging in English
Type Safety
Bash
Formatting Code
Tabs vs. Spaces
Bike Shedding
Consuming Configuration
Simple Configuration Options
Complex Configuration Options
Why should I care about MicroProfile Config API?
Adopting MicroProfile Config API
Moving or Replacing a JVM Setting
Aliasing Database Setting
Deployment
Deploying the Dataverse Software to Amazon Web Services (AWS)
Install AWS CLI
Troubleshooting “aws: command not found”
Configure AWS CLI
Configure Ansible File (Optional)
Download and Run the “Create Instance” Script
Caveat Recipiens
Migrating Datafiles from Local Storage to S3
To Update Dataset Location to S3, Assuming a file:// Prefix
To Update Datafile Location to your-s3-bucket, Assuming a file:// Prefix
To Update Datafile Location to your-s3-bucket, Assuming no file:// Prefix
Docker, Kubernetes, and Containers
Making Releases
Create the release GitHub issue and branch
1. Bump Version Numbers
2. Check in the Changes Above...
Merge “develop” into “master”
Write Release Notes
Make Artifacts Available for Download
Publish Release
Tools
Netbeans Connector Chrome Extension
pgAdmin
Maven
Vagrant
PlantUML
Eclipse Memory Analyzer Tool (MAT)
PageKite
MSV
FontCustom
SonarQube
Infer
lsof
jmap and jstat
Universal Numerical Fingerprint (UNF)
UNF Version 3
UNF Version 5
UNF Version 6
I. UNF of a Data Vector
II. Combining multiple UNFs to create UNFs of higher-level objects.
Footnotes:
Make Data Count
Architecture
Dev Environment Setup for Make Data Count
Generate Fake Metrics Only
Full Setup
Testing Make Data Count and Your Dataverse Installation
Resources
Shibboleth and OAuth
Shibboleth and OAuth
Geospatial Data
How The Dataverse Software Ingests Shapefiles
Ingest
Example
SELinux
Introduction
Development Environment
Recreating the shibboleth.te File
Ensure that SELinux is Enforcing
Removing the Existing shibboleth.te Rules
Exercising SELinux denials
Stub out the new shibboleth.te file
Iteratively Use audit2allow to Add Rules and Test Your Change
Big Data Support
S3 Direct Upload and Download
Data Capture Module (DCM)
Install a DCM
Downloading rsync scripts via Your Dataverse Installation’s API
How a DCM reports checksum success or failure to your Dataverse Installation
Steps to set up a DCM mock for Development
Troubleshooting
Steps to set up a DCM via Docker for Development
Docker Image Set-up
Optional steps for setting up the S3 Docker DCM Variant
Using the DCM Docker Containers
Additional DCM docker development tips
Repository Storage Abstraction Layer (RSAL)
Steps to set up a DCM via Docker for Development
Using the RSAL Docker Containers
Configuring the RSAL Mock
Configuring download via rsync
Auxiliary File Support
Adding an Auxiliary File to a Datafile
Downloading an Auxiliary File that belongs to a Datafile
Direct DataFile Upload API
Requesting Direct Upload of a DataFile
Adding the Uploaded file to the Dataset
Workflows
Introduction
Administration
Available Steps
log
pause
pause/message
http/sr
http/authext
archiver