Developer Guide
Contents:
Introduction
Intended Audience
Getting Help
Core Technologies
Roadmap
Kanban Board
Issue Tracker
Related Guides
Related Projects
Development Environment
Quick Start
Set Up Dependencies
Supported Operating Systems
Install Java
Install Netbeans or Maven
Install Homebrew (Mac Only)
Clone the Dataverse Git Repo
Build the Dataverse War File
Install jq
Install Glassfish
Test Glassfish Startup Time on Mac
Install PostgreSQL
Install Solr
Run the Dataverse Installer Script
Verify Dataverse is Running
Configure Your Development Environment for Publishing
Next Steps
Windows Development
Running Dataverse in Vagrant
Install Vagrant
Install VirtualBox
Reboot
Install Git
Configure Git to use Unix Line Endings
Clone Git Repo
vagrant up
Running Dataverse in Minishift
Install VirtualBox
Install Git
Install Minishift
Clone Git Repo
Start Minishift VM and Run Dataverse
Improving Windows Support
Windows Subsystem for Linux
Discussion and Feedback
Tips
Iterating on Code and Redeploying
Undeploy the war File from the install Script
Add Glassfish 4.1 as a Server in Netbeans
Ensure that Dataverse Will Be Deployed to Glassfish 4.1
Make a Small Change to the Code
Confirm the Change Was Deployed
Netbeans Connector Chrome Extension
Database Schema Exploration
pgAdmin
SchemaSpy
Deploying With asadmin
Running the Dataverse install Script in Non-Interactive Mode
Preventing Glassfish from Phoning Home
Solr
Git
Set Up SSH Keys
Git on Mac
Troubleshooting
context-root in glassfish-web.xml Munged by Netbeans
Configuring / Troubleshooting Mail Host
Rebuilding Your Dev Environment
DataCite
Version Control
Where to Find the Dataverse Code
Branching Strategy
Goals
Branches
The “master” Branch
The “develop” Branch
Feature Branches
How to Make a Pull Request
Find or Create a GitHub Issue
Create a New Branch off the develop Branch
Commit Your Change to Your New Branch
Push Your Branch to GitHub
Make a Pull Request
Make Sure Your Pull Request Has Been Advanced to Code Review
How to Resolve Conflicts in Your Pull Request
Adding Commits to a Pull Request from a Fork
SQL Upgrade Scripts
Location of SQL Upgrade Scripts
How to Determine if You Need to Create a SQL Upgrade Script
How to Create a SQL Upgrade Script
Troubleshooting
Renaming SQL Upgrade Scripts
Testing
The Health of a Codebase
Testing in Depth
Unit Tests
Unit Test Automation Overview
Writing Unit Tests with JUnit
Refactoring Code to Make It Unit-Testable
Parameterized Tests and JUnit Theories
Observing Changes to Code Coverage
Testing Commands
Running Non-Essential (Excluded) Unit Tests
Integration Tests
Running the full API test suite using Docker
Getting Set Up to Run REST Assured Tests
The Burrito Key
Root Dataverse Permissions
Publish Root Dataverse
dataverse.siteUrl
Identifier Generation
Writing Integration Tests with REST Assured
Measuring Coverage of Integration Tests
Load/Performance Testing
Locust
download-files.sh script
The Phoenix Server
How the Phoenix Tests Work
How to Run the Phoenix Tests
List of Tests Run Against the Phoenix Server
Future Work
Future Work on Unit Tests
Future Work on Integration Tests
Browser-Based Testing
Installation Testing
Future Work on Load/Performance Testing
Documentation
Quick Fix
Other Changes (Sphinx)
Installing Sphinx
Using Sphinx
Table of Contents
Images
GraphViz based images
Versions
Dependency Management
Terms
Direct dependencies
Transitive dependencies
Managing transitive dependencies in pom.xml
Helpful tools
Repositories
Debugging
Logging
Coding Style
Java
Formatting Code
Tabs vs. Spaces
Braces Placement
Format Code You Changed with Netbeans
Checking Your Formatting With Checkstyle
Logging
Avoid Hard-Coding Strings
Type Safety
Bash
Formatting Code
Tabs vs. Spaces
Bike Shedding
Deployment
Deploying Dataverse to Amazon Web Services (AWS)
Install AWS CLI
Troubleshooting “aws: command not found”
Configure AWS CLI
Configure Ansible File (Optional)
Download and Run the “Create Instance” Script
Caveats
Docker, Kubernetes, and OpenShift
OpenShift
Install Minishift
Start Minishift
Make the OpenShift Client Binary (oc) Executable
Log in to Minishift from the Command Line
Create a Minishift Project
Create a Dataverse App within the Minishift Project
Log into Minishift and Visit Dataverse in your Browser
Troubleshooting
Check Status of Dataverse Deployment to Minishift
Review Logs of Dataverse Deployment to Minishift
Get a Shell (ssh/rsh) on Containers Deployed to Minishift
Cleaning up
Making Changes
Making Changes to Docker Images
Making Changes to the OpenShift Config
Making Changes to the PostgreSQL Database from the Glassfish Pod
Scaling Dataverse by Increasing Replicas in a StatefulSet
Configuring Persistent Volumes and Solr master node recovery
Running Containers to Run as Root in Minishift
Minishift Resources
Docker
Installing Docker
All In One Docker Images for Testing
Future production use on Minishift/OpenShift/Kubernetes
Known Issues with Dataverse Images on Docker Hub
Making Releases
Create the release GitHub issue and branch
1. Bump Version Numbers
2. Save the EJB Database Create Script
3. Check in the Changes Above...
Merge “develop” into “master”
Write Release Notes
Make Artifacts Available for Download
Publish Release
Tools
Netbeans Connector Chrome Extension
pgAdmin
Maven
Vagrant
PlantUML
Eclipse Memory Analyzer Tool (MAT)
PageKite
MSV
FontCustom
SonarQube
Infer
lsof
Universal Numerical Fingerprint (UNF)
UNF Version 3
UNF Version 5
UNF Version 6
I. UNF of a Data Vector
II. Combining multiple UNFs to create UNFs of higher-level objects.
Footnotes:
Make Data Count
Architecture
Dev Environment Setup for Make Data Count
Testing Make Data Count and Dataverse
Resources
Shibboleth and OAuth
Shibboleth and OAuth
Geospatial Data
Geoconnect
How Dataverse Ingests Shapefiles
Ingest
Example
WorldMap JoinTargets + API Endpoint
How Geoconnect Uses Join Target Information
Retrieving Join Target Information from WorldMap API
Saving Join Target Information to Geoconnect Database
Setting Up WorldMap Test Data
SELinux
Introduction
Development Environment
Recreating the shibboleth.te File
Ensure that SELinux is Enforcing
Removing the Existing shibboleth.te Rules
Exercising SELinux denials
Stub out the new shibboleth.te file
Iteratively Use audit2allow to Add Rules and Test Your Change
Big Data Support
Data Capture Module (DCM)
Install a DCM
Downloading rsync scripts via Dataverse API
How a DCM reports checksum success or failure to Dataverse
Steps to set up a DCM mock for Development
Troubleshooting
Steps to set up a DCM via Docker for Development
Docker Image Set-up
Optional steps for setting up the S3 Docker DCM Variant
Using the DCM Docker Containers
Additional DCM docker development tips
Repository Storage Abstraction Layer (RSAL)
Steps to set up a DCM via Docker for Development
Using the RSAL Docker Containers
Configuring the RSAL Mock
Configuring download via rsync
Workflows
Introduction
Administration
Available Steps
log
pause
http/sr
archiver