Dataverse is (currently) a Java EE 7 based application that uses many additional libraries for special purposes. These provide features like support for the SWORD API, S3 storage and many others.
Besides writing the code that glues the individual pieces together, developers need to declare the dependencies they use for the Maven-based build system. As is familiar to any Maven user, this happens inside the “Project Object Model” (POM) living in pom.xml at the root of the project repository. Recursive and convergent dependency resolution makes dependency management with Maven very easy. But sometimes, in projects with many complex dependencies like Dataverse, you have to help Maven make the right choices.
As a developer, you should familiarize yourself with the following terms:
Within the POM, any direct dependencies reside within the <dependencies> tag:
<dependencies>
    <dependency>
        <groupId>org.example</groupId>
        <artifactId>example</artifactId>
        <version>1.1.0</version>
        <scope>compile</scope>
    </dependency>
</dependencies>
Anytime you add a <dependency>, Maven will try to fetch it from defined/configured repositories and use it within the build lifecycle. You have to define a <version>, but <scope> is optional because it defaults to compile. (See Maven docs: Dependency Scope)
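As an illustration of non-default scopes, the following sketch declares one dependency that the application server supplies at runtime and one that is needed only for tests. The coordinates are real artifacts on Maven Central, but their use here is an assumption for illustration and not necessarily what the Dataverse POM contains.

```xml
<dependencies>
    <!-- Assumed example: the Java EE 7 API is supplied by the application server,
         so it is needed for compilation only and is not packaged with the application. -->
    <dependency>
        <groupId>javax</groupId>
        <artifactId>javaee-api</artifactId>
        <version>7.0</version>
        <scope>provided</scope>
    </dependency>
    <!-- Assumed example: only on the classpath when compiling and running tests. -->
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.12</version>
        <scope>test</scope>
    </dependency>
</dependencies>
```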
During fetching, Maven will analyse all transitive dependencies (see graph above) and, if necessary, fetch those, too. Everything downloaded once is cached locally by default, so nothing needs to be fetched again and again, as long as the dependency definition does not change.
Rules to follow:
Maven is convenient for developers; it handles recursive resolution, downloading, and adding “dependencies of dependencies”. However, as life is a box of chocolates, you might find yourself in version conflict hell sooner rather than later, noticing it only through unintended side effects.
When you look at the graph above, imagine B and TB rely on different versions of TC. How does Maven decide which version it will include? Easy: the version declared nearest to your project in the dependency tree wins.
In this case, version “2.0” will be included. If you know something about semantic versioning, an alarm should go off in your mind right now. How do we know that B is compatible with Z v2.0 when it depends on Z v1.0?
Another scenario that gets us into trouble: indirect use of transitive dependencies. Imagine the following: we rely on Z in our code, but do not include a direct dependency for it within the POM. Now B is updated and drops its dependency on Z, so Z silently disappears from our classpath. You definitely don’t want to head down that road.
Follow the rules to be safe:
Maven can manage versions of transitive dependencies in four ways:
1. Declare the transitive dependency as a direct dependency with its own <version> tag. Typically a bad idea, don’t do that.
2. Use <optional> or <exclusion> tags on direct dependencies that request the transitive dependency. Last resort, you really should avoid this. Not explained or used here. See Maven docs.
3. Explicitly declare the transitive dependency in <dependencyManagement> and add a <version> tag.
4. Import a “bill of materials” (BOM) in <dependencyManagement> and add a <version> tag. Many bigger and standard use projects provide those, making the POM much less bloated compared to adding every bit yourself.

A reduced example, showing only the bits relevant to the above cases and the direct usage of an otherwise transitive dependency:
<properties>
    <aws.version>1.11.172</aws.version>
    <!-- We need to ensure that our chosen version is compatible with every dependency relying on it.
         This is manual work and needs testing, but a good investment in stability and up-to-date dependencies. -->
    <jackson.version>2.9.6</jackson.version>
    <joda.version>2.10.1</joda.version>
</properties>

<!-- Transitive dependencies, bigger library "bill of materials" (BOM) and
     versions of dependencies used both directly and transitively are managed here. -->
<dependencyManagement>
    <dependencies>
        <!-- First example for case 4. Only one part of the SDK (S3) is used and transitive deps
             of that are again managed by the upstream BOM. -->
        <dependency>
            <groupId>com.amazonaws</groupId>
            <artifactId>aws-java-sdk-bom</artifactId>
            <version>${aws.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
        <!-- Second example for case 4 and an example for explicit direct usage of a transitive dependency.
             Jackson is used by AWS SDK and others, but we also use it in Dataverse. -->
        <dependency>
            <groupId>com.fasterxml.jackson</groupId>
            <artifactId>jackson-bom</artifactId>
            <version>${jackson.version}</version>
            <scope>import</scope>
            <type>pom</type>
        </dependency>
        <!-- Example for case 3. Joda is not used in Dataverse (as of writing this). -->
        <dependency>
            <groupId>joda-time</groupId>
            <artifactId>joda-time</artifactId>
            <version>${joda.version}</version>
        </dependency>
    </dependencies>
</dependencyManagement>

<!-- Declare any DIRECT dependencies here.
     In case the dependency is both transitive and direct (e.g. some common lib for logging),
     manage the version above and add the direct dependency here WITHOUT version tag, too.
-->
<dependencies>
    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk-s3</artifactId>
        <!-- no version here as managed by BOM above! -->
    </dependency>
    <!-- Should be refactored and removed once on Java EE 8 -->
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-core</artifactId>
        <!-- no version here as managed above! -->
    </dependency>
    <!-- Should be refactored and removed once on Java EE 8 -->
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
        <!-- no version here as managed above! -->
    </dependency>
</dependencies>
Maven provides some plugins that are of great help to detect possible conflicts and implicit usage.
For implicit usage detection, use mvn dependency:analyze. Examine the output with great care. Sometimes you will see implicit usages that do no harm, especially if you are using bigger SDKs having some kind of core package. This will also report on any direct dependency which is not in use and can be removed from the POM. Again, do this with great caution and double check.
If you want to see the dependencies both direct and transitive in a dependency tree format, use mvn dependency:tree.
This will not, however, help you detect possible version conflicts. For this you need to use the Enforcer Plugin with its built-in dependency convergence rule.
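A minimal sketch of how the Enforcer Plugin's dependency convergence rule could be wired into the <build> section of a POM follows. The plugin version and the execution id are assumptions chosen for illustration; check which version fits your Maven installation.

```xml
<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-enforcer-plugin</artifactId>
            <!-- Assumed version for illustration only. -->
            <version>3.4.1</version>
            <executions>
                <execution>
                    <!-- Hypothetical id; any unique name works. -->
                    <id>enforce-dependency-convergence</id>
                    <goals>
                        <goal>enforce</goal>
                    </goals>
                    <configuration>
                        <rules>
                            <!-- Fails the build when the same artifact is requested
                                 in different versions within the dependency tree. -->
                            <dependencyConvergence/>
                        </rules>
                    </configuration>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
```

With a configuration like this in place, the rule is checked as part of a regular build (for example mvn verify), and each reported conflict can then be resolved by managing the affected version in <dependencyManagement>.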
Maven receives all dependencies from repositories. Those can be public like Maven Central and others, but you can also use a private repository on premises or in the cloud. Last but not least, you can use local repositories, which can live next to your application code (see the local_lib dir within the Dataverse codebase).
Repositories are defined within the Dataverse POM like this:
<repositories>
    <repository>
        <id>central-repo</id>
        <name>Central Repository</name>
        <url>http://repo1.maven.org/maven2</url>
        <layout>default</layout>
    </repository>
    <repository>
        <id>prime-repo</id>
        <name>PrimeFaces Maven Repository</name>
        <url>http://repository.primefaces.org</url>
        <layout>default</layout>
    </repository>
    <repository>
        <id>dvn.private</id>
        <name>Local repository for hosting jars not available from network repositories.</name>
        <url>file://${project.basedir}/local_lib</url>
    </repository>
</repositories>
You can also add repositories to your local Maven settings, see docs.
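As a rough sketch, a repository added through the user-level settings file (by default ~/.m2/settings.xml) might look like the following. The profile id, repository id and URL are hypothetical placeholders, not values used by Dataverse.

```xml
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0">
    <profiles>
        <profile>
            <!-- Hypothetical profile name. -->
            <id>extra-repos</id>
            <repositories>
                <repository>
                    <!-- Hypothetical private repository. -->
                    <id>example-private</id>
                    <url>https://repo.example.org/maven2</url>
                </repository>
            </repositories>
        </profile>
    </profiles>
    <!-- Repositories in a profile are only used while the profile is active. -->
    <activeProfiles>
        <activeProfile>extra-repos</activeProfile>
    </activeProfiles>
</settings>
```

Keep in mind that repositories defined this way exist only on your machine, so a build that depends on them will not be reproducible for other developers unless they add the same settings.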
Typically you will skip the addition of the central repository, but adding it to the POM has the benefit that dependencies are first looked up there (which in theory can speed up downloads). You should keep in mind that repositories are used in the order they appear.
Footnotes
[1] Modern IDEs import your Maven POM and offer import autocompletion for classes based on direct dependencies in the model. You might end up using legacy or repackaged classes because of a wrong scope.
[2] This is going to bite back in modern IDEs when importing classes from transitive dependencies by “autocompletion accident”.