Software Architecture Exploration and Validation with jqAssistant, Neo4j and Cypher

December 31st, 2017

I have written about other tools for analyzing and validating software systems before, but today I would like to introduce jqAssistant, a tool that supports software architects, developers and analysts in a variety of tasks such as analyzing existing structures, validating architectural or quality constraints and generating reports.

To do so, jqAssistant scans given projects or artifacts and stores the gathered information, enriched by a variety of existing plugins, in a Neo4j graph database.

This graph database may now be used to enforce architectural constraints or specific code metrics, to generate reports or to analyze a system with a nice browser interface.

In this tutorial I’m going to show how to integrate jqAssistant into an existing project using Maven as the build tool, how to explore an existing system step by step and finally how to enforce specific metrics by writing them down as a kind of living documentation in an AsciiDoc document.

I have also started to collect basic queries in Cypher (the query language used by Neo4j), some for analyzing a system and others for gathering basic metrics.

Structure analysis with embedded Neo4j

Integration with Maven

Using Maven we just need to add one plugin to our project’s pom.xml:

<plugin>
  <groupId>com.buschmais.jqassistant</groupId>
  <artifactId>jqassistant-maven-plugin</artifactId>
  <version>1.3.0</version>
  <executions>
    <execution>
      <goals>
        <goal>scan</goal>
        <goal>analyze</goal>
      </goals>
      <configuration>
        <warnOnSeverity>MINOR</warnOnSeverity>
        <failOnSeverity>MAJOR</failOnSeverity>
      </configuration>
    </execution>
  </executions>
</plugin>

It is also possible to use Gradle or run an executable Jar file instead but I will be using Maven for the following examples.

Basic Operations

With the Maven plugin we’re able to run the following operations for our project:

Scanning Structures

Scanning collects the different structures of our project (classes, relations, dependencies, artifacts etc.). The result is enhanced by plugins that add further information to the structure database, e.g. by adding labels.

Common plugins are:

  • CDI
  • Common / Common Test
  • Core Analysis
  • Java EE 6 and EJB3
  • GraphML
  • JAX-RS (REST)
  • JPA 2 (ORM) and RDBMS
  • JSON, XML and YAML
  • JUnit and TestNG
  • Java and Java 8
  • Maven 3
  • OSGi and Tycho

To scan our project we simply need to run the following command (increase memory for bigger projects!):

mvn jqassistant:scan

This creates the complete database, including its indexes, in the directory target/jqassistant/store.

The available memory may be increased with the following environment variables:

export JQASSISTANT_OPTS=-Xmx2048M
export MAVEN_OPTS=-Xmx2048M

Checking Rules

We may enforce architectural constraints by defining rules with Cypher queries embedded in XML or AsciiDoc.

As jqAssistant is capable of reading rules from AsciiDoc, we may use it as a tool for living documentation. A nice example of this is the “jqAssistant Spring Pet Clinic Demo Output”.

We’re now writing a sample constraint that forbids classes in the package com.hascode.control to access classes in the package com.hascode.boundary (assuming some layer contract).

In the Neo4j browser, the query for our sample project looks like this one:

Defining architectural constraints as Cypher query

We’re writing this constraint in AsciiDoc in a file named myrules.adoc in a directory named jqassistant in our project directory.

Here is what we’re doing in this file:

  • we’re creating a new group named default
  • this group includes our constraint named myrules:LayerAccessConstraint and specifies a severity of blocker
  • we’re creating a constraint named myrules:LayerAccessConstraint, also with a severity of blocker
  • this constraint is defined by the Cypher query that follows; if the query returns any result, the constraint is violated

[[default]]
.A collection of architectural constraints.
[role=group,severity=blocker,includesConstraints="myrules:LayerAccessConstraint"]
== Architectural Constraints
 
[[myrules:LayerAccessConstraint]]
.Following the layer contract, the control layer may not access the boundary layer.
[source,cypher,role=constraint,severity=blocker]
----
MATCH (control:Class)-[r*]->(boundary:Class)
WHERE control.fqn STARTS WITH "com.hascode.control"
AND boundary.fqn STARTS WITH "com.hascode.boundary"
RETURN boundary
----

The advantage of using AsciiDoc is that we’re creating something readable – this is what my AsciiDoc editor in IntelliJ looks like:

IntelliJ AsciiDoc Editor

A detailed explanation of the rules syntax can be found in the jqAssistant manual.

This is the directory structure of our sample project where the constraint above is violated because ControlClass from the control layer accesses BoundaryClass from the boundary layer.

.
├── jqassistant
│   └── myrules.adoc
├── pom.xml
└── src
    ├── main
    │   ├── java
    │   │   └── com
    │   │       └── hascode
    │   │           ├── boundary
    │   │           │   └── BoundaryClass.java
    │   │           └── control
    │   │               └── ControlClass.java
    │   └── resources
    └── test
        └── java

The following goals scan the project and trigger the validation of these rules:

$ mvn jqassistant:scan jqassistant:analyze
[..]
[INFO] Executing analysis for 'jqassistant-tutorial'.
[INFO] Reading rules from directory /data/project/jqassistant-tutorial/jqassistant
[INFO] Executing group 'default'
[INFO] Validating constraint 'myrules:LayerAccessConstraint' with severity: 'BLOCKER'.
[ERROR] --[ Constraint Violation ]-----------------------------------------
[ERROR] Constraint: myrules:LayerAccessConstraint
[ERROR] Severity: BLOCKER
[ERROR] Following the layer contract, the control layer may not access the boundary layer.
[ERROR]   boundary=com.hascode.boundary.BoundaryClass
[ERROR]   boundary=com.hascode.boundary.BoundaryClass
[ERROR]   boundary=com.hascode.boundary.BoundaryClass
[ERROR] -------------------------------------------------------------------
[ERROR] 
 
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 8.808 s
[INFO] Finished at: 2017-12-29T16:22:30+01:00
[INFO] Final Memory: 168M/1121M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal com.buschmais.jqassistant:jqassistant-maven-plugin:1.3.0:analyze (default-cli) on project jqassistant-tutorial: Violations detected: 0 concepts, 1 constraints -> [Help 1]

As expected, our build fails and reports an error because our constraint was violated.
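
The pattern [r*] in the constraint follows relations of any type and any length, so the same class can be reached via several paths. This is probably why BoundaryClass is reported three times in the output above. A narrower variant is sketched below. It assumes that the DEPENDS_ON relation between types is sufficient to express the layer contract; the aliases controlType and boundaryType are chosen for illustration only.

```cypher
// Sketch: report each offending pair of types exactly once.
// Assumption: direct DEPENDS_ON relations are enough to detect layer violations.
MATCH (control:Type)-[:DEPENDS_ON]->(boundary:Type)
WHERE control.fqn STARTS WITH "com.hascode.control."
  AND boundary.fqn STARTS WITH "com.hascode.boundary."
RETURN DISTINCT control.fqn AS controlType, boundary.fqn AS boundaryType
ORDER BY controlType, boundaryType
```

Returning the source type as well makes the violation report easier to act on, because it names the class that needs to change.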

The following Bitbucket Pipeline Build demonstrates this result, too.

Showing existing and effective Rules

The following command lists all known rules in the current environment:

$ mvn jqassistant:available-rules
[..]
[INFO] Available rules for 'jqassistant-tutorial'.
[INFO] Reading rules from directory /data/project/jqassistant-tutorial/jqassistant
[INFO] Groups [4]
[INFO]   "ejb3:EJB"
[INFO]   "junit4:Default"
[INFO]   "myrules:ArchitectureConstraintGroup"
[INFO]   "testng:TestNG"
[INFO] Constraints [12]
[INFO]   "cdi:BeansMustNotUseFieldInjection" - CDI beans shall not use field injection (constructor and setter injections are fine.).
[INFO]   "cdi:BeansMustUseConstructorInjection" - All CDI beans must use constructor injection.
[INFO]   "dependency:ArtifactCycles" - There must be no cyclic artifact dependencies.
[INFO]   "dependency:PackageCycles" - There must be no cyclic package dependencies.
[INFO]   "jpa2:ValidationModeMustBeExplicitlySpecified" - The validation mode of all persistence units must be explicitly specified and either set to CALLBACK or NONE.
[INFO]   "junit4:AssertionMustProvideMessage" - All assertions must provide a message.
[INFO]   "junit4:IgnoreWithoutMessage" - All @Ignore annotations must provide a message.
[INFO]   "junit4:TestMethodWithoutAssertion" - All test methods must perform assertions (within a call hierarchy of max. 3 steps).
[INFO]   "maven3:HierarchicalParentModuleRelation" - If a parent Maven project declares a module then the parent project must also be declared as the
            parent of the module (i.e. to keep the project hierarchy consistent).
 
[INFO]   "myrules:LayerAccessConstraint" - Following the layer constract, the control layer may not access the boundary layer.
[INFO]   "osgi-bundle:InternalTypeMustNotBePublic" - Internal types must not be public if no depending types exist in other packages of the bundle.
[..]

To see which rules are applied to the current project, we may use the following command:

mvn jqassistant:effective-rules

This command should list our custom groups and constraints, too!

Running Neo4j Server

To start an embedded instance of Neo4j we simply need to run the following command:

mvn jqassistant:server

Afterwards we may access and explore our graph database by pointing our browser to this location: http://localhost:7474/

I have added an example for exploring existing structures with the Neo4j browser at the end of this tutorial (Appendix A); please feel free to skip ahead to that section.

Cypher Queries for Structure Analysis

The following queries are helpful for exploring our architecture and code structures. Examples for exploring a concrete system step by step can be found in Appendix A at the end of this tutorial.

Show existing labels

The following query shows existing labels. When exploring our project sources, nodes labeled with Class or Type are often a good entry point for analyzing relations or aggregating code metrics.

MATCH (n)
WITH DISTINCT labels(n) AS labels
UNWIND labels AS label
RETURN DISTINCT label
ORDER BY label

Selecting existing labels

Existing Labels of a sample project:

  • Annotation
  • Array
  • Artifact
  • Attribute
  • Class
  • Configuration
  • Constructor
  • Container
  • Contributor
  • Developer
  • Directory
  • Document
  • Element
  • Enum
  • ExecutionGoal
  • Field
  • File
  • Interface
  • Java
  • Json
  • Key
  • License
  • Manifest
  • ManifestEntry
  • ManifestSection
  • Maven
  • Member
  • Method
  • Namespace
  • Object
  • Organization
  • Package
  • Parameter
  • Participant
  • Plugin
  • PluginExecution
  • Pom
  • Primitive
  • Profile
  • Project
  • Properties
  • Property
  • Repository
  • Role
  • Scalar
  • ServiceLoader
  • Text
  • Type
  • Value
  • Xml
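
Counting the nodes per label gives a quick impression of what the scan produced. The following sketch is a variation of the label query above; the aliases label and nodes are chosen for illustration.

```cypher
// Each node contributes one row per label it carries,
// so a node labeled Class, File, Java and Type is counted under all four labels.
MATCH (n)
UNWIND labels(n) AS label
RETURN label, count(*) AS nodes
ORDER BY nodes DESC, label
```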

Display Relation Types for Classes

This query allows us to explore existing relation types for labeled nodes. In this example we’re fetching the types of all outgoing relations of nodes labeled as Class.

MATCH (c:Class)-[r]->()
RETURN DISTINCT TYPE(r)

Select existing relation types

Existing relations for classes in a sample project:

  • EXTENDS
  • ANNOTATED_BY
  • DECLARES
  • DEPENDS_ON
  • IMPLEMENTS
  • IS

Unique combination of relationship and labels

This query shows each unique combination of node labels and relationship type.

MATCH (n)-[r]-()
RETURN DISTINCT LABELS(n), TYPE(r)

Show relationships to labels

Display Indexes and Constraints

The following browser command shows existing indexes and constraints. This is mostly relevant when we need to rewrite a query for performance.

:schema

Show indexes and constraints

Indexes in a sample project:

Indexes
ON :Artifact(fqn)            ONLINE
ON :Class(fqn)               ONLINE
ON :ColumnType(databaseType) ONLINE
ON :Concept(id)              ONLINE
ON :File(fileName)           ONLINE
ON :Package(fqn)             ONLINE
ON :Pom(fqn)                 ONLINE
ON :Project(fqn)             ONLINE
ON :Repository(url)          ONLINE
ON :Type(fqn)                ONLINE
ON :Value(value)             ONLINE
No constraints

Display Properties for Labeled Nodes

This query explores properties for nodes of a given label – in this example for nodes labeled with Type.

MATCH (t:Type)
WITH DISTINCT keys(t) AS propertyKeys
UNWIND propertyKeys AS propertyKey
RETURN DISTINCT propertyKey
ORDER BY propertyKey ASC

Displaying node properties

Cypher Queries for Quality Metrics

Having explored our code structures and components we may now analyze weak points, code smells and possible hot-spots for refactoring.

There are plenty of possible metrics. Some well-known basic ones are:

  • Abstractness
  • Afferent Coupling
  • Comment Lines of Code
  • Cyclic Dependencies
  • Cyclomatic Complexity
  • Density of Comments
  • Depth in Tree
  • Direct Cyclic Dependencies
  • Distance from the Main Sequence
  • Efferent Coupling
  • Encapsulation Principle
  • Executable Statements
  • Instability
  • Limited Size Principle
  • Modularization Quality
  • Non-Comment Lines of Code
  • Number of Abstract Types
  • Number of Children in Tree
  • Number of Concrete Types
  • Number of Exported Types
  • Number of Parameters
  • Number of Types
  • Response for Class
  • Total Lines of Code
  • Weighted Methods per Class
  • Number of Fields
  • Number of Attributes

The following single queries may be combined to aggregate new code or quality metrics.

Class with highest number of methods within a specific package

This query allows us to explore a given package and fetch classes with a high number of methods (possible code smell).

MATCH (class:Class)-[:DECLARES]->(method:Method)
WHERE class.fqn =~ "com.hascode.*"
RETURN
class.fqn, COUNT(method) AS Methods
ORDER BY
Methods DESC

Display classes with highest amount of methods
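
A ranking like this can be turned into a check by filtering on the aggregated count. The following sketch only returns classes above a limit, so an empty result means that no class exceeds it. The limit of 20 methods and the alias className are hypothetical values chosen for illustration.

```cypher
// Hypothetical threshold: report classes declaring more than 20 methods.
MATCH (class:Class)-[:DECLARES]->(method:Method)
WHERE class.fqn STARTS WITH "com.hascode."
WITH class, count(method) AS methods
WHERE methods > 20
RETURN class.fqn AS className, methods
ORDER BY methods DESC
```

Because a constraint fails as soon as its query returns something, a query of this shape could serve as the body of a constraint in the AsciiDoc rules file.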

Class with deepest inheritance hierarchy

The deeper a class sits in the inheritance hierarchy, the harder it is to predict its behavior because of the inherited methods; deeper trees also involve greater design complexity (Chidamber, S. R. & Kemerer, C. F.).

Therefore the following query shows us the classes with the deepest inheritance hierarchy in a given package.

MATCH p=(class:Class)-[:EXTENDS*]->(super:Type)
WHERE class.fqn STARTS WITH "com.hascode"
RETURN class.fqn, LENGTH(p) AS Depth
ORDER BY Depth DESC
LIMIT 20

Display classes with deepest inheritance hierarchy
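
The query above returns one row per matched path, so a class with several ancestors appears once for each of them. To report only the maximum depth per class, the path lengths can be aggregated. This is a sketch under the same package assumption; the aliases className and depth are chosen for illustration.

```cypher
// One row per class: the length of its longest EXTENDS chain.
MATCH p = (class:Class)-[:EXTENDS*]->(:Type)
WHERE class.fqn STARTS WITH "com.hascode"
RETURN class.fqn AS className, max(length(p)) AS depth
ORDER BY depth DESC
LIMIT 20
```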

Amount of relations between classes

This query returns the number of relations per class in a given package.

MATCH (c:Class)-[r]-()
WHERE c.fqn =~ "com.hascode.*"
RETURN c.fqn, COUNT(r) AS relations
ORDER BY relations DESC

Show amount of relations between classes

Amount of own/foreign classes

This query fetches the number of own and foreign classes, from which their ratio can be derived.

MATCH (c:Class)
MATCH (o:Class)
WHERE c.fqn STARTS WITH "com.hascode." AND NOT o.fqn STARTS WITH "com.hascode."
RETURN COUNT(DISTINCT(c)) AS amount_own, COUNT(DISTINCT(o)) AS amount_foreign

Show amount of own vs foreign classes (efferent coupling)
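
The two MATCH clauses above combine every own class with every foreign class before counting, which can become slow on larger graphs. The same two numbers can be collected in a single pass over the Class nodes. This is a sketch, not a drop-in replacement; the intermediate alias own is chosen for illustration.

```cypher
// Classify each class once, then count both groups in one aggregation.
MATCH (c:Class)
WHERE c.fqn IS NOT NULL
WITH c.fqn STARTS WITH "com.hascode." AS own
RETURN sum(CASE WHEN own THEN 1 ELSE 0 END) AS amount_own,
       sum(CASE WHEN own THEN 0 ELSE 1 END) AS amount_foreign
```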

Find Ignored Tests

Find ignored tests either by using the special label from the JUnit plugin (assuming JUnit is used) with something like MATCH (method:Method:Ignore) RETURN method, or by using this approach:

MATCH (c:Class)-[:DEPENDS_ON*]->(a:Type)
WHERE a.fqn = "org.junit.Ignore"
RETURN c

Finding ignored tests
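
The DEPENDS_ON approach reports whole classes and, because of the variable-length pattern, also classes that merely depend on such a class. If the individual test methods are of interest, the annotation itself can be matched. The sketch below assumes the Java plugin's model, in which an annotated element points to an Annotation node via ANNOTATED_BY and that node points to its annotation type via OF_TYPE. It does not cover classes that are ignored as a whole.

```cypher
// Assumption: (element)-[:ANNOTATED_BY]->(:Annotation)-[:OF_TYPE]->(:Type)
MATCH (c:Class)-[:DECLARES]->(m:Method)-[:ANNOTATED_BY]->(:Annotation)-[:OF_TYPE]->(a:Type)
WHERE a.fqn = "org.junit.Ignore"
RETURN c.fqn AS testClass, m.name AS testMethod
ORDER BY testClass, testMethod
```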

Find Long Method Names

This query finds the methods with the longest names:

MATCH (c:Class)-[:DECLARES]->(m:Method)
WHERE c.fqn STARTS WITH "com.hascode"
AND m.name IS NOT NULL
RETURN c.fqn, m.name, SIZE(m.name) AS nameSize
ORDER BY nameSize DESC
LIMIT 20

Find methods with long names

Find Classes with many constructors

This query returns a list of the classes with the most constructors:

MATCH (class:Class)-[coRef:DECLARES]->(co:Constructor)
WHERE class.fqn STARTS WITH "com.hascode"
RETURN class.fqn, COUNT(coRef) AS constructors
ORDER BY constructors DESC
LIMIT 20

Find classes with many constructors

Find Methods with many parameters

Too many parameters for a method might be a code smell:

MATCH (class:Class)-[:DECLARES]->(m:Method)-[params:HAS]->(p:Parameter)
WHERE class.fqn STARTS WITH "com.hascode"
RETURN class.fqn, m.name, COUNT(params) AS paramCount
ORDER BY paramCount DESC
LIMIT 20

Find methods with many parameters

Find Constructors with many parameters

Same as above, but now only for constructors:

MATCH (class:Class)-[:DECLARES]->(co:Constructor)-[params:HAS]->(p:Parameter)
WHERE class.fqn STARTS WITH "com.hascode"
// group by constructor signature so parameters are counted per constructor, not per class
RETURN class.fqn, co.signature, COUNT(params) AS paramCount
ORDER BY paramCount DESC
LIMIT 20

Find constructors with many parameters

Find SPI and Service Implementations

Finds service implementations using the service-loader mechanism.

Might be useful to scan if important SPIs are implemented.

MATCH (impl:Class)<-[:CONTAINS]-(loader:ServiceLoader)-[:OF_TYPE]->(t:Type)
RETURN t.fqn AS Loaders, impl.fqn AS Service

Finding service loaders and services

Count non-abstract or non-interface types in package

This query counts non-abstract types in a package. This may be used for other queries e.g. to calculate the abstractness of a component.

MATCH (t:Type)
WHERE t.fqn STARTS WITH "com.hascode"
// a type counts as non-abstract only if it is neither abstract nor an interface
AND NOT coalesce(t.abstract, false)
AND NOT t:Interface
RETURN COUNT(t) AS nonabstract

Count non-abstract or non-interface types in a package

Count abstract and interface types in package

This query counts abstract types in a package. This may be used for other queries e.g. to calculate the abstractness of a component.

MATCH (t:Type)
WHERE t.fqn STARTS WITH "com.hascode"
AND (t.abstract = TRUE
OR t:Interface)
RETURN COUNT(t) AS abstract

Count abstract and interface types in a package

Abstractness of a Package

Combining both counts above, we may now calculate the abstractness of a package, i.e. the share of abstract types and interfaces among all of its types, using the following query:

MATCH (t:Type)
WHERE t.fqn STARTS WITH "com.hascode"
AND NOT coalesce(t.abstract, false)
AND NOT t:Interface
WITH COUNT(t) AS nonabstract
MATCH (t:Type)
WHERE t.fqn STARTS WITH "com.hascode"
AND (t.abstract = TRUE
OR t:Interface)
WITH nonabstract, COUNT(t) AS abstract
RETURN toFloat(abstract) / toFloat(abstract + nonabstract) AS abstractness

Abstractness for a package
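
Abstractness can also be computed in a single pass over the types, which avoids matching them twice. The sketch below assumes that abstractness is defined as the number of abstract types and interfaces divided by the number of all types in the package; the alias total is chosen for illustration.

```cypher
// Count all types and the abstract ones in one aggregation.
MATCH (t:Type)
WHERE t.fqn STARTS WITH "com.hascode"
WITH count(t) AS total,
     sum(CASE WHEN t.abstract = true OR t:Interface THEN 1 ELSE 0 END) AS abstract
WHERE total > 0  // avoid a division by zero for an empty package
RETURN toFloat(abstract) / total AS abstractness
```

A value close to 0 indicates a package made up mostly of concrete types, a value close to 1 a package made up mostly of abstract types and interfaces.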

More to come, if you have ideas, query improvements or new queries, please feel free to post them as a comment – I’ll be happy to include them here! :)

Tutorial Sources

Please feel free to download the tutorial sources from my Bitbucket repository, fork it there or clone it using Git:

git clone https://bitbucket.org/hascode/jqassistant-tutorial.git

Partial Alternative: ArchUnit

For validating architectural constraints there is another library that might be of interest: ArchUnit.

I have written a short how-to/tutorial about this library: “Assuring Architectural Rules with ArchUnit”.

Other Neo4j Tutorials

If you’re interested in Neo4j, please feel free to take a look at my other articles on this topic.

Appendix A: Exemplary Exploration of a Library

A possible option to explore a system is to start at a well-known class and dig through its dependencies, fields and references.

Entry Point: Class

For this example, I’m choosing the class DslProcessorImpl from a library of mine as a starting point so this is our first simple query to get an entry point:

MATCH (c:Class)
WHERE c.fqn ENDS WITH "DslProcessorImpl"
RETURN c

The result looks like this in the Neo4j browser:

Selecting a starting point: Single class

As expected, we see a single class. In addition, we can see the labels that the selected node possesses: Class, File, Java and Type.

We may now explore the relations of this class by double-clicking on the node in the graphical view – this leads to the following view:

Exploring a class's relations

We see a lot of additional nodes as well as a list of named relations like CONTAINS, DEPENDS_ON, DECLARES and many others.

With this information we may create additional queries to analyze the system.

The relation DECLARES, for example, allows us to fetch the fields that our class declares:

MATCH (c:Class)-[:DECLARES]->(f:Field)
WHERE c.fqn ENDS WITH "DslProcessorImpl"
RETURN f AS fields

Exploring a class's declared fields

Switching between the graph view and the table view is helpful depending on the output of our queries in the Neo4j browser.

Using another named relation, DEPENDS_ON, we may explore the dependencies of our class:

MATCH (c:Class)-[:DEPENDS_ON]->(dependency)
WHERE c.fqn ENDS WITH "DslProcessorImpl"
RETURN dependency
ORDER BY dependency.fqn ASC

Exploring dependencies of a given class
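
Reversing the direction of the relation answers the opposite question: which types depend on our class? This is a minimal sketch using the same DEPENDS_ON relation; the variable dependent and its alias are chosen for illustration.

```cypher
// Incoming dependencies: all types that depend on DslProcessorImpl.
MATCH (dependent:Type)-[:DEPENDS_ON]->(c:Class)
WHERE c.fqn ENDS WITH "DslProcessorImpl"
RETURN dependent.fqn AS dependent
ORDER BY dependent
```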

Or we might take a look at interfaces that our class implements:

MATCH (c:Class)-[:IMPLEMENTS*]->(super)
WHERE c.fqn ENDS WITH "DslProcessorImpl"
RETURN super

Exploring interfaces implemented

These queries are not that exciting, as every IDE is faster here, but they may be used to create basic entry points for more complex queries or to create constraints for validating our architecture.

Queries may be joined using WITH, UNION and other operators.
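
As an illustration of UNION, the following sketch lists the fields and methods of our class in one result. Both parts of a UNION must return the same column names, which is why each part uses the aliases kind and name; these aliases and the literal values "field" and "method" are chosen for illustration.

```cypher
// First part: declared fields of the class.
MATCH (c:Class)-[:DECLARES]->(f:Field)
WHERE c.fqn ENDS WITH "DslProcessorImpl"
RETURN "field" AS kind, f.name AS name
UNION
// Second part: declared methods of the same class.
MATCH (c:Class)-[:DECLARES]->(m:Method)
WHERE c.fqn ENDS WITH "DslProcessorImpl"
RETURN "method" AS kind, m.name AS name
```

UNION removes duplicate rows, so overloaded methods sharing a name appear only once; UNION ALL would keep them.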

Besides analyzing classes and their relations, we may also explore other labels and pieces of information that are written to our graph database by the different plugins.

Entry Point: Artifact

The following query creates a list of Jar-file artifacts from our sample project and produces an output in the format “groupid:artifactid:version”.

MATCH (a:Artifact)
WHERE a.type = "jar"
WITH a.group+":"+a.name+":"+a.version AS GAV
WHERE GAV IS NOT NULL
RETURN GAV
ORDER BY GAV ASC

Exploring a project's artifacts

Entry Point: Project

Another interesting entry point for an analysis might be nodes labeled as Project as fetched by the following query:

MATCH (p:Project)
RETURN p

Selecting a project as entry point

I have changed the look of the graph view by giving nodes labeled as projects a red background and increasing their size.

Appendix B: jqAssistant Reports

The following Maven goal allows us to produce some HTML output in no time including the results of our validation process (see “Checking Rules”):

mvn jqassistant:report

The result for our sample project looks like this one:

Generated jqAssistant report

Appendix C: Remote Neo4j Server with Docker

Sometimes it’s more practical to store our analysis data in a remote Neo4j server rather than in a local store.

As there is an official Neo4j Docker image on DockerHub, it’s really easy to demonstrate the analysis process using a Docker image.

We’re starting our Neo4j server with the following command:

docker run -td --rm -p 7474:7474 -p 7687:7687 neo4j:3.3.2

Once the server is running, we need to reset the credentials for the user neo4j a single time using the web admin panel at http://localhost:7474.

In addition we need to add the following configuration to our project’s pom.xml:

<plugin>
	<groupId>com.buschmais.jqassistant</groupId>
	<artifactId>jqassistant-maven-plugin</artifactId>
	<version>1.3.0</version>
	<configuration>
		<store>
			<uri>bolt://localhost:7687</uri>
			<username>neo4j</username>
			<password>neo4j</password>
		</store>
	</configuration>
</plugin>

The console output when running mvn jqassistant:scan again indicates that a remote server is used to store the results:

$ mvn jqassistant:scan
[..]
[INFO] Loaded jQAssistant plugins [CDI, Common, Common Test, Core Analysis, EJB3, GraphML, JAX-RS, JPA 2, JSON, JUnit, Java, Java 8, Java EE 6, Maven 3, OSGi, RDBMS, TestNG, Tycho, XML, YAML].
[INFO] Connecting to store at 'bolt://localhost:7687' (username=neo4j)
[INFO] Resetting store.
[INFO] Reset finished (removed 0 nodes, 0 relations).
[INFO] Entering /data/project/jqassistant-tutorial/target/classes
[INFO] Leaving /data/project/jqassistant-tutorial/target/classes (6 entries, 63 ms)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 5.703 s
[INFO] Finished at: 2018-01-28T17:20:28+01:00
[INFO] Final Memory: 23M/398M
[INFO] ------------------------------------------------------------------------
[INFO] Closing store in directory 'bolt://localhost:7687'.

Article Updates

  • 2018-02-19: Metrics for package abstractness added, comments activated.
  • 2018-01-28: Examples for remote Neo4j server and Docker added.
  • 2018-01-02: Queries for long method names, classes with many constructors, methods/constructors with many parameters, and service-loaders and services added
