Software Architecture Exploration and Validation with jqAssistant, Neo4j and Cypher
December 31st, 2017 by Micha Kops
I have written about other tools for analyzing and validating software systems before, but today I would like to introduce a tool named jqAssistant that supports software architects, developers and analysts in a variety of tasks such as analyzing given structures, validating architectural or quality constraints and generating reports.
jqAssistant analyzes given projects or artifacts and stores the gathered information, enriched by a variety of existing plugins, in a Neo4j graph database.
This graph database may now be used to enforce architectural constraints or specific code metrics, to generate reports or to analyze a system with a nice browser interface.
In this tutorial I’m going to show how to integrate jqAssistant in an existing project using Maven as build-tool, how to explore an existing system step-by-step and finally how to enforce specific metrics by writing them down as a kind of living documentation in an AsciiDoc document.
I have also started to collect basic queries in Cypher (the query language used by Neo4j), some for analyzing a system and others for gathering basic metrics.
Contents
- Integration with Maven
- Basic Operations
- Cypher Queries for Structure Analysis
- Cypher Queries for Quality Metrics
- Class with highest number of methods within a specific package
- Class with deepest inheritance hierarchy
- Amount of relations between classes
- Amount of own/foreign classes
- Find Ignored Tests
- Find Long Method Names
- Find Classes with many constructors
- Find Methods with many parameters
- Find Constructors with many parameters
- Find SPI and Service Implementations
- Count non-abstract or non-interface types in package
- Count abstract and interface types in package
- Abstractness of a Package
- Tutorial Sources
- Resources
- Partial Alternative: ArchUnit
- Other Neo4j Tutorials
- Appendix A: Exemplary Exploration of a Library
- Appendix B: jqAssistant Reports
- Appendix C: Remote Neo4j Server with Docker
- Article Updates
Integration with Maven
Using Maven we just need to add one plugin to our project’s pom.xml:
<plugin>
    <groupId>com.buschmais.jqassistant</groupId>
    <artifactId>jqassistant-maven-plugin</artifactId>
    <executions>
        <execution>
            <goals>
                <goal>scan</goal>
                <goal>analyze</goal>
            </goals>
            <configuration>
                <warnOnSeverity>MINOR</warnOnSeverity>
                <failOnSeverity>MAJOR</failOnSeverity>
            </configuration>
        </execution>
    </executions>
</plugin>
It is also possible to use Gradle or run an executable Jar file instead but I will be using Maven for the following examples.
Basic Operations
With the Maven plugin we’re able to run the following operations for our project..
Scanning Structures
The scan goal scans our project for different structures (classes, relations, dependencies, artifacts etc.). The result is enriched by different plugins that add further information to the structure database, e.g. by adding labels.
Common plugins are:
- CDI
- Common / Common Test
- Core Analysis
- Java EE 6 and EJB3
- GraphML
- JAX-RS (REST)
- JPA 2 (ORM) and RDBMS
- JSON, XML and YAML
- JUnit and TestNG
- Java and Java 8
- Maven 3
- OSGi and Tycho
To scan our project we simply need to run the following command (increase memory for bigger projects!):
mvn jqassistant:scan
This creates complete databases and indexes in the directory target/jqassistant/store.
The available memory may be increased with the following environment variables:
export JQASSISTANT_OPTS=-Xmx2048M
export MAVEN_OPTS=-Xmx2048M
Checking Rules
We may enforce architectural constraints by defining rules with Cypher queries embedded in XML or AsciiDoc.
As jqAssistant is capable of reading rules from AsciiDoc, we may use it as a tool for living documentation – there is a nice example available for this, the “jqAssistant Spring Pet Clinic Demo Output“.
We’re now writing a sample constraint that forbids classes in the package com.hascode.control to access classes in the package com.hascode.boundary (assuming some layer contract).
In the Neo4j browser, the query for our sample project looks like this one:
We’re writing this constraint in AsciiDoc in a file named myrules.adoc in a directory named jqassistant in our project directory.
What we’re doing here:
- We create a new group named default.
- This group includes our constraint myrules:LayerAccessConstraint and assigns it the severity blocker.
- We create a constraint named myrules:LayerAccessConstraint, also with the severity blocker.
- This constraint is defined by the Cypher query that follows it: if the query returns any rows, the constraint is violated.
[[default]]
.A collection of architectural constraints.
[role=group,severity=blocker,includesConstraints="myrules:LayerAccessConstraint"]
== Architectural Constraints

[[myrules:LayerAccessConstraint]]
.Following the layer contract, the control layer may not access the boundary layer.
[source,cypher,role=constraint,severity=blocker]
----
MATCH (control:Class)-[r*]->(boundary:Class)
WHERE control.fqn STARTS WITH "com.hascode.control" AND boundary.fqn STARTS WITH "com.hascode.boundary"
RETURN boundary
----
The advantage of using AsciiDoc is that we’re creating something readable – this is what my AsciiDoc editor in IntelliJ looks like:
A detailed explanation of the rules syntax can be found in the jqAssistant manual here.
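The constraint query above follows relationships of any type and any length ([r*]), so the same target class can be reported once per path. A narrower variant might look only at direct type dependencies. The following is a sketch, assuming that the DEPENDS_ON relationship created by the Java plugin is what we want to treat as "access"; the result column names are chosen for illustration only.

```cypher
// Sketch: report each offending pair of types exactly once.
// Assumption: a direct DEPENDS_ON edge is a sufficient definition of "access".
MATCH (control:Type)-[:DEPENDS_ON]->(boundary:Type)
WHERE control.fqn STARTS WITH "com.hascode.control."
  AND boundary.fqn STARTS WITH "com.hascode.boundary."
RETURN DISTINCT control.fqn AS controlType, boundary.fqn AS boundaryType
```

Returning both ends of the dependency makes the violation report easier to act on than the target class alone.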
This is the directory structure of our sample project where the constraint above is violated because ControlClass from the control layer accesses BoundaryClass from the boundary layer.
.
├── jqassistant
│   └── myrules.adoc
├── pom.xml
└── src
    ├── main
    │   ├── java
    │   │   └── com
    │   │       └── hascode
    │   │           ├── boundary
    │   │           │   └── BoundaryClass.java
    │   │           └── control
    │   │               └── ControlClass.java
    │   └── resources
    └── test
        └── java
The following goal triggers the validation of these rules
$ mvn jqassistant:scan jqassistant:analyze
[..]
[INFO] Executing analysis for 'jqassistant-tutorial'.
[INFO] Reading rules from directory /data/project/jqassistant-tutorial/jqassistant
[INFO] Executing group 'default'
[INFO] Validating constraint 'myrules:LayerAccessConstraint' with severity: 'BLOCKER'.
[ERROR] --[ Constraint Violation ]-----------------------------------------
[ERROR] Constraint: myrules:LayerAccessConstraint
[ERROR] Severity: BLOCKER
[ERROR] Following the layer contract, the control layer may not access the boundary layer.
[ERROR] boundary=com.hascode.boundary.BoundaryClass
[ERROR] boundary=com.hascode.boundary.BoundaryClass
[ERROR] boundary=com.hascode.boundary.BoundaryClass
[ERROR] -------------------------------------------------------------------
[ERROR]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 8.808 s
[INFO] Finished at: 2017-12-29T16:22:30+01:00
[INFO] Final Memory: 168M/1121M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal com.buschmais.jqassistant:jqassistant-maven-plugin:1.3.0:analyze (default-cli) on project jqassistant-tutorial: Violations detected: 0 concepts, 1 constraints -> [Help 1]
As expected, our build fails and reports an error because our constraint was violated.
The following Bitbucket Pipeline Build demonstrates this result, too.
Showing existing and effective Rules
The following command lists all known rules in the current environment:
$ mvn jqassistant:available-rules
[..]
[INFO] Available rules for 'jqassistant-tutorial'.
[INFO] Reading rules from directory /data/project/jqassistant-tutorial/jqassistant
[INFO] Groups [4]
[INFO] "ejb3:EJB"
[INFO] "junit4:Default"
[INFO] "myrules:ArchitectureConstraintGroup"
[INFO] "testng:TestNG"
[INFO] Constraints [12]
[INFO] "cdi:BeansMustNotUseFieldInjection" - CDI beans shall not use field injection (constructor and setter injections are fine.).
[INFO] "cdi:BeansMustUseConstructorInjection" - All CDI beans must use constructor injection.
[INFO] "dependency:ArtifactCycles" - There must be no cyclic artifact dependencies.
[INFO] "dependency:PackageCycles" - There must be no cyclic package dependencies.
[INFO] "jpa2:ValidationModeMustBeExplicitlySpecified" - The validation mode of all persistence units must be explicitly specified and either set to CALLBACK or NONE.
[INFO] "junit4:AssertionMustProvideMessage" - All assertions must provide a message.
[INFO] "junit4:IgnoreWithoutMessage" - All @Ignore annotations must provide a message.
[INFO] "junit4:TestMethodWithoutAssertion" - All test methods must perform assertions (within a call hierarchy of max. 3 steps).
[INFO] "maven3:HierarchicalParentModuleRelation" - If a parent Maven project declares a module then the parent project must also be declared as the parent of the module (i.e. to keep the project hierarchy consistent).
[INFO] "myrules:LayerAccessConstraint" - Following the layer constract, the control layer may not access the boundary layer.
[INFO] "osgi-bundle:InternalTypeMustNotBePublic" - Internal types must not be public if no depending types exist in other packages of the bundle.
[..]
To see which rules are applied to the current project, we may use the following command:
mvn jqassistant:effective-rules
This command should list our custom groups and constraints, too!
Running Neo4j Server
To start an embedded instance of Neo4j we simply need to run the following command:
mvn jqassistant:server
Afterwards we may access and explore our graph database by pointing our browser to this location: http://localhost:7474/
I have added an example for exploring existing structures with the Neo4j browser at the end of this tutorial, please feel free to skip to this section.
Cypher Queries for Structure Analysis
The following queries are helpful for exploring our architecture and code structures. Examples for exploring a concrete system step by step can be found at the end of this tutorial in Appendix A.
Show existing labels
The following query shows existing labels. When exploring our project sources, nodes labeled with Class or Type are often a good entry point for analyzing relations or aggregating code metrics.
MATCH (n) WITH DISTINCT labels(n) AS labels UNWIND labels AS label RETURN DISTINCT label ORDER BY label
Existing Labels of a sample project:
- Annotation
- Array
- Artifact
- Attribute
- Class
- Configuration
- Constructor
- Container
- Contributor
- Developer
- Directory
- Document
- Element
- Enum
- ExecutionGoal
- Field
- File
- Interface
- Java
- Json
- Key
- License
- Manifest
- ManifestEntry
- ManifestSection
- Maven
- Member
- Method
- Namespace
- Object
- Organization
- Package
- Parameter
- Participant
- Plugin
- PluginExecution
- Pom
- Primitive
- Profile
- Project
- Properties
- Property
- Repository
- Role
- Scalar
- ServiceLoader
- Text
- Type
- Value
- Xml
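To get a feeling for which of these labels dominate the graph, the label query can be extended with a count per label. This is a small sketch using only standard Cypher; the column names label and nodes are chosen for illustration.

```cypher
// Count how many nodes carry each label; a node with several labels
// (e.g. Class, File, Java, Type) is counted once per label.
MATCH (n)
UNWIND labels(n) AS label
RETURN label, count(*) AS nodes
ORDER BY nodes DESC
```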
Display Relation Types for Classes
This query allows us to explore existing relation types for labeled nodes .. in this example we’re fetching all relation types for nodes labeled as Class.
MATCH (c:Class)-[r]->() RETURN DISTINCT TYPE(r)
Existing relations for classes in a sample project:
- EXTENDS
- ANNOTATED_BY
- DECLARES
- DEPENDS_ON
- IMPLEMENTS
- IS
Unique combination of relationship and labels
Showing a combination of labels and relationship types.
MATCH (n)-[r]-() RETURN DISTINCT LABELS(n), TYPE(r)
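When the list of combinations becomes long, it may help to restrict the start node to one label and to add the labels of the target node plus a count. The following sketch does this for nodes labeled Class; the column names are assumptions made for readability.

```cypher
// For every outgoing relationship type of a class, show what kind of
// node it points to and how often this combination occurs.
MATCH (c:Class)-[r]->(target)
RETURN type(r) AS relation, labels(target) AS targetLabels, count(*) AS occurrences
ORDER BY occurrences DESC
```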
Display Indexes and Constraints
The following Neo4j browser command shows existing indexes and constraints. This is mostly relevant when we need to rewrite a query for performance.
:schema
Indexes in a sample project:
Indexes
  ON :Artifact(fqn) ONLINE
  ON :Class(fqn) ONLINE
  ON :ColumnType(databaseType) ONLINE
  ON :Concept(id) ONLINE
  ON :File(fileName) ONLINE
  ON :Package(fqn) ONLINE
  ON :Pom(fqn) ONLINE
  ON :Project(fqn) ONLINE
  ON :Repository(url) ONLINE
  ON :Type(fqn) ONLINE
  ON :Value(value) ONLINE
No constraints
Display Properties for Labeled Nodes
This query explores properties for nodes of a given label – in this example for nodes labeled with Type.
MATCH (t:Type) WITH DISTINCT keys(t) AS propertyKeys UNWIND propertyKeys AS key RETURN DISTINCT key ORDER BY key ASC
Cypher Queries for Quality Metrics
Having explored our code structures and components we may now analyze weak points, code smells and possible hot-spots for refactoring.
There are plenty of possible metrics. Some well-known basic ones are listed below (an explanation of these metrics can be found here).
- Abstractness
- Afferent Coupling
- Comment Lines of Code
- Cyclic Dependencies
- Cyclomatic Complexity
- Density of Comments
- Depth in Tree
- Direct Cyclic Dependencies
- Distance from the Main Sequence
- Efferent Coupling
- Encapsulation Principle
- Executable Statements
- Instability
- Limited Size Principle
- Modularization Quality
- Non-Comment Lines of Code
- Number of Abstract Types
- Number of Children in Tree
- Number of Concrete Types
- Number of Exported Types
- Number of Parameters
- Number of Types
- Response for Class
- Total Lines of Code
- Weighted Methods per Class
- Number Of Fields
- Number Of Attributes
The following single queries may be combined to aggregate new code or quality metrics.
Class with highest number of methods within a specific package
This query allows us to explore a given package and fetch classes with a high number of methods (possible code smell).
MATCH (class:Class)-[:DECLARES]->(method:Method) WHERE class.fqn =~ "com.hascode.*" RETURN class.fqn, COUNT(method) AS Methods ORDER BY Methods DESC
Class with deepest inheritance hierarchy
The deeper a class’s inheritance level, the harder it is to predict its behavior because of the inherited methods; deeper trees also involve greater design complexity (Chidamber, S. R. & Kemerer, C. F., 1994).
Therefore the following query shows us the classes with the deepest inheritance hierarchy in a given package.
MATCH p=(class:Class)-[:EXTENDS*]->(super:Type) WHERE class.fqn STARTS WITH "com.hascode" RETURN class.fqn, LENGTH(p) AS Depth ORDER BY Depth DESC LIMIT 20
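Because a variable-length pattern matches every prefix of the inheritance chain, a class with three ancestors appears three times in the result (with depths 1, 2 and 3). A possible variant keeps only the longest path per class by aggregating with max(). This is a sketch under the same assumptions as the query above (package prefix com.hascode, EXTENDS relationships between types).

```cypher
// One row per class: the length of its longest EXTENDS chain.
MATCH p = (class:Class)-[:EXTENDS*]->(super:Type)
WHERE class.fqn STARTS WITH "com.hascode"
RETURN class.fqn AS class, max(length(p)) AS depth
ORDER BY depth DESC
LIMIT 20
```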
Amount of relations between classes
This query returns the number of relations per class in a given package, counting both incoming and outgoing relations.
MATCH (c:Class)-[r]-() WHERE c.fqn =~ "com.hascode.*" RETURN c.fqn, COUNT(r) AS relations ORDER BY relations DESC
Amount of own/foreign classes
This query counts own and foreign classes; the ratio between them can be derived from the two numbers.
MATCH (c:Class) MATCH (o:Class) WHERE c.fqn STARTS WITH "com.hascode." AND NOT o.fqn STARTS WITH "com.hascode." RETURN COUNT(DISTINCT(c)) AS amount_own, COUNT(DISTINCT(o)) AS amount_foreign
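The two independent MATCH clauses above build a cross product of own and foreign classes before the DISTINCT counts collapse it again, which can become slow on larger graphs. A single-pass alternative might look like the following sketch; it assumes the same package prefix and skips classes without an fqn property.

```cypher
// Visit every class once and sort it into "own" or "foreign".
MATCH (c:Class)
WHERE c.fqn IS NOT NULL
WITH c.fqn STARTS WITH "com.hascode." AS own
RETURN sum(CASE WHEN own THEN 1 ELSE 0 END) AS amount_own,
       sum(CASE WHEN own THEN 0 ELSE 1 END) AS amount_foreign
```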
Find Ignored Tests
Find ignored tests either by using the special label from the JUnit plugin (assuming JUnit is used) with something like MATCH (method:Method:Ignore) RETURN method, or with this approach:
MATCH (c:Class)-[:DEPENDS_ON*]->(a:Type) WHERE a.fqn = "org.junit.Ignore" RETURN c
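The dependency-based query returns whole classes, including classes that only depend on an ignored test indirectly. To pinpoint the ignored methods themselves, the annotation can be followed directly. The sketch below assumes that the Java plugin models annotations as Annotation nodes that are attached via ANNOTATED_BY and point to their type via OF_TYPE; please check this against the labels and relations in your own store.

```cypher
// List the methods that carry JUnit 4's @Ignore annotation.
// Assumption: (method)-[:ANNOTATED_BY]->(:Annotation)-[:OF_TYPE]->(:Type)
MATCH (c:Class)-[:DECLARES]->(m:Method)-[:ANNOTATED_BY]->(a:Annotation)-[:OF_TYPE]->(t:Type)
WHERE t.fqn = "org.junit.Ignore"
RETURN c.fqn AS testClass, m.name AS ignoredMethod
ORDER BY testClass, ignoredMethod
```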
Find Long Method Names
Finds the methods with the longest names
MATCH (c:Class)-[:DECLARES]->(m:Method) WHERE c.fqn STARTS WITH "com.hascode" AND m.name IS NOT NULL RETURN c.fqn, m.name, SIZE(m.name) AS nameSize ORDER BY nameSize DESC LIMIT 20
Find Classes with many constructors
Returns a list of the classes with many constructors..
MATCH (class:Class)-[coRef:DECLARES]->(co:Constructor) WHERE class.fqn STARTS WITH "com.hascode" RETURN class.fqn, COUNT(coRef) AS constructors ORDER BY constructors DESC LIMIT 20
Find Methods with many parameters
Too many parameters for a method might be a code smell..
MATCH (class:Class)-[:DECLARES]->(m:Method)-[params:HAS]->(p:Parameter) WHERE class.fqn STARTS WITH "com.hascode" RETURN class.fqn, m.name, COUNT(params) AS paramCount ORDER BY paramCount DESC LIMIT 20
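If such a query is to be used as a constraint, it has to return rows only for violations. One way to do this is to filter on the aggregated count with a second WHERE after WITH. The limit of 5 parameters in this sketch is an arbitrary, hypothetical threshold.

```cypher
// Return only methods that exceed a (hypothetical) limit of 5 parameters.
MATCH (class:Class)-[:DECLARES]->(m:Method)-[:HAS]->(p:Parameter)
WHERE class.fqn STARTS WITH "com.hascode"
WITH class, m, count(p) AS paramCount
WHERE paramCount > 5
RETURN class.fqn AS class, m.name AS method, paramCount
ORDER BY paramCount DESC
```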
Find Constructors with many parameters
Same as above but now only for constructors..
MATCH (class:Class)-[:DECLARES]->(co:Constructor)-[params:HAS]->(p:Parameter) WHERE class.fqn STARTS WITH "com.hascode" RETURN class.fqn, COUNT(params) AS paramCount ORDER BY paramCount DESC LIMIT 20
Find SPI and Service Implementations
Finds service implementations using the service-loader mechanism.
Might be useful to scan if important SPIs are implemented.
MATCH (impl:Class)<-[REF:CONTAINS]-(loader:ServiceLoader)-[:OF_TYPE]->(t:Type) RETURN t.fqn AS Loaders, impl.fqn AS Service
Count non-abstract or non-interface types in package
This query counts the types in a package that are neither abstract nor interfaces. This may be used for other queries e.g. to calculate the abstractness of a component.
MATCH (t:Type) WHERE t.fqn STARTS WITH "com.hascode" AND coalesce(t.abstract, false) = false AND NOT t:Interface RETURN COUNT(t) AS nonabstract
Count abstract and interface types in package
This query counts abstract types (abstract classes and interfaces) in a package. This may be used for other queries e.g. to calculate the abstractness of a component.
MATCH (t:Type) WHERE t.fqn STARTS WITH "com.hascode" AND (t.abstract = TRUE OR t:Interface) RETURN COUNT(t) AS abstract
Abstractness of a Package
Combining both counts above, we may now calculate the abstractness of a package, i.e. the ratio of abstract types (abstract classes and interfaces) to all types in the package, using the following query:
MATCH (t:Type) WHERE t.fqn STARTS WITH "com.hascode" AND coalesce(t.abstract, false) = false AND NOT t:Interface WITH COUNT(t) AS nonabstract MATCH (t:Type) WHERE t.fqn STARTS WITH "com.hascode" AND (t.abstract = TRUE OR t:Interface) WITH nonabstract, COUNT(t) AS abstract WHERE abstract + nonabstract > 0 RETURN toFloat(abstract) / (abstract + nonabstract) AS abstractness
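Abstractness can also be computed in a single pass over the types of a package. The following sketch uses the common definition of abstractness as the number of abstract types divided by the number of all types, and it guards against an empty package; the column names are chosen for illustration.

```cypher
// Abstractness = abstract types / all types, in the range 0..1.
MATCH (t:Type)
WHERE t.fqn STARTS WITH "com.hascode"
WITH count(t) AS total,
     sum(CASE WHEN t.abstract = true OR t:Interface THEN 1 ELSE 0 END) AS abstractTypes
WHERE total > 0
RETURN abstractTypes, total, toFloat(abstractTypes) / total AS abstractness
```

A value near 0 indicates a package made up almost entirely of concrete types, a value near 1 a package of mostly interfaces and abstract classes.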
More to come, if you have ideas, query improvements or new queries, please feel free to post them as a comment – I’ll be happy to include them here! :)
Tutorial Sources
Please feel free to download the tutorial sources from my Bitbucket repository, fork it there or clone it using Git:
git clone https://bitbucket.org/hascode/jqassistant-tutorial.git
Resources
- jqAssistant Website
- Neo4j Website
- Neo4j: Introduction to the Cypher Query Language
- jqAssistant: Getting Started
- R.E.M Web Development: Some basic and useful Cypher queries for Neo4j (some outdated)
- jqAssistant Demo: Spring Pet Clinic
- Chidamber, S. R. & Kemerer, C. F. (1994). A Metrics Suite for Object Oriented Design (IEEE Transactions on Software Engineering, Vol. 20, No. 6). Retrieved May 14, 2011, from the University of Pittsburgh
- jqAssistant Manual: Rules Syntax
- Dr Andy Brooks: Metrics Overview
- Neo4j Blog: Graph Databases and Software Metrics Analysis
- AsciiDoctor: What is AsciiDoc
- Markus Harrer: My experiences with jqAssistant so far
- jqAssistant Blog: Yes we scan – Exploring libraries
- Cypher Refcard
- Official Neo4j Image on DockerHub
Partial Alternative: ArchUnit
For validating architectural constraints there is another library that might be of interest: ArchUnit.
I have written a short how-to/tutorial about this library here: “Assuring Architectural Rules with ArchUnit“.
Other Neo4j Tutorials
If you’re interested in Neo4j, please feel free to take a look at these articles of mine:
- Object Graph Mapping by Example with Neo4j OGM and Java
- A short Overview of Neo4j Indexing Strategies
- Neo4j Graph Database Tutorial: How to build a Route Planner and other Examples
Appendix A: Exemplary Exploration of a Library
A possible option to explore a system is to start at a well-known class and dig through its dependencies, fields and references.
Entry Point: Class
For this example, I’m choosing the class DslProcessorImpl from a library of mine as a starting point so this is our first simple query to get an entry point:
MATCH (c:Class) WHERE c.fqn ENDS WITH "DslProcessorImpl" RETURN c
This result looks like this in the Neo4j browser:
As expected we see a single class, and in addition to that we can see the labels that the selected node possesses: Class, File, Java and Type.
We may now explore the relations of this class by double-clicking on the node in the graphical view – this leads to the following view:
We see a lot of additional nodes as well as a list of named relations like CONTAINS, DEPENDS_ON, DECLARES and many others.
With this information we may create additional queries to analyze the system.
The relation DECLARES, for example, allows us to fetch the fields that our class declares:
MATCH (c:Class)-[:DECLARES]->(f:Field) WHERE c.fqn ENDS WITH "DslProcessorImpl" RETURN f AS fields
Switching between the graph view and the table view is helpful depending on the output of our queries in the Neo4j browser.
Using another named relation, DEPENDS_ON, we may explore dependencies of our class:
MATCH (c:Class)-[:DEPENDS_ON]->(dependency) WHERE c.fqn ENDS WITH "DslProcessorImpl" RETURN dependency ORDER BY dependency.fqn ASC
Or we might take a look at interfaces that our class implements:
MATCH (c:Class)-[:IMPLEMENTS*]->(super) WHERE c.fqn ENDS WITH "DslProcessorImpl" RETURN super
These queries are not that exciting, as every IDE is faster here, but they may be used as basic entry points for more complex queries or to create constraints for validating our architectures.
Queries may be joined using WITH, UNION and other operators.
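As an illustration of joining queries, the dependency and interface queries for our entry-point class can be combined with UNION. Both parts must return the same column names; the names relation and target in this sketch are chosen for illustration.

```cypher
// Direct dependencies and directly implemented interfaces of one class
// in a single result table.
MATCH (c:Class)-[:DEPENDS_ON]->(t:Type)
WHERE c.fqn ENDS WITH "DslProcessorImpl"
RETURN "DEPENDS_ON" AS relation, t.fqn AS target
UNION
MATCH (c:Class)-[:IMPLEMENTS]->(t:Type)
WHERE c.fqn ENDS WITH "DslProcessorImpl"
RETURN "IMPLEMENTS" AS relation, t.fqn AS target
```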
Besides analyzing classes and their relations, we may also explore other labels and pieces of information that are written to our graph database by the different plugins.
Entry Point: Artifact
The following query creates a list of Jar-file artifacts from our sample project and produces an output in the format “groupid:artifactid:version“.
MATCH (a:Artifact) WHERE a.type = "jar" WITH a.group + ":" + a.name + ":" + a.version AS GAV WHERE GAV IS NOT NULL RETURN GAV ORDER BY GAV ASC
Entry Point: Project
Another interesting entry point for an analysis might be nodes labeled as Project as fetched by the following query:
MATCH (p:Project) RETURN p
I have changed the look of the graph view by giving nodes labeled as projects a red background and increasing their size.
Appendix B: jqAssistant Reports
The following Maven goal allows us to produce some HTML output in no time including the results of our validation process (see “Checking Rules”):
mvn jqassistant:report
The result for our sample project looks like this one:
Appendix C: Remote Neo4j Server with Docker
Sometimes it’s more practical to store our analysis data in a remote Neo4j server rather than in a local store.
As there is an official Neo4j Docker image on DockerHub, it’s really easy to demonstrate the analysis process using a Docker image.
We’re starting our Neo4j server with the following command:
docker run -td --rm -p 7474:7474 -p 7687:7687 neo4j:3.3.2
Afterwards when the server is running, we need to reset the credentials for the user neo4j once using the web admin panel at http://localhost:7474.
In addition we need to add the following configuration to our project’s pom.xml:
<plugin>
    <groupId>com.buschmais.jqassistant</groupId>
    <artifactId>jqassistant-maven-plugin</artifactId>
    <version>1.3.0</version>
    <configuration>
        <store>
            <uri>bolt://localhost:7687</uri>
            <username>neo4j</username>
            <password>neo4j</password>
        </store>
    </configuration>
</plugin>
The console output when running mvn jqassistant:scan again indicates that a remote server is used to store the results:
$ mvn jqassistant:scan
[..]
[INFO] Loaded jQAssistant plugins [CDI, Common, Common Test, Core Analysis, EJB3, GraphML, JAX-RS, JPA 2, JSON, JUnit, Java, Java 8, Java EE 6, Maven 3, OSGi, RDBMS, TestNG, Tycho, XML, YAML].
[INFO] Connecting to store at 'bolt://localhost:7687' (username=neo4j)
[INFO] Resetting store.
[INFO] Reset finished (removed 0 nodes, 0 relations).
[INFO] Entering /data/project/jqassistant-tutorial/target/classes
[INFO] Leaving /data/project/jqassistant-tutorial/target/classes (6 entries, 63 ms)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 5.703 s
[INFO] Finished at: 2018-01-28T17:20:28+01:00
[INFO] Final Memory: 23M/398M
[INFO] ------------------------------------------------------------------------
[INFO] Closing store in directory 'bolt://localhost:7687'.
Article Updates
- 2018-02-19: Metrics for package abstractness added, comments activated.
- 2018-01-28: Examples for remote Neo4j server and Docker added.
- 2018-01-02: Queries for long method names, classes with many constructors, methods/constructors with many parameters, and service-loaders and services added
Tags: abstractness, adoc, analysis, architecture, artifact, asciidoc, asciidoctor, constraints, coupling, cypher, graph, jpa, jqassistant, junit, metrics, neo4j, nosql, quality, query, rules, serviceloader, validation, xml