Aggregating Data with Neo4J in Java

I ran into an interesting challenge while writing a MATCH query and how the Neo4j Bolt driver wraps up the response. Consider the following:

CREATE (g:Game { id: '1234', name: 'TI4', description: 'TI4', yearPublished: 2017, minPlayers: 3, maxPlayers: 6, gameType: 'BASE' })
CREATE (e:Game { id: '5678', name: 'Prophecy of Kings', description: 'PoK', yearPublished: 2020, minPlayers: 3, maxPlayers: 8, gameType: 'EXPANSION'})
CREATE (e2:Game { id: '9101', name: 'Expansion 2', description: 'EX2', yearPublished: 2020, minPlayers: 3, maxPlayers: 8, gameType: 'EXPANSION'})
CREATE (g)-[:HAS]->(e)
CREATE (g)-[:HAS]->(e2)

CREATE (m1:Mechanic {id: '1234', name: 'Action Drafting'})
CREATE (m2:Mechanic {id: '5678', name: 'Area Majority / Influence'})
CREATE (g)-[:HAS]->(m1)
CREATE (g)-[:HAS]->(m2)

CREATE (cat1:Category {id: '1234', name: 'Civilization'})
CREATE (cat2:Category {id: '5678', name: 'Economic'})
CREATE (g)-[:HAS]->(cat1)
CREATE (g)-[:HAS]->(cat2)

What I wanted to do was get the TI4 game, it’s expansions, categories, and mechanics. My first attempt looked liked this:

MATCH (g:Game {name:$name})-[:HAS]->(e:Game)
	,(g)-[:HAS]->(c:Category)
	,(g)-[:HAS]-(m:Mechanic)
RETURN 
         g
       ,COLLECT(e) AS expansions
       ,COLLECT(c) AS categories
       ,COLLECT(m) AS mechanics

In doing so, the results ended up aggregating in multiples of two. So instead of 2 expansions, categories, and mechanics, I ended up with 2x4 of each of those, repeated. So 8 exapnsions (2 repeated 4 times), etc. To this day, I can’t explain why it works that way, but the solution was simple. DISTINT. Now, I tried multiple ways to apply DISTINCT and after some help from Stackoverflow, found this as the solution:

MATCH (g:Game {name:$name})-[:HAS]->(e:Game)
	,(g)-[:HAS]->(c:Category)
	,(g)-[:HAS]-(m:Mechanic)
RETURN 
         g
       ,COLLECT(DISTINCT e) AS expansions
       ,COLLECT(DISTINCT c) AS categories
       ,COLLECT(DISTINCT m) AS mechanics

Let’s look at how we can take those results and wrap it up nicely in Java. First we simply need to execute the query and get our Result back:

try (var session = driver.session()) {
	return session.readTransaction(tx -> {
	    final var result = tx.run(query, parameters("name", name));
	    return extractGame(result);
    });
}

I created a method to extract the game so that I can reuse it for other similar queries.

private Optional<Game> extractGame(Result result) {
    final var maybeGame = result.stream().findFirst();
	if (maybeGame.isPresent()) {
		final var gameRecord = maybeGame.get();
		final var gameMap = gameRecord.get("g").asMap();
		final var expansionMaps = gameRecord.get("expansions").asList(MapAccessor::asMap);
		final var categoryMaps = gameRecord.get("categories").asList(MapAccessor::asMap);
		final var mechanicMaps = gameRecord.get("mechanics").asList(MapAccessor::asMap);
		return Optional.of(Game.convert(gameMap, expansionMaps, categoryMaps, mechanicMaps));
	}
	return Optional.empty();
}

Since I know I can only have one game with a given name, I just find the first one and check that I did indeed get it. Then, I use the Record to get the node, but as a map. I do the same thing with all the associated data but as a list of maps, since they are collections. I pass all these maps to a converter method in my Game record, which calls converter methods in the associated record classes.

public static Game convert(Map<String, Object> input, List<Map<String, Object>> expansions, List<Map<String, Object>> categories, List<Map<String, Object>> mechanics) {
	return new Game( //
		(String) input.get("id"), //
		(String) input.get("bggId"), //
		(String) input.get("name"), //
		(String) input.get("description"), //
		Math.toIntExact(Long.parseLong(String.valueOf(input.get("yearPublished")))), //
		Math.toIntExact(Long.parseLong(String.valueOf(input.get("minPlayers")))), //
		Math.toIntExact(Long.parseLong(String.valueOf(input.get("maxPlayers")))), //
		GameType.valueOf((String) input.get("gameType")), //
		(String) input.get("image"),//
		convertMany(expansions), //
		Mechanic.convertMany(mechanics), //
		Category.convertMany(categories) //
	);
}

And yes, that Math.toIntExact is bonkers. My plan is to move that conversion using Neo4j functions into the query. “But Gregg, “, you begin. “Why don’t you just use something like neo4j-ogm or spring-data-neo4j?”. Well, I might at some point. But I have learned over the years that I’m better with those libraries when I understand how they work rather than simply relying on the fact that they work.

comments powered by Disqus