RR Social Networks

Each social network was computed by counting the number of times each pair of characters appeared within 30 tokens (words) of each other in each book. The node size for each character scales logarithmically with the number of mentions of the character, and the edge width for each pair scales the same way with the pair's co-occurrence count. The colors of the nodes are based on the character's "Color" in the books. The networks were created and visualized using networkx and pyvis, and every algorithm used (betweenness, eigenvector, degree, and closeness centrality, plus Louvain community partitioning) comes from networkx.
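The counting step described above can be sketched roughly as follows. This is a minimal illustration, not the code from all_networks.ipynb: the function names `count_cooccurrences` and `log_scale`, the whitespace tokenization, and the exact reading of "within 30 tokens" (index distance of at most 30) are all assumptions.

```python
import math
from collections import Counter


def count_cooccurrences(tokens, characters, window=30):
    """Count mentions per character and co-occurrences per character pair.

    Assumption: two mentions co-occur when their token indices differ by at
    most `window`. The original notebook may define the window differently.
    """
    mentions = Counter()
    pair_counts = Counter()
    # Keep only the positions where a known character name appears.
    hits = [(i, tok) for i, tok in enumerate(tokens) if tok in characters]
    for idx, (i, first) in enumerate(hits):
        mentions[first] += 1
        for j, second in hits[idx + 1:]:
            if j - i > window:
                break  # every later hit is even farther away
            if second != first:
                # Sort the pair so (a, b) and (b, a) share one counter.
                pair_counts[tuple(sorted((first, second)))] += 1
    return mentions, pair_counts


def log_scale(count, base_size=1.0):
    """Map a raw count (>= 1) to a size that grows logarithmically."""
    return base_size * (1.0 + math.log(count))


tokens = "darrow-red met sevro-gold and virginia-gold".split()
characters = {"darrow-red", "sevro-gold", "virginia-gold"}
mentions, pair_counts = count_cooccurrences(tokens, characters, window=30)
```

The two counters can then be loaded into a networkx graph, for example with `Graph.add_node(name, size=log_scale(n))` and `Graph.add_edge(a, b, weight=n)`.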

I highly recommend reading each book before viewing its social network or reading my general description below, as they do contain spoilers.

If you are interested, the source code for this project is available on my GitHub. I used all_networks.ipynb to generate the social networks. You will not be able to reproduce the results unless you have all 6 books in .txt format in the root directory (with "POV-[character name]" added at each point the narrator changes). I have the files but I don't think I'm allowed to keep them publicly hosted for legal reasons.

More in-depth description (with spoilers)

Since the books are written in first person (from Darrow's perspective for books 1-3, then from several different perspectives), the POV characters were mainly referred to as "I", "me", "my", "we", etc. I manually added the text "POV-[character]" each time the narrator switched, and then automatically changed each use of first-person language into the POV character's name. Additionally, many characters had several nicknames (e.g. Darrow is "Darrow", "Reaper", "Andromedus", etc.). I manually added each nickname to a dict "name_mapping", then automatically replaced each instance of a character's actual name or nicknames with the "main" name that I assigned to that character (e.g. Darrow is "darrow-red", Virginia/Mustang is "virginia-gold", etc.).
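The nickname replacement can be illustrated with a short sketch. The entries below come from the examples in the text; the function name `normalize_names` and the regex-based approach are assumptions for illustration, and the real `name_mapping` covers far more characters.

```python
import re

# A few entries in the spirit of the name_mapping dict described above.
name_mapping = {
    "Darrow": "darrow-red",
    "Reaper": "darrow-red",
    "Andromedus": "darrow-red",
    "Virginia": "virginia-gold",
    "Mustang": "virginia-gold",
}


def normalize_names(text, mapping):
    """Replace every known name or nickname with its canonical name."""
    # Try longer names first so a short nickname cannot pre-empt a longer one.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b")
    # A single pass means already-substituted text is never rewritten again.
    return pattern.sub(lambda match: mapping[match.group(1)], text)


cleaned = normalize_names("Reaper and Mustang met Darrow.", name_mapping)
```

Matching on word boundaries keeps a name from being replaced inside a longer word, which matters for short nicknames.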

I initially tried using NLTK to identify characters, but this caused problems: many characters were repeated under different names, and many nouns were identified as characters (e.g. a SlingBlade). Two characters also share the name Pax. To simplify things, I ensured that all instances of Pax in books 1-3 refer to the first, pax_telemanus-gold, and those in books 4-6 refer to the second, pax_augustus-gold. Manually defining this naming convention for all ~275 characters in the series (according to RR Wiki) was time-consuming, but it was the best way to ensure that the social networks were accurate.

One other interesting note is that the social networks for books 4-6 are significantly less dense than those for books 1-3. Having 5 POV characters (Darrow, Ephraim, Lyria, Lysander, and Virginia) instead of 1 resulted in each POV character having a cluster of side characters around them, many of whom were not connected to any other main characters. Some characters transcended this, including Sevro, Volga, Victra, Pax (Augustus), Atalantia, and Cassius, who were all connected to at least 2 of the 5 POV characters. This is reflected in the social networks, and shows how these characters impact many different storylines.

The last problem I encountered was significant bleed between chapters: the last few sentences in one POV would often create connections to the first few sentences after a POV switch. After splitting the books at each POV switch and replacing the first-person language, I rejoined the sections with the word " x " repeated 30 times between them. This ensured that no connections were created across chapters.
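A rough sketch of the split, replace, and rejoin steps is below. It assumes each marker looks like "POV-" followed by the character's main name with no spaces (e.g. "POV-darrow-red"), which may not match the actual files, and it uses only the four pronouns listed earlier. The names `POV_MARKER`, `FIRST_PERSON`, and `rejoin_with_padding` are hypothetical.

```python
import re

# Assumed marker format: "POV-" followed by a name without whitespace.
POV_MARKER = re.compile(r"POV-(\S+)")
# Only the pronouns named in the text; a fuller list would be needed in practice.
FIRST_PERSON = re.compile(r"\b(I|me|my|we)\b", re.IGNORECASE)
# 30 filler tokens, so mentions on either side end up more than 30 tokens apart.
PADDING = " x " * 30


def rejoin_with_padding(text):
    """Split at POV markers, swap in the narrator's name, and pad the joins."""
    # With one capturing group, re.split returns
    # [text_before_first_marker, name1, section1, name2, section2, ...].
    parts = POV_MARKER.split(text)
    sections = []
    for pov_name, section in zip(parts[1::2], parts[2::2]):
        # Contractions such as "I'm" are not handled in this sketch.
        sections.append(FIRST_PERSON.sub(pov_name, section))
    return PADDING.join(sections)


sample = "POV-darrow-red I ran to Sevro. POV-lyria-red My brother waited."
padded = rejoin_with_padding(sample)
```

Because the filler word "x" is not a character name, it never becomes a node, and the 30 filler tokens push the two sides of every POV switch outside the 30-token window.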

After creating the graphs, I used networkx to compute four centrality measures for each character across all 6 books: betweenness, eigenvector, degree, and closeness. I then normalized each measure by dividing by its maximum value (which belonged to Darrow for all 4 measures). Finally, I added these 4 normalized values together into a single "aggregated_centrality" dictionary, to find the characters that were most central across all 6 books. The top 20 characters, by aggregated_centrality, are shown below.
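The aggregation can be sketched as follows. This assumes a single networkx graph for the whole series and unweighted centralities with default parameters (apart from a larger `max_iter` for the eigenvector computation); the notebook may combine the books or weight the edges differently.

```python
import networkx as nx


def aggregated_centrality(graph):
    """Sum four centrality measures, each normalized by its own maximum."""
    measures = [
        nx.betweenness_centrality(graph),
        nx.eigenvector_centrality(graph, max_iter=1000),
        nx.degree_centrality(graph),
        nx.closeness_centrality(graph),
    ]
    totals = {node: 0.0 for node in graph}
    for scores in measures:
        top = max(scores.values())
        if top == 0:
            continue  # a measure that is zero everywhere adds nothing
        for node, value in scores.items():
            totals[node] += value / top
    return totals


def top_characters(totals, n=20):
    """Return the n highest-scoring (character, score) pairs."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]
```

With this normalization, a character who ranks first on all four measures scores exactly 4.00, which matches Darrow's value in the table.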

Top 20 characters by aggregated centrality

Rank	Character	Aggregated Centrality
1	Darrow	4.00
2	Virginia	1.81
3	Lysander	1.68
4	Sevro	1.39
5	Adrius	1.13
6	Cassius	1.12
7	Victra	1.06
8	Lyria	0.93
9	Fitchner	0.90
10	Aja	0.86
11	Dancer	0.85
12	Ragnar	0.83
13	Lilath	0.80
14	Roque	0.80
15	Lorn	0.78
16	Clown	0.78
17	Ephraim	0.77
18	Octavia	0.70
19	Kavax	0.68
20	Kieran	0.68

These measurements are obviously not precise or rigorous, and they are clearly biased toward characters in the earlier books: side characters in the first three books sometimes had higher centrality than even POV characters had in the later books (Ephraim had an eigenvector centrality of 0.088 in Iron Gold, which is lower than that of 21 different characters in Book 1, Red Rising). Thus, characters introduced earlier are overrepresented in the aggregated centrality. Characters like Aja, Ragnar, Roque, and Lorn are all dead by the end of book 3, yet their centrality in the earlier books and occasional mentions in the later books are enough to place them above even Ephraim, who is a major POV character for two entire books. A more rigorous analysis could determine the most central characters more accurately, but this provides a reasonable first estimate for now.

Finally, I used the Louvain Community Detection algorithm in networkx to see how accurate its partitions were. There is no ground-truth community data to compare with, but the results matched the books fairly well. For Red Rising, the algorithm was able to identify many of the major cliques. With the resolution set to 3.0, it identified the following communities:

Louvain Communities in Red Rising

{'antonia-gold', 'cipio-gold'}
{'cassandra-gold', 'titus-red', 'vixus-gold', 'pollux-gold'}
{'mickey-violet', 'modjob-brown', 'harmony-red', 'dancer-red'}
{'dio-red', 'eo-red'}
{'thistle-gold', 'lea-gold', 'quinn-gold', 'roque-gold'}
{'kieran-red', 'leanna-red'}
{'adrius-gold', 'lilath-gold'}
{'nyla-gold', 'milia-gold', 'dax-gold'}
{'pax_telemanus-gold', 'novas-gold'}
{'clown-gold', 'weed-gold', 'pebble-gold'}
{'tamara-gold', 'tactus-gold'}

Interestingly, Darrow, Cassius, and Sevro were not placed in any of these communities. This is likely because they all have extremely high centrality, and the algorithm could not move any of them into a community without significantly decreasing the modularity of the partition. Tuning the threshold, or lowering the resolution (in networkx, values above 1 favor smaller communities), may produce communities that include them.
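For reference, a minimal sketch of the Louvain step is below. It assumes a networkx version that provides `louvain_communities` (2.8 or later) and edge weights stored under "weight"; the `min_size` filter and the fixed `seed` are assumptions added for illustration. Since Louvain assigns every node to exactly one community, characters missing from a list like the one above have presumably ended up in single-member communities that were left out.

```python
import networkx as nx


def louvain_groups(graph, resolution=3.0, min_size=2, seed=0):
    """Run Louvain and keep only communities with at least min_size members."""
    communities = nx.community.louvain_communities(
        graph,
        weight="weight",        # missing weights default to 1
        resolution=resolution,  # values above 1 favor smaller communities
        seed=seed,              # fixed seed so repeated runs agree
    )
    # Every node belongs to exactly one community; drop the singletons.
    return [group for group in communities if len(group) >= min_size]
```

Rerunning with a few different `resolution` values is a quick way to see how sensitive the groupings are to that parameter.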

However, for the latter half of the series, even with the same resolution and threshold, the algorithm returned significantly larger communities, which closely matched the books. In Iron Gold, the following community partitions were returned with a resolution of 3.0:

Louvain Communities in Iron Gold

{'alexandar-gold', 'winkle-green'}
{'cassius-gold', 'lysander-gold', 'gaia-gold', 'dido-gold', 'pandora-gold', 'revus-gold', 'seraphina-gold', 'goroth-obsidian', 'romulus-gold', 'diomedes-gold', 'pytha-blue'}
{'darrow-red', 'colloway-blue', 'tharsus-gold', 'sevro-gold', 'publius-copper', 'virginia-gold', 'victra-gold', 'thraxa-gold', 'julia-gold', 'apollonius-gold', 'cedric-copper', 'kieran-red', 'tongueless-obsidian', 'rhonna-red', 'dan-gray', 'sefi-obsidian', 'orion-blue'}
{'pax_augustus-gold', 'electra-gold'}
{'ephraim-gray', 'cyra-green', 'holiday-gray', 'volga-obsidian', 'gorgo-obsidian', 'kobachi-green', 'oslo-white', 'dano-red', 'trigg-gray'}
{'hjornir-obsidian', 'marius-gold'}
{'brutus-gold', 'octavia-gold', 'lorn-gold', 'silenius-gold'}
{'daxo-gold', 'barlow-red', 'kavax-gold', 'dancer-red', 'liam-red', 'tiran-red', 'theodora-pink', 'lyria-red', 'liago-yellow', 'niobe-gold'}
{'magnus-gold', 'atalantia-gold'}
{'aja-gold', 'ragnar-obsidian', 'lilath-gold', 'wulfgar-obsidian', 'fitchner-gold', 'roque-gold', 'adrius-gold', 'atlas-gold'}
{'pebble-gold', 'clown-gold', 'milia-gold', 'weed-gold'}

With the exception of the penultimate group (Ragnar, Lilath, Fitchner, Aja, and Atlas certainly should not be in a single community), each community closely matches a set of allies in the book. The hyperparameters could likely be optimized if a reference set of partitions/factions were created to compare against, but for now I will simply be impressed with the algorithm's performance.

All rights to the Red Rising saga and its IP belong to Pierce Brown.
My own website is available here.