RR Social Networks
Each social network was computed by counting the number of times each pair of characters appeared within
30 tokens (words) of each other in each book. The node size for each character scales logarithmically with
the number of mentions of the character, and the edge width for each pair scales the same way with the number of times the pair co-occurred. The colors of the nodes are based on
the character's "Color" in the books. The networks were created and visualized using networkx and pyvis, and all
algorithms (betweenness/eigenvector/degree/closeness centrality and Louvain community partitions) were run
through networkx. These measurements are not precise or rigorous, and they are heavily biased toward
characters in the earlier books: side characters in the first three books sometimes had higher centrality than even
POV characters had in the later books (Ephraim had an eigenvector centrality of 0.088 in Iron Gold, which is lower
than that of 21 different characters in Book 1, Red Rising). As a result, characters introduced earlier are overrepresented in the
aggregated centrality. Characters like Aja, Ragnar, Roque, and Lorn are all dead by the end of book 3, yet their centrality
in the earlier books and occasional mentions in the later books are enough to place them above even Ephraim, who is a major POV
character for two entire books. A more rigorous analysis could determine the most central characters more accurately, but this
provides a reasonable rough estimate for now.
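To make the counting step concrete, here is a minimal sketch of how a co-occurrence network like this could be built. It is not the code from all_networks.ipynb: the function name cooccurrence_graph, the attribute names size and width, the factor of 10, and the use of log1p are all assumptions for illustration. It also assumes the text has already been reduced to a list of tokens in which every character appears under a single name.

```python
import math
from collections import Counter

import networkx as nx

WINDOW = 30  # maximum token distance, taken from the description above


def cooccurrence_graph(tokens, characters, window=WINDOW):
    """Build a weighted graph of character pairs mentioned within `window` tokens."""
    # (position, name) for every character mention, in reading order.
    mentions = [(i, tok) for i, tok in enumerate(tokens) if tok in characters]
    mention_counts = Counter(name for _, name in mentions)

    pair_counts = Counter()
    for a in range(len(mentions)):
        pos_a, name_a = mentions[a]
        for b in range(a + 1, len(mentions)):
            pos_b, name_b = mentions[b]
            if pos_b - pos_a > window:
                break  # every later mention is even farther away
            if name_a != name_b:
                pair_counts[tuple(sorted((name_a, name_b)))] += 1

    graph = nx.Graph()
    for name, count in mention_counts.items():
        # Assumed scaling: log1p keeps a single mention at a nonzero size.
        graph.add_node(name, mentions=count, size=10 * math.log1p(count))
    for (u, v), count in pair_counts.items():
        graph.add_edge(u, v, weight=count, width=math.log1p(count))
    return graph
```

This version counts every pair of mentions that fall within the window, so a long conversation between two characters adds many counts to the same edge. Counting each pair at most once per window would be an equally reasonable choice and would give lower weights.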
I highly recommend reading each book before viewing its social network or reading my general description below, as they
do contain spoilers.
If you are interested, the source code for this project is available on my
GitHub. I used all_networks.ipynb to generate the social networks.
You will not be able to reproduce the results unless you have all 6 books in .txt format in the root directory (with
"POV-[character name]" added at each point the narrator changes). I have the files but I don't think I'm allowed to keep
them publicly hosted for legal reasons.
More in-depth description (with spoilers)
Since the books are written in first person (from Darrow's perspective for books 1-3, then from several different
perspectives), the POV characters were mainly referred to as "I", "me", "my", "we", etc. I manually added the text
"POV-[character]" each time the narrator switched, and then automatically changed each use of first-person language into the
POV character's name. Additionally, many characters had several nicknames (e.g. Darrow is "Darrow", "Reaper",
"Andromedus", etc.). I manually added each nickname to a dict "name_mapping", then automatically replaced each
instance of a character's actual name or nicknames with the character's "main" name that I assigned them (e.g.
Darrow is "darrow-red", Virginia/Mustang is "virginia-gold", etc.).
I initially tried using NLTK to identify characters,
but this caused problems: many characters were repeated under different names, and many nouns were identified as characters (e.g.
a SlingBlade). Two characters also share a name: Pax and Pax. To simplify things, I ensured that all instances of Pax
in books 1-3 refer to the first, pax_telemanus-gold, and those in 4-6 refer to the second, pax_augustus-gold. Manually
defining this naming convention for all ~275 characters in the series (according to the
RR Wiki) was time-consuming, but it was the best way to
ensure that the social networks were accurate.
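The name normalization described above can be sketched roughly as follows. The name_mapping entries are only the examples mentioned in the text, the FIRST_PERSON list is an assumption (the text lists "I", "me", "my", "we", etc.), and the function canonicalize and its simple lowercase tokenization are hypothetical stand-ins for whatever the notebook actually does.

```python
import re

# Excerpt only: the real dict covers roughly 275 characters.
name_mapping = {
    "darrow": "darrow-red",
    "reaper": "darrow-red",
    "andromedus": "darrow-red",
    "virginia": "virginia-gold",
    "mustang": "virginia-gold",
}

# Assumed list of first-person words to rewrite.
FIRST_PERSON = {"i", "me", "my", "mine", "myself", "we"}


def canonicalize(section_text, pov_name):
    """Return the tokens of one POV section with every name made canonical."""
    # Simplistic tokenizer: lowercase alphabetic runs, so "I'm" becomes "i", "m".
    tokens = re.findall(r"[a-z]+", section_text.lower())
    result = []
    for tok in tokens:
        if tok in FIRST_PERSON:
            result.append(pov_name)  # the narrator refers to themselves
        else:
            result.append(name_mapping.get(tok, tok))
    return result
```

For example, canonicalize("I told Mustang that the Reaper was me.", "darrow-red") turns "I", "Reaper", and "me" into "darrow-red" and "Mustang" into "virginia-gold". A single-token lookup like this cannot handle multi-word nicknames or the two characters named Pax; those would need phrase-level replacement and a per-book mapping.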
One other interesting note is that the social networks for books 4-6 are significantly less dense than those for
books 1-3. The existence of 5 POV characters (Darrow, Ephraim, Lyria, Lysander, and Virginia) instead of 1 resulted in
each POV character having a cluster of side characters around them, many of whom were not connected to any other main characters.
Some characters transcended this, including Sevro, Volga, Victra, Pax (Augustus), Atalantia, and Cassius, who were all
connected to at least 2 of the 5 POV characters. This is reflected in the social networks, and shows how these characters
impact many different storylines.
The last problem I encountered was significant bleed between chapters: the last few sentences of one POV would
often create connections to the first few sentences after a POV switch. After splitting the books at each POV switch and replacing
the first-person language, I rejoined the sections with the word " x " repeated 30 times between them. This ensured that
no connections were created across a switch, since characters on opposite sides of the filler are always more than 30 tokens apart.
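The splitting and padding steps above could look something like the sketch below. The marker is assumed to appear in the text as "POV-" followed directly by a name with no brackets or spaces (for example "POV-darrow"); the helper names split_by_pov and join_with_padding are hypothetical.

```python
import re

WINDOW = 30
PADDING = " x " * WINDOW  # 30 filler tokens between sections


def split_by_pov(book_text):
    """Split a book into (narrator, section_text) pairs at each POV marker."""
    parts = re.split(r"POV-([\w-]+)", book_text)
    # parts[0] is whatever precedes the first marker; after that the list
    # alternates between a narrator name and that narrator's section.
    return [(name, text.strip()) for name, text in zip(parts[1::2], parts[2::2])]


def join_with_padding(sections):
    """Rejoin processed sections so no token window spans two sections."""
    return PADDING.join(sections)
```

With 30 filler tokens in between, the last token of one section and the first token of the next are 31 positions apart, which is just outside a 30-token window.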
After creating the graphs, I used networkx to compute four centrality measures for each character across all
6 books: betweenness, eigenvector, degree, and closeness. I normalized each measure by dividing by its maximum value
(which was Darrow's for all 4 measures), then added the 4 normalized values together into a single "aggregated_centrality" dictionary
to find the characters that were most central across all 6 books. The highest possible score is therefore 4.00.
The top 20 characters, by aggregated_centrality, are shown below.
Top 20 characters by aggregated centrality

Rank  Character  Aggregated Centrality
1     Darrow     4.00
2     Virginia   1.81
3     Lysander   1.68
4     Sevro      1.39
5     Adrius     1.13
6     Cassius    1.12
7     Victra     1.06
8     Lyria      0.93
9     Fitchner   0.90
10    Aja        0.86
11    Dancer     0.85
12    Ragnar     0.83
13    Lilath     0.80
14    Roque      0.80
15    Lorn       0.78
16    Clown      0.78
17    Ephraim    0.77
18    Octavia    0.70
19    Kavax      0.68
20    Kieran     0.68
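As a rough illustration of how the aggregated score is put together, the sketch below computes the four networkx centrality measures, divides each by its maximum, and sums them. It assumes a single combined graph for the whole series and the default unweighted versions of each measure; the write-up does not say which settings were used, so treat both as assumptions.

```python
import networkx as nx


def aggregated_centrality(graph):
    """Sum four max-normalized centrality measures for every node."""
    measures = [
        nx.betweenness_centrality(graph),
        nx.eigenvector_centrality(graph, max_iter=1000),
        nx.degree_centrality(graph),
        nx.closeness_centrality(graph),
    ]
    totals = {node: 0.0 for node in graph}
    for scores in measures:
        top = max(scores.values())
        for node, value in scores.items():
            # Dividing by the maximum puts the top node of each measure at 1.0.
            totals[node] += value / top if top > 0 else 0.0
    return totals
```

A node that leads all four measures scores exactly 4.0, which is why Darrow sits at 4.00 in the table. Sorting the result with sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:20] would give a top-20 list.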
Finally, I used the Louvain Community Detection algorithm in networkx to see how accurate its partitions were. There is no
ground-truth community data to compare against, but the results were fairly accurate. For Red Rising, the algorithm was able to
identify many of the major cliques. When I set the resolution to 3.0, it identified the following communities:
Louvain Communities in Red Rising
Interestingly, Darrow, Cassius, and Sevro were not identified as being in any of the communities. This is likely because
they all have extremely high centrality, and the algorithm could not move any of them into a community without significantly
decreasing the modularity of the partition. (Louvain assigns every node to some community, so each of them presumably ended up in a
community of its own.) Increasing the threshold may result in more accurate communities.
However, for the latter half of the series, even with the same resolution and threshold, the algorithm returned significantly
larger communities, which matched the books very closely. In Iron Gold, the following communities were returned
with a resolution of 3.0:
Louvain Communities in Iron Gold
With the exception of the penultimate group (Ragnar, Lilath, Fitchner, Aja, and Atlas certainly should not be in a single community), each community
closely matches a set of allies in the book. The hyperparameters could likely be tuned if a reference set of factions were created to compare against,
but for now I will simply be impressed with the algorithm's performance.
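For anyone who wants to experiment with the settings mentioned above, here is a minimal sketch of a Louvain call through networkx. The resolution of 3.0 comes from the text; the threshold shown is the networkx default, and the fixed seed and the use of the "weight" edge attribute are assumptions, since the write-up does not give those values.

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities


def detect_communities(graph, resolution=3.0, threshold=1e-7, seed=0):
    """Run Louvain and return communities as sorted lists, largest first."""
    communities = louvain_communities(
        graph,
        weight="weight",        # edges without this attribute count as 1
        resolution=resolution,  # values above 1 favor smaller communities
        threshold=threshold,    # minimum modularity gain to keep iterating
        seed=seed,              # Louvain is randomized; fix the seed to repeat a run
    )
    return sorted((sorted(c) for c in communities), key=len, reverse=True)
```

Because the algorithm visits nodes in a random order, different seeds can return different partitions for the same graph, so it is worth comparing a few runs before reading much into any one grouping.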
All rights to the Red Rising saga and its IP belong to Pierce Brown.
My own website is available here.