Many cases of radicalism, especially violent radicalism, manifest within the context of inter-ethnic conflict and war. Research in this domain has significantly contributed to our overarching comprehension of the issue. This form of radicalism is inherently linked to the dynamics of group affiliation. However, our focus here is directed towards the individual motivation driving radical actions. While radical actions may encompass dynamics of group affiliation (e.g., the most radical pro-life activists being members of the Army of God), they also possess a substantial individual component. For instance, numerous attacks on abortion doctors or clinics are categorized as ‘lone-wolf terrorism’.

In this study, we sought to examine whether state-of-the-art Machine Learning technology for text analysis could effectively discern the intent behind radical actions from user-generated content. We analyzed a diverse array of radical texts published online by various activists, including anarchists, Iranian guerrilla members, Indian revolutionaries, activists from the US civil rights movement, and The Army of God. Our analysis reveals that Machine Learning models can successfully identify potential signals indicative of the author’s intent behind radical actions.

The internet assumes a significant role in facilitating violent extremism and terrorism (Conway, 2016). It serves as the most accessible and rapid means for the dissemination of extremist views. Extensive research has already been conducted in the analysis and detection of online extremism in user-generated content (Fernandez & Alani, 2021). However, pinpointing the moment when extreme views transition into extreme actions presents a notably challenging problem. The primary challenge lies in the extreme difficulty, or even impossibility, of gaining access to perpetrators. In our case, contacting the offenders proved unfeasible, given that they are either deceased or, in the case of certain members of the Army of God, incarcerated. All attempts to reach out to living convicts yielded no response. Consequently, the sole available data were the writings left in the public domain by the perpetrators.

Several examples illustrate the Internet’s role as a tool for disseminating extreme views and opinions preceding an attack. In 2009, Scott Roeder, an anti-abortion activist, murdered George Tiller, a doctor who performed late-stage abortions in the United States.1 Before the attack, Roeder expressed his views against Tiller’s work in publicly accessible writings. Another instance involving the use of the internet to express intent prior to a violent attack is the case of the Holocaust Museum shooter, James von Brunn.2 Von Brunn published his views on his Anti-Semitic website. However, numerous cases exist where individuals manifest and share extreme opinions, sometimes explicitly stating an intention for violent action but never acting on their threats. This prompts the question of whether there are non-explicit signals in user-generated content that can indicate an intent while also serving as a warning that a radical action may indeed be carried out.

According to the Dictionary of Law Enforcement (Goodier, 2008), intent is defined as “The state of mind of one who aims to bring about a particular consequence”. The concept of warning behaviours in threat assessment has been extensively studied (Reid Meloy et al., 2011), and a typology of different warning behaviours has been proposed. These include pathway, fixation, identification, novel aggression, energy burst, leakage, directly communicated threat, and last resort warning behaviours. Some of these warning behaviours have already been explored as potential signals of violent action observable on social media. It is crucial to acknowledge that extremism and radicalism are politically charged terms, with no consensus on their precise meanings. Furthermore, despite the numerous scientific studies on radicalisation, there is no universally accepted definition (Borum, 2011). For our purposes, we define radicalisation as a willingness to engage in various forms of activism, and/or acceptance of justification for the use of violence in defence of a cause, and/or readiness to use violence in defence of that cause.

The issue of automatic intent detection from user-generated content has received considerable attention, with a primary focus on applications in marketing and customer service. However, applying this work to distinguish authors posting threatening messages online from those genuinely willing and capable of carrying out violent acts could have practical implications, particularly in law enforcement. Viewing radicalisation as a spectrum of behaviours, ranging from non-violent to violent, holds significant consequences for real-world applications. As emphasised by McCauley and Moskalenko (McCauley & Moskalenko, 2017), understanding the distinction between radicals who accept violence and those who do not is crucial. Targeting non-violent radicals with radicalisation programs may inadvertently lead them towards violent radicalisation. Thus, it is paramount to demonstrate that individuals with strong opinions on a subject may not necessarily be prepared to resort to violence as a justified means in defence of a cause. Aligning with recent research on the subject, we propose differentiating between violent and non-violent radicalisation (Bonino, 2016; Gøtzsche-Astrup, 2018; Perry, 2018).

In this study, we explore whether state-of-the-art Machine Learning techniques can predict if a text document containing extreme content indicates a readiness for extremist violence. To accomplish this, we assembled a dataset of documents to assist in discerning texts written by individuals ready to commit violent acts in defence of a cause from those who endorse violence as a strategy but do not engage in violent actions themselves.

Research on online extremism has progressed rapidly over the past decade (Fernandez & Alani, 2021). The primary focus of this research has been the development of computational techniques for detecting online extremism (Gaikwad et al., 2021). Social media platforms have often served as the data source, with studies exploring forums, blogs, websites, and magazines. Common trends involve extracting semantic and linguistic features from text content, using them as representations to train Machine Learning (ML) models. These trained models are then applied to predict whether new, unseen text, not used in the training process, contains extreme content or not. A related area of interest is the study of hate speech (Al-Hassan & Al-Dossari, 2019; Fortuna & Nunes, 2018), with a majority of work focused on the automatic detection of hate speech in online user-generated content. Similar approaches to extremism detection involve feature extraction from text content and the application of various ML models.

Intent detection, on the other hand, has primarily been studied in commercial contexts (Chen et al., 2017; Dai et al., 2006; Gupta et al., 2014; Hollerit et al., 2013; Kim et al., 2016; Wang et al., 2015). This research aims to detect the intentions of potential customers from user-generated content, such as identifying the intention to buy in discussion forums (Chen et al., 2017), detecting commercial intentions in users’ search queries (Dai et al., 2006), and predicting the intention of buying a product in social posts (Gupta et al., 2014; Hollerit et al., 2013; Wang et al., 2015). An interesting exploration by Nobles et al. (Nobles et al., 2018) investigated whether the daily content of text messages could be used to predict risk factors for fatal suicide.

While the detection of violent action intentions from user-generated content has seen limited exploration, Cohen et al. (Cohen et al., 2013) delved into tools for detecting weak signals of radical violence in social media, identifying linguistic markers for radical violence. Their focus was on warning behaviours, including leakage, fixation, and identification, as defined by Meloy et al. (Reid Meloy et al., 2011). Techniques recommended involved extracting terms from textual content and matching them with predefined vocabularies, such as predefined word lists of violent actions. A similar problem was examined by Kaati et al. (Kaati et al., 2016), where the authors studied the written communication of ten different lone offenders before their engagement in violent attacks. They identified eight possible indicators present in textual content produced by lone offenders before their involvement in targeted violence. Simons and Skillicorn (Simons & Skillicorn, 2020) proposed addressing both intent and abuse detection simultaneously, considering textual content containing signals of intent and abuse as an indication of extremist violence. They constructed two models, one for detecting abuse and one for inferring intent, and combined them to obtain the final model for identifying signals of violence.

In our study, we delved into the application of ML to the task of detecting radical intent from textual content. To achieve this, we opted to apply well-known ML models and text vectorization techniques. Before detailing the proposed methodology in Section 3.2, we offer a high-level overview of the text classification problem.

3.1. Background

Text classification is a well-established task in Natural Language Processing, involving the development of algorithms to assign free-text documents to predefined categories. ML has proven successful in various text classification tasks, including spam filtering, sentiment analysis, and topic classification. ML algorithms cannot directly process raw text; therefore, the text needs to be converted into vectors (often referred to as embeddings) of real numbers that ML algorithms can handle. Numerous text representation techniques exist, ranging from naive binary term occurrence features to advanced context-aware feature representations. The performance of these techniques may vary depending on the specific task. In this study, we employ one frequency-based method called TF-IDF (Term Frequency–Inverse Document Frequency) (Aizawa, 2003) and a more recently developed language model, BERT (Bidirectional Encoder Representations from Transformers) (Devlin et al., 2018). We provide a brief explanation of each of these two techniques below.

TF-IDF (Term Frequency–Inverse Document Frequency) is a statistical measure used to calculate the relevance of a word to a specific document within a corpus. The key idea is to increase the importance of a word proportionally to the number of times it appears in the document while compensating for its frequency in the corpus. TF-IDF comprises a product of two statistics: term frequency (TF) and inverse document frequency (IDF). TF measures the relevance of a term t for document d and is calculated as:

$$\mathrm{TF}(t,d) = \frac{\text{number of times term } t \text{ appears in document } d}{\text{total number of terms in document } d} \tag{1}$$

IDF measures how relevant the word is to the entire corpus and it is calculated with the following formula:

$$\mathrm{IDF}(t) = \log \frac{\text{total number of documents in the corpus}}{\text{number of documents with term } t} \tag{2}$$

The TF-IDF score is computed by multiplying the term frequency by the inverse document frequency. Each document in the corpus is represented by a TF-IDF vector whose length equals the number of distinct terms (the vocabulary) across all documents. Each position within the vector holds the TF-IDF weight calculated for a single term.
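As an illustration only, Equations (1) and (2) can be sketched in plain Python. The function name `tf_idf_vectors`, the whitespace tokenisation, the natural logarithm, and the three toy sentences are assumptions made for this example and are not taken from the study. Library implementations such as sklearn's TfidfVectorizer apply a smoothed IDF and vector normalisation by default, so their values will differ from this direct transcription.

```python
import math
from collections import Counter


def tf_idf_vectors(tokenised_docs):
    """Return (vocabulary, vectors) using TF and IDF as in Equations (1) and (2).

    Assumes every document contains at least one term.
    """
    # One vector position per distinct term in the corpus.
    vocabulary = sorted({term for doc in tokenised_docs for term in doc})
    n_docs = len(tokenised_docs)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in tokenised_docs for term in set(doc))
    idf = {term: math.log(n_docs / doc_freq[term]) for term in vocabulary}

    vectors = []
    for doc in tokenised_docs:
        counts = Counter(doc)
        total = len(doc)
        vectors.append([counts[term] / total * idf[term] for term in vocabulary])
    return vocabulary, vectors


# Toy corpus (hypothetical sentences, whitespace-tokenised).
docs = [
    "the cause demands action".split(),
    "the cause demands patience".split(),
    "the people demand change".split(),
]
vocabulary, vectors = tf_idf_vectors(docs)
```

In this toy corpus the term "the" occurs in every document, so its IDF is log(1) = 0 and it contributes nothing to any vector, which is the compensating effect described above.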

BERT (Bidirectional Encoder Representations from Transformers) represents a language model grounded in artificial neural networks designed to acquire real-valued vector representations of text through corpus analysis. The model’s overarching goal is to forge a vector space wherein words or sentences with akin semantic meanings exhibit closer proximity (e.g., Euclidean distance) than those lacking semantic association. For example, words such as “dog” and “cat” should be markedly closer than pairings like “dog” and “apple” or “cat” and “table.” Vectors evolve based on word co-occurrence in analogous contexts, signifying that if two words share similar contextual usage, their representations should align. Trained on Wikipedia and Book Corpus, encompassing 10,000+ books across diverse genres, BERT is freely available for download and implementation.

Figure 1 illustrates a text classification pipeline employing an ML and text vectorization approach. Initially, all documents from the training dataset undergo conversion into numerical vectors using a selected vectorization method, such as TF-IDF or BERT. These vectors are subsequently utilised with an ML method to train a classification model. During this stage, any ML method suitable for classification tasks can be applied (e.g., Decision Tree, Artificial Neural Network, etc.). In the evaluation phase, documents from the test dataset (unseen examples) must be converted into their numerical representation using the same vectorization technique. The resulting numerical vectors are then input into the classification model to obtain the final predictions.

Figure 1.
Text classification pipeline.

3.2. Radical Documents Classification

This study aims to investigate whether the identification of radical action intent from textual content can be formulated and implemented as a text classification task. Specifically, our goal is to develop an ML-based solution capable of classifying a text document into categories of either intent or no intent. To achieve this, we have implemented the pipeline illustrated in Figure 1. TF-IDF and the BERT language model were employed to convert each document into a numerical vector. Due to the BERT model’s input length limitation (512 tokens), we divided each document into chunks of up to 512 words. After obtaining vectors for all chunks from BERT, the final document representation was calculated as their average. Additionally, we explored an alternative scenario where, instead of averaging the chunk vectors, each chunk is individually classified as intent or no intent, and the final prediction for the entire document is derived based on the individual predictions for all chunks.

The detailed pipeline of the implemented approach when using document embeddings is demonstrated in Algorithm 1. As the input we provided a corpus of text documents (D), each labelled (l) as intent or no intent, a selected vectorisation technique (i.e. BERT or TF-IDF) and an ML algorithm. In the training process of the classification model (M), each document (di) from the corpus D is first converted into its vector representation (embedding). When utilising the BERT vectorization method, each document is partitioned into chunks of length 512 words or less. For instance, a document labelled as intent with a length of 1012 words is divided into two chunks: one of length 512 and one of length 500. All chunks are converted into numerical vectors using the pre-trained BERT model, and their average results in the single vector representation of the entire document. Subsequently, an ML model for predicting the intent of radical action is trained using the vector representation of the training dataset. During classification of a new document, the vector representation is obtained using the same vectorization method as during training. The document embedding is then input into the ML model to obtain the final prediction, indicating whether a document represents an intent of radical action or not.
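The chunk-and-average step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `split_into_chunks`, `average_vectors` and `document_embedding` are names chosen for this example, and `toy_embed` is a hypothetical stand-in for the pre-trained BERT encoder that only exists so the sketch runs without external dependencies.

```python
def split_into_chunks(words, max_len=512):
    """Split a list of words into consecutive chunks of at most max_len words."""
    return [words[i:i + max_len] for i in range(0, len(words), max_len)]


def average_vectors(vectors):
    """Element-wise mean of equally sized vectors (assumes at least one vector)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def toy_embed(chunk, dim=4):
    """Hypothetical stand-in for a BERT encoder; NOT a meaningful embedding.

    It histograms word lengths modulo dim, purely to return a fixed-size vector.
    """
    vec = [0.0] * dim
    for word in chunk:
        vec[len(word) % dim] += 1.0
    return [v / len(chunk) for v in vec]


def document_embedding(text, embed=toy_embed, max_len=512):
    """Embed each chunk of a non-empty document and average the chunk vectors."""
    chunks = split_into_chunks(text.split(), max_len)
    return average_vectors([embed(chunk) for chunk in chunks])


# The 1012-word example from the text yields chunks of 512 and 500 words.
words = ["word"] * 1012
chunk_lengths = [len(chunk) for chunk in split_into_chunks(words)]
doc_vector = document_embedding(" ".join(words))
```

Note that a plain average weights the 500-word chunk as heavily as the 512-word one; whether the original pipeline weights chunks by length is not stated, so the unweighted mean here is an assumption.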

Algorithm 2 presents the pipeline when the classification of a document is performed based on the vector representation of its parts (chunks), rather than a single document embedding. In the training process of the classification model (M), each document (di) from the corpus D is divided into chunks and each chunk is assigned the same label as the document. For example, for a document labelled as intent divided into two chunks, each of the chunks is labelled as intent and is added to the training set (T). This process is repeated for each document in the training set. Following this, the pre-trained BERT model is employed to obtain embedding vectors for all the chunks from all documents. The chunks’ embeddings, together with their assigned class labels, are used as the training dataset to train a classification model. As the output, we obtain a classification model M for performing the task of radical action intent prediction. When making predictions for a new document, the document is first divided into chunks, and each chunk is fed into the BERT model. The chunks’ embeddings are passed to the classification model (M) to obtain their predictions (i.e., whether the chunk of text contains intent of radical action or not). All the predictions are then combined via majority voting. This means that if the majority of the chunks are classified as intent, then the whole document is also classified as intent, and vice versa.
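The majority-voting step that turns chunk-level predictions into a document-level label can be sketched as below. The function name `majority_vote` and the label strings are illustrative. The text does not say how ties are resolved, so the `tie_label` fallback is an assumption of this sketch rather than part of the described method.

```python
from collections import Counter


def majority_vote(chunk_predictions, tie_label="no intent"):
    """Return the most frequent chunk label for a document.

    tie_label is an assumed fallback for an exact tie; the source does not
    specify a tie-breaking rule.
    """
    if not chunk_predictions:
        raise ValueError("at least one chunk prediction is required")
    ranked = Counter(chunk_predictions).most_common()
    # An exact tie between the two most frequent labels falls back to tie_label.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return tie_label
    return ranked[0][0]


# Three chunks, two of them classified as intent.
document_label = majority_vote(["intent", "no intent", "intent"])
```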

Algorithm 1.
Constructing a predictive model for detecting intent of radical action based on document embedding
Algorithm 2.
Constructing a predictive model for detecting intent of radical action based on document’s chunks embeddings

In this section, we delve into the details of the evaluation process conducted to address our research question. We discuss the dataset employed in the studies, perform some preliminary analyses, and scrutinize the results obtained through our implemented pipeline.

4.1. Data description

In earlier sections, we emphasised the importance of investigating individuals who may declare themselves as accepting violent means in defence of a cause. Equally crucial is the inclusion of participants who have already committed violent acts, accounting for violent radicalisation. To ensure a representative and diverse content collection, we gathered writings from various times, places, and people. The dataset includes online posts from authors who have either murdered or attempted to murder their political opponents in the name of a cause, as well as individuals expressing extreme views but refraining from violent actions. The texts are sourced from diverse members, encompassing the Army of God organisation, various anarchists, Iranian guerilla members, Indian revolutionaries, activists from the US civil rights movement (including Black Panthers members), and the militant wing of the US anti-abortion movement. We categorised the authors into two groups: offenders (those who committed violent acts) and non-offenders (those who endorse violence but did not engage in violent actions themselves).

The offenders’ texts comprise writings from individuals such as Shelley Shannon, Paul Jennings Hill, Michael Bray, John Brockhoeft, Eric Rudolph, Dennis Malvasi, Clayton Waagner, Ravachol, Émile Henry, Nestor Makhno, Vera Zasulich, Huey Percy Newton, Bhagat Singh, and Hamid Ashraf. Notably, this group includes individuals with a history of violence, such as bombings, arson, and assassinations. Clayton Waagner, for instance, orchestrated the anthrax hoax, sending envelopes with white powder to abortion clinics. Dennis Malvasi was involved in bombing abortion clinics, and Shelley Shannon, known for acid attacks and a bombing on an abortion clinic, attempted the murder of abortion doctor George Tiller. Eric Rudolph conducted a series of bombings that killed abortion clinic personnel among other victims. Other notable figures include anarchists Ravachol and Émile Henry, Ukrainian revolutionary Nestor Makhno, and Russian revolutionary Vera Zasulich.

The non-offenders group includes individuals like Rick Ellis, Cathy Ramey, Chuck Spingola, Bruce Evan Murch, David Trosch, Dan Holman, Randall Terry, Amir Parviz Pouyan, Masoud Ahmadzadeh Heravi, Martin Abern, Fredrick Allen Hampton Sr., Malcolm X, Michael “Cetewayo” Tabor, Stokely Carmichael, Vsevolod Mikhailovich Eikhenbaum, and Emma Goldman. Some of these individuals are not high-profile murderers or attackers, resulting in limited publicly available information. Others, like Malcolm X, Stokely Carmichael, and Emma Goldman, are well-known revolutionaries but have not committed violent acts. Table 1 provides an overview of the data used in our experimental evaluation.

Table 1.
Documents used in the study (70 in total, written by 30 authors).
Intent  No Intent  No. of authors
Army of God 14 
Anarchists 14 16 
Iranian guerilla 
Indian revolutionaries 
US civil rights 
Total 36 34 30 

4.2. Preliminary Analysis

As a preliminary analysis, we divided our dataset into two sets, one containing documents labelled as intent and the other as no intent. We initially calculated the average length of the documents, i.e., the average number of words. Notably, documents labelled as intent are, on average, twice as long (6933 words) as those in the no intent category (3324 words). Subsequently, we generated word clouds for each set, providing a visual representation of the text data to identify keywords. The word clouds for the intent and no intent sets are illustrated in Figure 2, with font size indicating the importance of each word for the respective document group.

Observing the visuals in Figure 2, it is notable that notions of God prominently appear in the writings of the intent cohort. This follows Atran’s argument that religion can act either as a buffer or a booster of radicalisation and conflict. For the no intent group, words such as “people,” “life,” and “struggle” appear more frequently. Some words appear in both groups, but they either seem neutral (“one”, “will”) or are expected for the topic (“abortion”). A more in-depth analysis of the lexical differences between the intent and no intent categories is provided later in the paper.

Figure 2.
Word clouds generated for the intent (right) and no intent (left) cohorts.

Finally, we visualized the two sets of documents (i.e., intent vs. no intent) using the generated embedding vectors. The visualization was conducted in two ways: one utilizing the embeddings of the chunks and the other using the embeddings constructed for the entire documents. Document embeddings were obtained by averaging all chunk vectors for each document, and the embedding vectors were derived from the BERT pre-trained language model. To visualize the vectors in a 2-dimensional space, we applied the UMAP dimensionality reduction method (McInnes et al., 2018). The output visualizations are presented in Figure 3.

Figure 3.
Visualisation of documents and chunks BERT vectors for intent (grey) and no intent (black) with the UMAP dimensionality reduction method.

From Figure 3, which visualises the document embedding vectors for the two groups, it is apparent that instances from the intent and no intent groups are not easily separable. This suggests that identifying patterns within the data may pose a challenging task for the ML models. Notably, the intent documents appear more centrally located, while the no intent documents are somewhat more spread out. This distribution could be influenced by the construction of document embeddings, where the averaging of chunks may result in the loss of important information, rendering the embeddings less representative.

Figure 3a illustrates how the chunks from the two groups of documents spread in the 2-dimensional space. Notably, the number of chunks from the intent documents is considerably larger than the number of chunks from the no intent documents. This discrepancy arises because documents labelled as intent are generally longer, resulting in a greater number of chunks. The visual representation indicates that the two groups are not distinctly separable in the 2-dimensional space. However, a subgroup of intent documents appears well-separated from the rest of the corpus. This observation may suggest that while there are stylistic and semantic differences between the two groups of documents, some parts of them exhibit similarities. It could be hypothesised, for instance, that the beginnings and endings of intent and no intent documents are written in a similar manner, while the content and expression in the middle vary between the two categories.

4.3. Implementation details

The framework employed in this study was implemented in Python using TensorFlow 2.8 within Colab notebooks (Bisong, 2019). The BERT pre-trained model was sourced from the TensorFlow Hub repository, along with its corresponding pre-processing model, enabling the internal conversion of raw text input into a fixed-length input sequence for the BERT encoder. Each text input was converted into a 768-dimensional vector. For TF-IDF vectorization, the sklearn TfidfVectorizer class was utilised. As part of the vectorization process, all characters were converted to lowercase, and the text was tokenized into words. After removing English stop words, the IDF and TF metrics were calculated for each word across all documents.

Five distinct ML models were employed: Support Vector Machine (SVM), Logistic Regression (LR), Multilayer Perceptron (MLP), AdaBoost (AB), and Random Forest (RF). The first experiment involved generating a single embedding for each document using TF-IDF and BERT (by averaging embeddings of document chunks) and training the ML models on these document embeddings. In the second experiment, the aim was to assess the accuracy of predicting the label of an entire document based on predictions made for its individual chunks. Each chunk’s embedding was assigned the same label as the document. During training, ML models were trained on the embeddings of individual chunks, predicting the label of a single chunk. In the evaluation phase, a new unseen document was split into chunks, and the pre-trained BERT model was applied to obtain embeddings for all of the chunks. Each chunk was then classified by an ML model, and predictions for all chunks were combined using the majority voting technique (Dong et al., 2019).

For evaluation purposes in both approaches, the leave-one-out cross-validation technique was employed. In this technique, the test dataset consists of only one instance, and the model is trained on all remaining instances. This process is repeated for the same number of times as there are instances in the dataset. The hyperparameters of each model were tuned using the grid search algorithm. To facilitate comparison of results across all models in both studies, F1 and Area Under the Curve scores were calculated and reported.
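The leave-one-out procedure can be sketched as a simple loop. This is an illustration under stated assumptions: the nearest-centroid classifier (`fit_centroids`, `predict_nearest`) is a stand-in chosen only to keep the example dependency-free and is not one of the five models used in the study, the six 2-dimensional vectors are made-up data, and hyperparameter tuning is omitted.

```python
def leave_one_out_predictions(vectors, labels, fit, predict):
    """Hold out each instance once, train on the rest, and predict the held-out one."""
    predictions = []
    for i in range(len(vectors)):
        train_x = vectors[:i] + vectors[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        predictions.append(predict(model, vectors[i]))
    return predictions


def fit_centroids(xs, ys):
    """Stand-in classifier (assumption): store the mean vector of each class."""
    centroids = {}
    for label in set(ys):
        members = [x for x, y in zip(xs, ys) if y == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict_nearest(centroids, x):
    """Predict the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Made-up, well-separated toy vectors: three per class.
vectors = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
labels = ["no intent"] * 3 + ["intent"] * 3
predictions = leave_one_out_predictions(vectors, labels, fit_centroids, predict_nearest)
```

Because each fold contains a single test instance, per-fold F1 is not meaningful; the held-out predictions are pooled across folds and scored once at the end.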

4.4. Results evaluation and comparison

In this section, we present and discuss the results obtained from our experimental evaluation. We examine the outcomes for document classification using the two approaches considered in this study: classifying documents represented by TF-IDF or BERT embeddings and documents represented as a list of BERT chunks. The corresponding results are displayed in Tables 3, 4, and 5. For both scenarios, five different classification models were employed. The results are reported in terms of macro F1 score and the Area Under the Curve (AUC). We also report confusion matrices in the format presented in Table 2.

Table 2.
Confusion Matrix.
 | Predicted NoIntent | Predicted Intent
NoIntent | no. of instances correctly classified as NoIntent | no. of instances from NoIntent classified as Intent
Intent | no. of instances from Intent classified as NoIntent | no. of instances correctly classified as Intent
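To make the link between the confusion matrix layout and the macro F1 score concrete, the sketch below computes macro F1 from the four cells, treating Intent as the positive class. The function name `macro_f1` is illustrative, and the counts are the SVM cells of Table 3 as read from the flattened table (26, 8 in the NoIntent row; 9, 27 in the Intent row). That the authors computed the score in exactly this way is an assumption.

```python
def macro_f1(tn, fp, fn, tp):
    """Macro F1 for a binary task: the unweighted mean of the per-class F1 scores.

    Cells follow the confusion matrix layout, with Intent as the positive class:
    tn = NoIntent correctly classified, fp = NoIntent classified as Intent,
    fn = Intent classified as NoIntent, tp = Intent correctly classified.
    Assumes each class has at least one actual or predicted instance.
    """
    f1_intent = 2 * tp / (2 * tp + fp + fn)
    # For the NoIntent class the roles of the cells are swapped.
    f1_no_intent = 2 * tn / (2 * tn + fn + fp)
    return (f1_intent + f1_no_intent) / 2


# SVM with TF-IDF, cell values as read from Table 3.
score = macro_f1(tn=26, fp=8, fn=9, tp=27)
```

With these counts the per-class scores are 54/71 for Intent and 52/69 for NoIntent, giving a macro F1 of about 0.757, which is close to the 0.75 reported for this model.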

Comparing Tables 3 and 4, it is apparent that TF-IDF embeddings tend to outperform BERT embeddings for the majority of the ML methods. In some cases, such as SVM (0.75 vs. 0.59 F1 score) or AdaBoost (0.53 vs. 0.45 F1 score), the difference in both F1 and AUC scores is rather high. This finding is somewhat surprising, considering the contextual understanding of language by the BERT model, which has demonstrated state-of-the-art performance in numerous NLP tasks. One potential explanation could be the length of the documents used in our experiments. Due to the input length restrictions of the BERT model, the documents had to be split into chunks, and their embeddings were then averaged, possibly resulting in some context loss. Additionally, it is worth noting that we employed the pre-trained BERT model, which was trained on a generic dataset. Fine-tuning the model specifically for the radical text analysis task could significantly enhance its performance. Unfortunately, in our case, the dataset was not sufficient for this purpose.

Table 3.
Results obtained by different ML models with TF-IDF document embeddings.
 | SVM | LR | AB | RF | MLP
Conf. Matrix (NoIntent row) | 26 8 | 24 10 | 17 17 | 23 11 | 28 6
Conf. Matrix (Intent row) | 9 27 | 11 25 | 15 21 | 10 26 | 14 22
F1 | 0.75 | 0.69 | 0.53 | 0.69 | 0.7
AUC | 0.78 | 0.77 | 0.63 | 0.75 | 0.77

Table 4.
Results obtained by different ML models with BERT document embeddings.
 | SVM | LR | AB | RF | MLP
Conf. Matrix (NoIntent row) | 25 9 | 27 7 | 16 18 | 23 11 | 27 7
Conf. Matrix (Intent row) | 19 17 | 13 23 | 20 16 | 13 23 | 14 22
F1 | 0.59 | 0.7 | 0.45 | 0.65 | 0.69
AUC | 0.64 | 0.74 | 0.48 | 0.73 | 0.76

When comparing all five ML models, it is evident that they perform at a similar level in terms of F1 score and the AUC. AdaBoost was the only exception, achieving much lower F1 and AUC scores for both TF-IDF and BERT embeddings. Overall, SVM with TF-IDF outperformed all other models, achieving the highest F1 and AUC scores.

In the second experiment, all documents were represented as a list of BERT embeddings, calculated for their different parts (chunks). The ML models were trained to predict the labels of individual chunks. A testing sample was first divided into chunks, which were then classified into one of the two classes. Using the majority voting rule, the final decision was obtained for the testing sample. The results are presented in Table 5. When comparing the two document classification approaches, using single document embedding tends to perform much better than splitting documents into chunks. For both classes (i.e., intent and no intent), the first approach notably outperformed the second one in terms of F1 and AUC. This may indicate that the signal is hidden throughout the text, and looking only at small parts of the document is not sufficient to determine whether it indicates an intent of violent action. The majority voting technique makes the prediction based on the class labels assigned to individual chunks of text, hence it relies on the accuracy of the chunk classifier. Looking at the results, we can see that all of the ML models obtained a high false positive rate (i.e., no intent classified as intent). From the preliminary analysis of the data, we know that the documents belonging to the intent class are on average much longer. This means our dataset contains many more chunks labelled as intent, leading to an imbalanced class distribution. This can explain the fact that each of the models performs much better in predicting the samples from the intent class.

Table 5.
Results obtained by different ML models with documents represented as a list of chunks represented by BERT embeddings.
SVM LR AB RF MLP
Conf.Matrix [8 26; 5 31] [11 23; 4 32] [7 27; 4 32] [8 26; 5 31] [21 13; 15 21]
F1 0.49 0.56 0.48 0.49 0.58
AUC 0.55 0.59 0.53 0.6 0.6

4.5. Model Interpretation

In the aftermath of our experimental evaluation, it became evident that the TF-IDF representation of the data yielded the most promising results in the intent vs. no intent classification task. TF-IDF assigns weights to terms based on their frequency in a document relative to the entire dataset, which facilitates the identification of terms that are crucial for distinguishing documents. The higher the TF-IDF score, the more important the term is in distinguishing that document from others, which lends interpretability to the model’s predictions.
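As a rough illustration of how such weights arise, the sketch below computes a textbook TF-IDF variant (term frequency multiplied by the logarithm of the inverse document frequency). The exact variant used in the study is not stated, and library implementations typically add smoothing and normalisation, so this is an assumption rather than a description of the authors' pipeline. The toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(tokenized_docs):
    """Return one {term: weight} dict per document.

    Uses tf = count / document length and idf = log(N / df).
    """
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in tokenized_docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, already stemmed toy documents.
docs = [
    ["defend", "unborn", "law"],
    ["church", "bishop", "law"],
    ["polit", "mass", "law"],
]
scores = tf_idf(docs)
print(scores[0])
```

In this toy corpus "law" occurs in every document, so its weight is zero, while a term confined to one document receives a positive weight.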

To gain insights into the influential features contributing to the classification of documents into either intent or no intent categories, we conducted an analysis of terms with the highest TF-IDF scores for each class. Table 6 presents the top 50 words for both classes, sorted by their TF-IDF scores. Additionally, Table 7 compiles words that were unique to each category. The examination of these key terms offers valuable perspectives on the distinctive linguistic features characterizing intent and no intent documents, thereby enhancing our understanding of the model’s decision-making process.
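One plausible way to derive per-class word lists of this kind is to rank terms by an aggregated score for each class and then keep the terms that appear in only one list. The text does not specify how scores were aggregated per class or how the unique words were selected, so the helper names and the scores below are hypothetical.

```python
def top_terms(term_scores, k):
    """Return the k highest-scoring terms, best first (ties broken alphabetically)."""
    ranked = sorted(term_scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


def unique_terms(own_terms, other_terms):
    """Keep the terms of own_terms that do not occur in other_terms, preserving order."""
    other = set(other_terms)
    return [term for term in own_terms if term not in other]


# Hypothetical aggregated scores; the real values are not given in the text.
intent_scores = {"abort": 0.9, "defend": 0.7, "god": 0.6, "kill": 0.5}
no_intent_scores = {"polit": 0.9, "god": 0.6, "kill": 0.4, "bishop": 0.3}

intent_top = top_terms(intent_scores, k=3)
no_intent_top = top_terms(no_intent_scores, k=3)
print(unique_terms(intent_top, no_intent_top))
print(unique_terms(no_intent_top, intent_top))
```

The outcome depends on the cut-off k: a word counts as unique here only relative to the other class's top-k list, not relative to its whole vocabulary.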

Table 6.
Top 50 words ranked based on their TF-IDF values for intent or no intent documents.
Intent: would, one, abort, time, right, could, back, day, law, god, life, two, tear, govern, kill, make, even, thing, first, believe, defend, murder, way, said, take, never, babi, unborn, new, around, duty, come, good, place, must, need, man, mean, say, work, children, see, may, go, state, human, still, also, know, moral, protect
No Intent: polit, mass, one, condit, must, life, say, know, would, work, even, exist, form, time, action, social, right, state, make, also, mean, man, kill, come, human, take, god, develop, new, without, truth, go, year, children, fact, hand, question, govern, thing, see, word, practic, murder, societi, bishop, believ, world, death, way, church
Table 7.
Words present only in intent or no intent documents.
Intent: back, two, defend, unborn, around, still, abortionist, morel, left, book, legal, lord, follow, camp, front, open, return, sever, old, reason, found, ask, justice
No Intent: mass, exist, form, develop, fact, question, bishop, understand, present, give, support, show, import, love, act, whole, name, end, object, begin, let, direct, accept, duty

A close examination of texts from the intent category leaves a distinct impression that these writings serve as a fervent call to arms against perceived injustice. The authors cast themselves as crusaders, solitary defenders of the vulnerable, viewing their actions as a moral imperative in a world where authorities and the legal system fall short in safeguarding what, or whom, they should protect. While similar sentiments are present in the no intent writings, they are notably less pronounced. This contrast is reflected in the lists of words with assigned TF-IDF scores, which indicate the importance of each word in the corpus. For instance, the word stem “polit” holds the top position for the no intent class in Table 6.

Words such as “law”, “govern”, “duty”, “defend”, or “protect” are prevalent in intent documents, a pattern not mirrored in the texts of the no intent class. Furthermore, words like “defend”, “legal”, or “justice” carry substantial weights in the intent list and near-zero weights in the no intent list. An intriguing finding is the heightened expression of emotions related to care among perpetrators, who often employ the term “unborn”, which is conspicuously absent from the other category of documents. Another cluster of words pertains to religion, aligning with Scott Atran’s argument that religion can act as either a buffer or a booster of radicalization and conflict. In this context, a distinctive pattern emerges: authors of intent documents articulate their views on religion more personally, frequently using words like “god” or “believe”, whereas writers of texts with no harmful intentions predominantly discuss more institutional aspects of organized cults, frequently using “church” and almost exclusively using “bishop” (which has near-zero frequency in perpetrators’ writings).

In summary, the writings of perpetrators exhibit heightened emotions and passion, evident in their frequent use of words like “murder” or “kill,” while no intent texts maintain a more balanced, unemotional, and reserved tone. This observation prompts further exploration in future studies, as the emotional disparity, not immediately apparent in casual reading, becomes evident through analysis. It is reasonable to hypothesize that individuals whose core emotions are stirred by perceived injustice may be more inclined to act on their radical beliefs, a hypothesis that warrants further empirical investigation.

4.6. Summary of the results and ethical implications

The objective of this study was to investigate the feasibility of predicting the intention of a violent act from textual content using ML. The results presented in Tables 3 and 4 suggest that ML models have the capacity, to some extent, to identify patterns that differentiate online posts from authors intending to undertake violent actions from those who do not. However, there is room for improvement, particularly in addressing false negative cases.

As previously noted, the prospect of automating intent detection from user-generated content has attracted attention in both academic and business settings. Nevertheless, automation extends beyond this domain. Various companies now offer tools to streamline portions of the hiring process or furnish predictive models for loan underwriting. In the healthcare sector, automated decision-making systems assist in clinical decision support. Barocas et al. (2023) list many issues that may arise around automated decision-making: essay grading software assigns lower scores to some demographic groups, hate speech detection was shown to discriminate against minority speakers, sentiment analysis tools assign varied scores to text based on names aligned with race or gender, and computer vision technology performs differently based on gender, race, or skin tone.

These problems show that we should be very cautious when designing tools that can subsequently be used to automate decision-making. Of course, such systems rarely function without some form of human judgment. Additionally, Barocas et al. (2023) provide reasons for cautious optimism regarding fairness in machine learning: data-driven decision-making can offer more transparency than human decision-making, because it demands a clear articulation of objectives and makes it harder to conceal true intentions, thus facilitating more effective debates on the fairness of various policies and decision-making procedures.

In the context of our intent detection algorithm, it can serve as an additional tool to assist human analysts in prioritizing and flagging content for further review. In an ideal scenario, this tool could assist law enforcement and security agencies in early threat identification, facilitating more effective preventive measures. Additionally, it has the potential to enhance aspects of individual cybersecurity. Platforms hosting user-generated content may leverage such tools to identify and moderate potentially harmful content, contributing to the creation of safer online spaces. The algorithm could also play a role in intelligence and counterterrorism efforts by aiding in identifying radicalized individuals or groups posing security risks. Identifying and monitoring potential threats is invaluable for society. Moreover, it can help individuals expressing violent tendencies in their writings by allowing for early intervention through counseling or mental health support. In academic settings, researchers studying extremism and radicalization see the potential for such tools to provide valuable data for understanding language and patterns associated with violent ideologies, aiding academic research.

However, acknowledging that we do not live in an ideal world, it is crucial to address the ethical concerns and potential negative consequences associated with these tools. Balancing security and individual rights, ensuring fairness and accountability, and addressing biases are critical considerations when evaluating algorithms that predict violent behavior based on text. A primary concern is the potential for bias in developing such algorithms, which can reflect societal biases and lead to unfair targeting of specific groups or individuals. Cultural sensitivity and contextual understanding may be lacking, resulting in misjudgments. Those flagged by the algorithm as potentially violent may face unwarranted scrutiny, even if they never engage in violent behavior. Moreover, language is complex and dynamic, and the meaning of words can change based on context. An algorithm may struggle to accurately interpret the nuances of language and context, leading to misinterpretations.

Despite concerns, the continuous development of various tools and AI is inevitable. Our society must consider countermeasures to ensure the secure deployment of tools predicting violent behavior. Ensuring transparency in development and deployment, providing clear information about purpose, limitations, and potential biases, is crucial. Developers should regularly assess and audit the algorithm for biases and unintended consequences, and refine it accordingly. Consultation with experts from diverse backgrounds, such as linguistics and psychology, is necessary to understand language nuances and potential biases. The algorithm should evolve based on feedback, changing language usage, and new research findings.

For a tool assisting human decision-making rather than a definitive predictor of violence, human oversight is essential. Trained analysts should review and interpret results before taking any significant actions. Establishing clear protocols for human intervention and escalation is a crucial initial step. A more long-term approach involves developing a robust legal and ethical framework aligned with human rights principles and legal standards. This framework should include mechanisms for individuals who believe they have been unfairly targeted or affected by the algorithm to appeal decisions and correct errors in predictions.

Implementing these countermeasures requires collaboration between technologists, ethicists, policymakers, and affected communities to strike a balance between security concerns and the protection of individual rights and privacy. Regular evaluations and adjustments are crucial for the ongoing ethical use of such tools.

In this study, we sought to determine the feasibility of employing ML models to predict the intent of violent actions based on textual radical content. Our dataset comprised 70 documents labelled as either intent (indicating subsequent radical actions) or no intent (denoting the absence of radical actions by the author).

We employed two distinct text vectorization techniques—TF-IDF and the BERT language model. With the BERT model, documents were segmented into chunks of uniform length, and the embeddings of these chunks were utilised for analysis. Two classification approaches were explored: one entailed constructing a single embedding vector for each document by averaging the embeddings of its chunks, while the other involved training ML models directly on the embeddings of individual chunks. In the latter case, the majority voting technique was applied to make the final prediction for the document.
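The first of these two approaches, a single document vector obtained by averaging chunk embeddings, can be sketched as follows. The function names are hypothetical, the embedding step itself is left out (in the study it is performed by BERT), and the sketch assumes that a final chunk shorter than the others is acceptable to the encoder.

```python
def split_into_chunks(tokens, chunk_size):
    """Split a token list into consecutive chunks of at most chunk_size tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def mean_embedding(chunk_embeddings):
    """Average equally sized chunk vectors into a single document vector."""
    if not chunk_embeddings:
        raise ValueError("need at least one chunk embedding")
    n = len(chunk_embeddings)
    return [sum(values) / n for values in zip(*chunk_embeddings)]


# Toy example: five tokens split into chunks of two, then two made-up
# two-dimensional "embeddings" averaged into one document vector.
print(split_into_chunks(["a", "b", "c", "d", "e"], chunk_size=2))
print(mean_embedding([[1.0, 2.0], [3.0, 4.0]]))
```

Averaging gives every document a vector of the same size regardless of its length, whereas the chunk-level approach yields more training samples for longer documents.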

The results of our evaluation demonstrate the efficacy of ML models in predicting the intent of violent actions from textual content. Notably, the models exhibited greater accuracy when predicting document labels based on a single embedding rather than a list of embeddings generated for its constituent parts.

Despite the promising results, this work has several limitations that need to be acknowledged and potentially addressed in future research. Firstly, the sample size collected for this study is relatively small, which may limit the generalizability of the machine learning models. Secondly, as seen in Table 1, the corpus used in the experiments includes articles written by the same authors. Specifically, in the Anarchists category, there are 30 documents authored by only 6 individuals. This introduces a risk of data leakage, as the models are trained and tested on data originating from the same authors. With a much larger dataset, it would be advisable to perform cross-validation by stratifying the authors rather than the documents to ensure a more robust evaluation.
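The author-level evaluation suggested above can be sketched as a fold assignment in which all documents by one author land in the same fold, so that no author contributes to both training and test data. This is a minimal illustration with hypothetical author labels, not the procedure used in the study; it does not balance the class labels across folds, and scikit-learn's `GroupKFold` offers comparable grouping behaviour for larger experiments.

```python
from collections import defaultdict


def group_folds(authors, n_folds):
    """Assign documents to folds so that each author appears in exactly one fold.

    authors[i] is the author of document i. Returns a list of folds, each a
    sorted list of document indices. Authors are placed greedily, largest
    first, into the currently smallest fold to keep fold sizes similar.
    """
    by_author = defaultdict(list)
    for index, author in enumerate(authors):
        by_author[author].append(index)
    if n_folds < 1 or n_folds > len(by_author):
        raise ValueError("n_folds must be between 1 and the number of authors")
    folds = [[] for _ in range(n_folds)]
    # Sort by descending document count, then by name for determinism.
    for author in sorted(by_author, key=lambda a: (-len(by_author[a]), a)):
        smallest = min(folds, key=len)
        smallest.extend(by_author[author])
    return [sorted(fold) for fold in folds]


authors = ["A", "A", "A", "B", "B", "C", "D"]  # hypothetical author labels
folds = group_folds(authors, n_folds=2)
print(folds)
```

Each fold can then serve once as the test set while the remaining folds are used for training.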

Looking ahead, our future work will explore several avenues: (1) performing a structural analysis of the documents to identify any patterns differentiating documents across the two categories; (2) investigating the potential improvement in model performance from incorporating datasets from diverse domains, such as marketing; (3) exploring more sophisticated techniques for classifying documents based on predictions made for document chunks; (4) incorporating additional features extracted from documents, such as emotions or sentiment, during ML model training; and (5) exploring advanced explainability techniques to enhance the interpretability of model predictions.

Contributed to conception and design: AJ-L

Contributed to acquisition of data: AS

Contributed to analysis and interpretation of data: AJ-L, AS

Drafted and/or revised the article: AJ-L, AS

Approved the submitted version for publication: AJ-L, AS

There is no funding related to this work.

There are no competing interests for either author.

All texts used in this project are available on the Army of God website (https://www.armyofgod.com/) (PLEASE NOTE THAT THE WEBSITE CONTAINS DRASTIC IMAGES) and the Marxist Internet Archive (https://www.marxists.org/archive/index.htm). The code for data analysis can be obtained from the OSF platform using the link: https://rb.gy/mhd337

1. https://www.nytimes.com/2009/06/01/us/01tiller.html

2. https://en.wikipedia.org/wiki/United_States_Holocaust_Memorial_Museum_shooting

Aizawa, A. (2003). An information-theoretic perspective of tf–idf measures. Information Processing & Management, 39(1), 45–65. https://doi.org/10.1016/s0306-4573(02)00021-3
Al-Hassan, A., & Al-Dossari, H. (2019). Detection of hate speech in social networks: A survey on multilingual corpus. Computer Science & Information Technology (CS & IT), 10, 10–5121. https://doi.org/10.5121/csit.2019.90208
Barocas, S., Hardt, M., & Narayanan, A. (2023). Fairness and machine learning: Limitations and opportunities. MIT Press.
Bisong, E. (2019). Google Colaboratory. In Building Machine Learning and Deep Learning Models on Google Cloud Platform (pp. 59–64). Apress. https://doi.org/10.1007/978-1-4842-4470-8_7
Bonino, S. (2016). Violent and non-violent political Islam in a global context. Political Studies Review, 16(1), 46–59. https://doi.org/10.1177/1478929916675123
Borum, R. (2011). Rethinking radicalization. Journal of Strategic Security, 4(4), 1–6. https://digitalcommons.usf.edu/jss/vol4/iss4/1/
Chen, Z., Liu, B., Hsu, M., Castellanos, M., & Ghosh, R. (2017). Identifying intention posts in discussion forums. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 1041–1050). https://doi.org/10.1007/s00500-017-2755-8
Cohen, K., Johansson, F., Kaati, L., & Mork, J. C. (2013). Detecting linguistic markers for radical violence in social media. Terrorism and Political Violence, 26(1), 246–256. https://doi.org/10.1080/09546553.2014.849948
Conway, M. (2016). Determining the role of the internet in violent extremism and terrorism: Six suggestions for progressing research. Studies in Conflict & Terrorism, 40(1), 77–98. https://doi.org/10.1080/1057610x.2016.1157408
Dai, H., Zhao, L., Nie, Z., Wen, J.-R., Wang, L., & Li, Y. (2006). Detecting online commercial intention (OCI). Proceedings of the 15th International Conference on World Wide Web, 829–837. https://doi.org/10.1145/1135777.1135902
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv Preprint arXiv:1810.04805.
Dong, X., Yu, Z., Cao, W., Shi, Y., & Ma, Q. (2019). A survey on ensemble learning. Frontiers of Computer Science, 14(2), 241–258. https://doi.org/10.1007/s11704-019-8208-z
Fernandez, M., & Alani, H. (2021). Artificial intelligence and online extremism: Challenges and opportunities. Predictive Policing and Artificial Intelligence, 132–162. https://doi.org/10.4324/9780429265365-7
Fortuna, P., & Nunes, S. (2018). A survey on automatic detection of hate speech in text. ACM Computing Surveys, 51(4), 1–30. https://doi.org/10.1145/3232676
Gaikwad, M., Ahirrao, S., Phansalkar, S., & Kotecha, K. (2021). Online extremism detection: A systematic literature review with emphasis on datasets, classification techniques, validation methods, and tools. IEEE Access, 9, 48364–48404. https://doi.org/10.1109/access.2021.3068313
Goodier, J. (2008). A dictionary of law enforcement. Reference Reviews, 22(3), 19–19. https://doi.org/10.1108/09504120810859666
Gøtzsche-Astrup, O. (2018). The time for causal designs: Review and evaluation of empirical support for mechanisms of political radicalisation. Aggression and Violent Behavior, 39, 90–99. https://doi.org/10.1016/j.avb.2018.02.003
Gupta, V., Varshney, D., Jhamtani, H., Kedia, D., & Karwa, S. (2014). Identifying purchase intent from social posts. Proceedings of the International AAAI Conference on Web and Social Media, 8, 180–186. https://doi.org/10.1609/icwsm.v8i1.14505
Hollerit, B., Kröll, M., & Strohmaier, M. (2013). Towards linking buyers and sellers: Detecting commercial intent on Twitter. Proceedings of the 22nd International Conference on World Wide Web, 629–632. https://doi.org/10.1145/2487788.2488009
Kaati, L., Shrestha, A., & Cohen, K. (2016). Linguistic analysis of lone offender manifestos. 2016 IEEE International Conference on Cybercrime and Computer Forensic (ICCCF), 1–8. https://doi.org/10.1109/icccf.2016.7740427
Kim, J.-K., Tur, G., Celikyilmaz, A., Cao, B., & Wang, Y.-Y. (2016). Intent detection using semantically enriched word embeddings. 2016 IEEE Spoken Language Technology Workshop (SLT), 414–419. https://doi.org/10.1109/slt.2016.7846297
McCauley, C., & Moskalenko, S. (2017). Understanding political radicalization: The two-pyramids model. American Psychologist, 72(3), 205–216. https://doi.org/10.1037/amp0000062
McInnes, L., Healy, J., & Melville, J. (2018). UMAP: Uniform manifold approximation and projection for dimension reduction (Version 3). arXiv. https://doi.org/10.48550/ARXIV.1802.03426
Nobles, A. L., Glenn, J. J., Kowsari, K., Teachman, B. A., & Barnes, L. E. (2018). Identification of imminent suicide risk among young adults using text messages. Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, 1–11. https://doi.org/10.1145/3173574.3173987
Perry, D. L. (2018). The global Muslim Brotherhood in Britain: Non-violent Islamist extremism and the battle of ideas. Routledge. https://doi.org/10.4324/9781315122144
Reid Meloy, J., Hoffmann, J., Guldimann, A., & James, D. (2011). The role of warning behaviors in threat assessment: An exploration and suggested typology. Behavioral Sciences & the Law, 30(3), 256–279. https://doi.org/10.1002/bsl.999
Simons, B., & Skillicorn, D. B. (2020). A bootstrapped model to detect abuse and intent in white supremacist corpora. 2020 IEEE International Conference on Intelligence and Security Informatics (ISI), 1–6. https://doi.org/10.1109/isi49825.2020.9280551
Wang, J., Cong, G., Zhao, X., & Li, X. (2015). Mining user intents in Twitter: A semi-supervised approach to inferring intent categories for tweets. Proceedings of the AAAI Conference on Artificial Intelligence, 29(1). https://doi.org/10.1609/aaai.v29i1.9196
This is an open access article distributed under the terms of the Creative Commons Attribution License (4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
