Israel-Diaspora Relations

STUDY: AMERICAN SERMONS & ISRAELI POLITICS

A comprehensive study examined the political discourse in synagogue sermons in the United States in recent years.

By: Dr. Ghila Amati, Shlomi Bereznik

Appendix 1

Methodology

Data Collection

This study analyzes a corpus of 4,302 sermons, 2,556 of which were delivered between October 2021 and October 2024 across Modern Orthodox, Conservative, and Reform synagogues in the United States. The distribution of sermons across denominations reflects both the varying levels of digital accessibility and the overall proportion of Jews affiliated with each denomination. The dataset includes 1,276 sermons from Reform synagogues, 878 from Conservative synagogues, and 412 from Modern Orthodox synagogues, with the majority of Modern Orthodox sermons originating in the past two years.

The graphs presented here offer insights into the geographical distribution of sermons and the broader trends in sermon documentation over time.

The first chart illustrates the percentage of sermons from 2006 to 2024 categorized by four major geographical regions: New York, the West Coast, the East Coast (excluding New York), and the Midwest/Southeast.

New York has the largest share, with 40% of recorded sermons. The East Coast and the Midwest/Southeast regions follow with 22% and 21%, respectively, and the West Coast accounts for 16%. The high proportion of sermons from New York corresponds to its large Jewish population, where many congregations, synagogues, and religious institutions are based.

The second graph focuses on the period between October 2021 and October 2024, showing a slightly different distribution.

During this timeframe, New York’s share of sermons increases to 43%, while the share from the East Coast decreases to 11%. The Midwest/Southeast region accounts for 25%, and the West Coast for 21%. This snapshot offers a more recent look at the geographical spread of sermons within the defined timeframe, from which most of the following analysis is drawn.

The third graph illustrates the number of sermons uploaded each year from 2006 to 2024.

The data shows a gradual increase in the early years, with relatively low numbers of uploads until around 2016. After that point, the number of sermons uploaded each year rises significantly, reaching its highest point in 2023. The decrease in 2024 is because the sermons included in the dataset only extend until October 2024, meaning there are two fewer months of data compared to previous years. The overall trend indicates an expansion in the documentation of sermons over time.

For the Reform and Conservative denominations, the sermons were primarily collected using a Python-based API, which transcribed sermons directly from YouTube. This process involved extracting sermons from designated playlists specifically curated for this purpose. The API facilitated the automated retrieval of key metadata, including sermon titles, full transcriptions, and timestamps. By leveraging this method, the study ensured a systematic and efficient collection of sermon data, allowing for a comprehensive analysis of political discourse across these denominations.
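
The study does not name the transcription library, so the retrieval call itself is omitted here. As a minimal sketch, segments returned by a YouTube transcript API (dicts with "text", "start", and "duration" fields, a common shape for such APIs) can be assembled into a single sermon record; assemble_sermon_record is a hypothetical helper, not the study’s code:

```python
from datetime import timedelta

def assemble_sermon_record(title, segments):
    """Join raw transcript segments (dicts with 'text', 'start' in
    seconds, and optional 'duration') into one sermon record with
    the full text and an estimated total duration."""
    full_text = " ".join(seg["text"].strip() for seg in segments)
    last = segments[-1]
    duration = timedelta(seconds=int(last["start"] + last.get("duration", 0)))
    return {"title": title, "text": full_text, "duration": str(duration)}

# Illustrative segments, shaped like a transcript API response.
segments = [
    {"text": "Shabbat shalom.", "start": 0.0, "duration": 2.5},
    {"text": "This week's parashah...", "start": 2.5, "duration": 4.0},
]
record = assemble_sermon_record("Parashat Noach", segments)
```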

By contrast, the collection of Modern Orthodox sermons required a more direct approach, given the relatively limited online availability of sermon content. These sermons were obtained through direct outreach to rabbis and communal leaders and were provided in diverse formats, including PDF and Word documents. To ensure uniformity, all Modern Orthodox sermons were first converted to PDF format and subsequently processed using Python libraries to extract text and structure them into a standardized CSV file, containing sermon title, full text, date, and rabbi’s name. In cases where sermons lacked explicit Gregorian dates or contained only Hebrew calendar references, ChatGPT-4o was employed to determine at least approximate dates, ensuring temporal consistency across the dataset. This standardization enabled the comparison of sermons across different time periods.

To maintain consistency across denominations, all collected sermons were consolidated into a unified CSV format. This step facilitated efficient data management and ensured comparability across sermons from different Jewish traditions.
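
As a sketch of this consolidation step, assuming the column names described above for the Modern Orthodox CSV (title, text, date, rabbi) plus a denomination column added here for cross-denominational comparison, the standard-library csv module suffices:

```python
import csv
import io

# Shared schema for the unified dataset; records missing a field
# get an empty cell rather than breaking the column layout.
FIELDS = ["title", "text", "date", "rabbi", "denomination"]

def write_unified_csv(rows, fh):
    """Write sermon records from all denominations into one CSV
    with a single shared schema."""
    writer = csv.DictWriter(fh, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        writer.writerow({f: row.get(f, "") for f in FIELDS})

buf = io.StringIO()
write_unified_csv([
    {"title": "Vayera", "text": "...", "date": "2023-11-04",
     "rabbi": "R. Cohen", "denomination": "Reform"},
], buf)
```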

Analytical Stage

Methodological Framework: Computational Discourse Analysis and AI Optimization

In conducting this study, a range of advanced computational and linguistic methodologies were employed to enhance the precision, coherence, and analytical depth of the data. Given the complexity of homiletic discourse and its intersection with political themes, it was imperative to structure the interaction between AI and text in a way that maximized interpretability while preserving the nuanced rhetorical elements of the sermons. This was achieved through a combination of prompt engineering strategies, structured analytical workflows, and AI-driven reasoning techniques that enabled a deeper engagement with the material.

Few-Shot Learning: Guiding the Model through Exemplars

To refine the model’s interpretative accuracy, a few-shot learning approach was taken. Rather than presenting AI with decontextualized queries, multiple examples of expected response structures were embedded within the prompts, guiding the model toward a more coherent and contextually attuned analysis.
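
A hedged illustration of what such an embedded-exemplar prompt might look like (the labels and example sermon snippets here are invented for illustration, not drawn from the study’s actual prompts):

```python
# Few-shot prompt template: two labeled exemplars precede the
# sermon to be classified, steering the model toward the same
# answer format and level of contextual attention.
FEW_SHOT_PROMPT = """\
You classify synagogue sermons as POLITICAL or NON-POLITICAL.

Sermon: "This week we must speak about the judicial reform in Israel..."
Label: POLITICAL

Sermon: "The parashah teaches us about Abraham's hospitality..."
Label: NON-POLITICAL

Sermon: "{sermon_text}"
Label:"""

prompt = FEW_SHOT_PROMPT.format(sermon_text="We pray for the hostages...")
```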

Chain of Thought (CoT) Reasoning: Encouraging Sequential Interpretive Logic

Recognizing the layered rhetorical structure of sermons, the study integrated Chain of Thought (CoT) prompting, a technique that compels the AI to articulate its reasoning process step by step before generating a final response. By structuring responses to mirror human-like interpretative sequencing, this approach ensured that the AI’s analyses followed the kind of structured reasoning employed in human discourse analysis.

Decomposed Prompting: Segmenting Interpretative Tasks for Greater Precision

Rather than employing a conventional single-query approach, this study implemented decomposed prompting, allowing for a more precise and structured analysis of homiletic discourse. Instead of tasking AI with identifying and interpreting political content simultaneously, the analytical process was broken into distinct stages.
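
One possible sketch of such a staged pipeline, with hypothetical stage names and prompts, and a stubbed model call standing in for a real ChatGPT-4o request:

```python
# Decomposed prompting: each stage is a separate, narrow query
# rather than one omnibus prompt. Stage names and wording are
# illustrative, not the study's actual prompts.
STAGES = [
    ("is_sermon", "Is the following text a sermon? Answer YES or NO.\n\n{text}"),
    ("is_political", "Does this sermon contain political content? YES or NO.\n\n{text}"),
    ("about_israel", "Does the political content concern Israel? YES or NO.\n\n{text}"),
]

def run_pipeline(text, ask_model):
    """Run the stages in order; stop early when a gating answer is NO,
    so later questions are only asked of texts that qualify."""
    results = {}
    for name, template in STAGES:
        answer = ask_model(template.format(text=text))
        results[name] = answer
        if answer == "NO":
            break
    return results

# Stubbed model for demonstration: affirms every stage.
results = run_pipeline("sample sermon text", lambda prompt: "YES")
```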

Individualized Analysis: Preserving Contextual Integrity

As sermons operate as discrete rhetorical events, each text was analyzed individually, rather than in bulk. This methodological decision was informed by the recognition that the homiletic voice is context-dependent – its themes, audience reception, and rhetorical function vary significantly based on denomination, location, and sociopolitical climate. By treating each sermon as an autonomous unit of analysis, the study ensured that contextual integrity was preserved, avoiding dilution of meaning through aggregated processing.

Structured Computational Output: Standardizing Discourse for Systematic Analysis

To facilitate a methodologically rigorous approach to AI-generated insights, responses were formatted in a structured JSON schema, ensuring standardization across all stages of computational processing. This structure not only enabled efficient parsing and categorization but also allowed for cross-referencing between AI-generated analyses and human interpretative oversight.
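
A minimal sketch of validating such structured output, with illustrative field names rather than the study’s actual schema:

```python
import json

# Hypothetical response schema the model is instructed to return;
# these keys are illustrative, not the study's real field names.
EXPECTED_KEYS = {"is_sermon", "is_political", "focus_israel", "sentiment"}

def parse_model_response(raw):
    """Parse a JSON response and verify that every expected key is
    present, so malformed outputs fail loudly instead of silently
    corrupting the dataset."""
    data = json.loads(raw)
    missing = EXPECTED_KEYS - data.keys()
    if missing:
        raise ValueError(f"response missing keys: {sorted(missing)}")
    return data

raw = ('{"is_sermon": true, "is_political": true, '
       '"focus_israel": true, "sentiment": "critical"}')
record = parse_model_response(raw)
```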

Stages of Analysis

The analytical process was structured into multiple stages to ensure a comprehensive and systematic examination of the homiletic discourse. The methodology employed a combination of AI-driven classification, sentiment analysis, and structural evaluation to extract insights regarding political content, critique of Israel, positive messaging, and thematic trends.

General Analysis of Sermons

The initial stage of analysis encompassed a broad examination of the entire dataset to establish foundational classifications and ensure methodological consistency. The first step involved verification that each analyzed text was indeed a sermon, distinguishing it from other forms of discourse. Once confirmed, the classification process extended to identifying whether a sermon contained political content, sorting sermons into political or non-political categories. This classification was further refined by determining the number of political sermons that focused on Israeli politics.

Beyond identifying political themes, in cases where political references to Israel were apparent, an additional layer of evaluation was introduced to assess the overall sentiment conveyed. This approach allowed for a nuanced understanding of how political themes, particularly those related to Israel, were framed within different denominational contexts, enabling a comparative analysis of tone and emphasis across the dataset.

Classification of Sermons Containing Criticism of Israel

To systematically identify criticism directed at Israel in sermons, we defined a set of specific, predefined critique themes. Instead of broadly asking whether a sermon contained criticism, we instructed ChatGPT-4o to assess each sermon individually for the presence of these specific themes. This approach ensured a targeted, structured classification, reducing ambiguity and improving precision. AI was prompted to evaluate sermons already identified as engaging with Israel and determine whether any of the predefined critique themes appeared within them.

This methodology provided a more granular analysis, enabling us to compare how criticism of Israel varied across denominations and over time. By structuring the task in this way, we ensured that each critique theme was independently validated, providing a more rigorous assessment of the political discourse within Jewish sermons. This classification was applied only to sermons previously identified as political and engaging with Israel.

Structural Analysis of Sermons

To examine the structural role of political discourse within sermons, a segmented content analysis was conducted to determine how and when political themes emerged within the homiletic framework. Politically focused sermons were analyzed with particular attention to their introductory sections, with the first 20% of each sermon extracted for closer examination. This method provided insight into whether political discourse was introduced at the outset or emerged later within a broader religious or moral discussion. ChatGPT-4o was tasked with identifying these structural patterns, distinguishing sermons that began with explicit political themes from those that opened with religious motifs. By supplying the model with structured examples of each sermon type, a consistent classification framework was maintained, ensuring a rigorous and systematic approach to evaluating the interplay between political and religious discourse.
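
The study does not specify whether the first 20% was measured in words or characters; assuming a word-based cut, the extraction reduces to simple slicing (opening_segment is a hypothetical helper):

```python
def opening_segment(text, fraction=0.2):
    """Return roughly the first `fraction` of a sermon, cut at a
    word boundary, for analysis of how the sermon opens."""
    words = text.split()
    n = max(1, int(len(words) * fraction))
    return " ".join(words[:n])

# Illustrative 100-word "sermon": the opening segment is 20 words.
sermon = " ".join(f"word{i}" for i in range(100))
opening = opening_segment(sermon)
```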

Classification of Positive Messages in Sermons

The classification of positive messages in sermons was conducted across the entire dataset to identify expressions of encouragement, unity, and calls to action. ChatGPT-4o was tasked with recognizing instances of positive messaging within each sermon and was subsequently instructed to return a structured list of these messages based on predefined categories. The analysis focused on general positive messages, which encompassed broad themes of encouragement and communal solidarity.

Analysis of Specific Topics: Hostages, Ceasefire, and Aliyah

To further refine the examination of political discourse in sermons, additional targeted analyses were conducted on specific themes of contemporary significance. The topics of hostages and ceasefire were given particular focus due to their post-October 7 relevance. These themes were analyzed using a topic-based volume assessment, which evaluated their prominence within the sermons that focused on Israel, based on ChatGPT-4o’s topic modeling capabilities. This approach allowed for an assessment of how frequently these themes appeared. By contrast, the topic of Aliyah, Jewish immigration to Israel, was examined through word frequency analysis, given its more straightforward linguistic markers. This method enabled a precise measurement of the presence of the word Aliyah across different denominations, providing insight into whether religious leaders engaged with the theme of Jewish migration in their sermons.
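
A word-frequency check of this kind can be sketched with a case-insensitive whole-word match; the records below are illustrative data, not study results:

```python
import re
from collections import Counter

def aliyah_mentions(sermons):
    """Count, per denomination, the sermons that mention 'aliyah'
    at least once (case-insensitive whole-word match)."""
    pattern = re.compile(r"\baliyah\b", re.IGNORECASE)
    counts = Counter()
    for denomination, text in sermons:
        if pattern.search(text):
            counts[denomination] += 1
    return counts

counts = aliyah_mentions([
    ("Modern Orthodox", "We should all consider Aliyah this year."),
    ("Reform", "The parashah speaks of journeys."),
    ("Modern Orthodox", "Making aliyah is a profound commitment."),
])
```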

Refining the Analytical Model

This multi-layered approach ensured that the study went beyond basic keyword detection, incorporating thematic nuance, discourse structure, and sentiment shifts over time. The integration of AI-assisted content categorization with structured human oversight provided a methodologically rigorous framework, allowing for a detailed examination of political discourse within contemporary Jewish sermons.

Validation Techniques

A human-in-the-loop (HITL) validation approach was implemented to ensure the accuracy and reliability of the AI-driven analysis. This validation process combined human expertise with computational analysis, aiming to mitigate biases and enhance credibility. Rather than relying solely on automated classification, human reviewers systematically evaluated a sample of AI-generated classifications and compared them to manual interpretations to ensure consistency. Additionally, ChatGPT-4o’s reasoning process was examined to gain insight into how the model arrived at its classifications. This step was crucial in refining the AI’s interpretative accuracy and ensuring that its decision-making aligned with the nuanced nature of sermonic discourse.

To assess the performance of ChatGPT-4o in classifying sermons, standard validation metrics – accuracy, precision, and recall – were applied. These measures provided insights into the model’s strengths and limitations across different analytical tasks.

Accuracy provides a measure of how often the model correctly classifies both political and non-political sermons, incorporating both correct positive and negative classifications. A high accuracy score indicates that the AI is effective in distinguishing between relevant and irrelevant sermons. However, accuracy alone is not always sufficient, particularly when the dataset is imbalanced and certain classifications are infrequent.

To address this limitation, precision was employed as a complementary metric, focusing on the reliability of the AI’s positive classifications. Precision evaluates how many of the sermons identified as politically charged or containing criticism are indeed correctly classified. This metric is particularly important when identifying rare or highly specific classifications, such as sermons containing political critiques of a particular figure. A high precision score ensures that when AI classifies a sermon as political or critical, it is more likely to be accurate and not a false positive.

For instance, in a dataset containing both political and non-political sermons, accuracy would measure how well the model distinguishes between the two categories overall. However, in cases where only a small subset of sermons contains political criticism, precision ensures that when AI does flag a sermon as politically critical, it is indeed correct. By integrating both accuracy and precision into the validation framework, this study ensures a balanced and rigorous evaluation of AI-driven classifications, reinforcing confidence in the reliability of the computational analysis of religious discourse.

However, focusing solely on precision presents its own challenges. A model optimized for precision may become overly cautious, avoiding misclassification at the cost of overlooking relevant instances. This is where recall becomes essential. Recall measures the model’s ability to retrieve all relevant cases within the dataset, quantifying how many politically charged or critical sermons were correctly identified out of the total that exist. A high recall score indicates that the AI is not merely accurate when it does classify sermons as political but is also comprehensive in ensuring that it does not miss important instances of political discourse.
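
The three metrics reduce to simple ratios over a confusion matrix; the numbers below are illustrative, not the study’s reported scores:

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, and recall from confusion-matrix counts:
    tp/fp = true/false positives, tn/fn = true/false negatives."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Illustrative example: of 100 sermons, 20 are political; the model
# flags 18 as political, of which 16 are correct.
acc, prec, rec = classification_metrics(tp=16, fp=2, tn=78, fn=4)
```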

These validation measures aimed to mitigate biases and enhance the credibility of AI-driven insights, ensuring that the analysis aligns with the complex and nuanced nature of religious sermons.

By integrating computational analysis and human expertise, this study aims to provide a comprehensive examination of political discourse in American Jewish sermons across diverse denominations and socio-political contexts.
