Poetic Language and the Computer: A Corpus Stylistic Study of The Waste Land

This paper presents a corpus-stylistic analysis of T. S. Eliot's poem The Waste Land (henceforth TWL). According to Baker, Hardie, and McEnery (2006, pp. 48-49), corpus (plural "corpora") is a Latin word meaning "body"; in linguistics it refers to a collection of written and spoken texts that are generally stored and analyzed electronically. Corpus stylistics, a relatively recent arrival in the discipline of linguistics, has become increasingly popular in recent years. It combines corpus linguistics and stylistics in the study of the language of literature (for more details about corpus stylistics see Ho, 2011; Mahlberg, 2014). Hunston (2002, pp. 2–3) points out that although a corpus does not contain new information about language, the software packages that process its data allow us to obtain a new perspective on the familiar. To put it differently, corpora reflect real, authentic language, and by combining quantitative and qualitative methodology, that is, by integrating manual and corpus analysis, one can uncover interesting facts about patterns of language.

quantitative methods (pp. 378-380). Moreover, Wynne (2005) pointed out some underlying similarities of approach: "Traditional and computational forms of stylistics have more in common than is obvious at first sight. Both rely on the close analysis of texts, and both benefit from opportunities for comparison" (pp. 677-679). Mahlberg & McIntyre (2011, p. 205) assert that corpus stylistics does not replace qualitative analysis but rather supplements it with computational analysis. Furthermore, they rightly argue that "corpus methods can aid the identification of elements of a text worthy of further qualitative analysis" (p. 223). In this regard, computer programs can aid researchers immensely; as Rayson (2003, p. 4) points out, "Although the computer saves us time with its processing of the texts into frequency lists, it presents us with so much information that we need a filtering mechanism to pick out significant items before the analysis can proceed".

METHODOLOGY
The following methodological steps are followed to conduct this study:
1-Preparing the corpus of the selected text and converting it into plain text form (txt).
2-Using WebCorp Live as the tool to examine keywords by focusing on content words. This tool enables researchers to filter out function words (articles, prepositions, pronouns, etc.) in order to give the needed attention to the words that carry meaning.
3-Using Wmatrix software to benefit from its unique advantage of identifying key semantic domains.
Brooke, Hammond, & Hirst (2015) conducted research seeking distinct voices in The Waste Land. Their major aim was to find out whether the computer can locate different voices as a human analyst does (p. 4). Similar to the current study, Jaafar (2019) analyzed Seamus Heaney's poem "A Herbal" by means of computational tools. That study concluded that integrating quantitative and qualitative approaches can be fruitful in terms of obtaining objective results. In contrast, Jaafar (2014) conducted a manual qualitative study of selected poems that focuses on deviation and other stylistic tools.

TYPES OF CORPUS TOOLS
There is a variety of corpus tools and software programs that can be useful in conducting corpus stylistic research. Researchers should be familiar with such tools so that they can select the one most suitable for their analysis. One such toolkit is WordSmith Tools, which comprises a word list tool, a concordance tool, and a keyword list tool. Another is Wmatrix (Rayson 2009), a web-based corpus tool that is distinguished from other tools by its unique feature of identifying key semantic domains, as well as by its automatic POS tagging. AntConc (Anthony 2004) is free software that can be used by anyone who has an interest in finding n-grams, keywords, concordances, and collocations. A further tool is Mahlberg & Smith's (2012) CLiC (Corpus Linguistics in Cheshire). One more beneficial tool is WebCorp Live, a search engine for linguists, teachers, and learners that is useful for retrieving corpus data online.

THE DATA
The Waste Land, published in 1922, is considered T. S. Eliot's masterpiece. It is one of the most complicated poems of the twentieth century. It depicts the devastation of life after the First World War. Not only this, but it also shows the decline of society's moral values after the war. It is an obscure poem due to the inclusion of many allusions and references to other texts. It was written while Eliot was undergoing personal difficulties in his first marriage. Its complexity has made it one of the crucial and significant works of English literature, and of the modern age in particular. It contains lines in a variety of languages, such as German, French, Italian, and Sanskrit (see North, 2001; McHale & Stevenson, 2006).
The Waste Land is a long poem that consists of a total of 3,028 words and is divided into five sections. The main ideas and themes of the poem include death, corruption, and many other pessimistic views of postwar life. It is considered the mouthpiece of its time. However, the last line of the poem marks a shift: the Sanskrit word 'Shantih', meaning 'peace', is repeated three times.
The first section, "The Burial of the Dead", presents four different characters, each of whom has a story that is left incomplete. The language of this section includes some verses written in German and French, which adds to the difficulty of understanding and interpreting the text.
Section two, "A Game of Chess", narrates stories of women of different social classes, the upper and the lower class. These women have their own dilemmas. This section consists of 261 words. Section three, "The Fire Sermon", is the longest section of the poem. Section four, "Death by Water", is the shortest section of the poem. Section five, "What the Thunder Said", is pessimistic and shows a harsh reality through its imagery of destruction; there is a temporary renewal, but again the destruction is near and final.

The Analysis
Roughly speaking, from a qualitative perspective, the poem depicts the boredom and monotony of modern life, which is illustrated by the continuous repetition of words, phrases, and sentences throughout the text. Examples include:

"My nerves are bad tonight. Yes, bad. Stay with me.
"Speak to me. Why do you never speak. Speak.
"What are you thinking of? What thinking? What?
"I never know what you are thinking. Think."

I think we are in rats' alley
Where the dead men lost their bones.

"What is that noise?"
The wind under the door.
"What is that noise now? What is the wind doing?"
Nothing again nothing.
"Do
"You know nothing? Do you see nothing? Do you remember
"Nothing?"

(Section II, "A Game of Chess"; https://www.poetryfoundation.org/poems/47311/the-waste-land)
These lines give a clear idea of the nothingness and hollowness of people in modern life.
In addition, T. S. Eliot uses allusions and includes many images and references to famous works and personalities in the text. He creates ideas and leaves them unfinished, which in turn adds to the complexity of the poem. Another important aspect of the poem is the inclusion of many voices or characters through the use of personal pronouns (she, he, I, you). This may also be a reason for the difficulty of the poem.
The poem consists of a total of 3,028 words. According to the WebCorp Live calculation, the number of distinct content words is 1034, and the total number of content-word occurrences, counting repetitions throughout the text, is 1552.

Screenshot (1) Wordlist in The Waste Land
Screenshot (1) illustrates that, once the text is copied and pasted into the specified area, the wordlist generator counts the words according to the choices made by the researcher, for example, the n-gram size (single words) and the option to filter out stop words or high-frequency words (that is, skipping function words in the retrieval results).
The first section of the text, "The Burial of the Dead", contains 252 words when each word is counted once and high-frequency words (articles, pronouns, and other function words) are excluded. The most frequent word is 'dead', which occurs five times in this part alone.
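The counting procedure described above can be approximated offline. The following is a minimal sketch, not WebCorp Live's actual implementation: the stop list is a small hypothetical one chosen for illustration, and the function names `tokenize` and `word_list` are assumptions introduced here.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop list; the list used by WebCorp Live
# is not given in this paper and is certainly larger.
STOP_WORDS = {"the", "a", "an", "and", "of", "in", "to", "is", "that",
              "i", "you", "he", "she", "it", "we"}

def tokenize(text):
    """Lowercase the text and return its word tokens (internal apostrophes kept)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def word_list(text, n=1, stop_words=STOP_WORDS):
    """Count n-grams of size n after filtering out stop words."""
    tokens = [t for t in tokenize(text) if t not in stop_words]
    ngrams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return Counter(ngrams)

sample = "What is that noise now? What is the wind doing? Nothing again nothing."
counts = word_list(sample)           # single words, stop words skipped
types = len(counts)                  # distinct content words
tokens = sum(counts.values())        # content words counting repetitions
print(counts.most_common(3), types, tokens)
```

The distinction between `types` and `tokens` in the sketch mirrors the two totals reported for the poem: the number of distinct content words and the number of occurrences when repetitions are counted.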

Screenshot (2) The Retrieval Results of the Word 'Dead'
The word 'dead', for example, occurs 10 times throughout the text. Here are some examples: 1. Dead April. 2. Dead land. 3. Dead tree. 4. Dead sound. 5. Dead mountain. These words are paired with the word 'dead' to give them a kind of pessimistic personification, in keeping with the pessimistic atmosphere of the poem. In addition, part one contains some verses written in German and French.
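Retrieval results of this kind are conventionally displayed as keyword-in-context (KWIC) concordance lines. The sketch below shows one simple way to produce them; the function name `kwic` and the context width are assumptions made for illustration, and the output format does not reproduce that of WebCorp Live.

```python
import re

def kwic(text, node, width=3):
    """Return (left context, node, right context) for each occurrence of node."""
    tokens = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == node.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines

sample = ("April is the cruellest month, breeding "
          "Lilacs out of the dead land, mixing "
          "Memory and desire")

for left, node, right in kwic(sample, "dead"):
    # Right-align the left context so the node words line up in a column.
    print(f"{left:>20} [{node}] {right}")
```

Reading the node word together with its immediate right-hand neighbour is what allows collocates such as 'land' to be identified.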

Keywords Analysis
Wmatrix is employed to identify the main keywords in TWL. The reference corpus is drawn from the British National Corpus (BNC), which contains 100 million words, 90% of which is written data and the remaining 10% spoken data. The BNC Sampler (1 million words) consists of many categories; in this paper, the BNC Written Imaginative sampler, which contains literary works and creative writing, is used as the reference corpus. It is necessary to use a reference corpus that suits the content of the text to be analyzed and examined (cf. Culpeper, 2009; McIntyre, 2010).
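Keyness is computed by comparing a word's frequency in the study text with its frequency in the reference corpus. The sketch below uses the log-likelihood statistic, which is the measure Wmatrix reports; the function name and the reference frequency in the usage example are hypothetical and are not figures from this study.

```python
import math

def log_likelihood(a, b, c, d):
    """Log-likelihood keyness of a word.

    a: frequency of the word in the study text
    b: frequency of the word in the reference corpus
    c: total words in the study text
    d: total words in the reference corpus
    """
    # Expected frequencies if the word were equally common in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll

# 'dead' occurs 10 times in the 3,028 words of the poem (figures from this paper).
# Assumption: 150 occurrences per 1,000,000 reference words is an invented value.
score = log_likelihood(10, 150, 3028, 1_000_000)
is_key = score > 3.84  # 3.84 is the chi-square critical value for p < 0.05, 1 d.f.
print(round(score, 2), is_key)
```

A score above the critical value indicates only that the difference in frequency is unlikely to be due to chance; whether the word is overused or underused is decided by comparing the relative frequencies a/c and b/d.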

Key Semantic Domains
Semantic domains enable readers and researchers to locate semantic categories of words. Wmatrix has the unique capability of gathering words under suitable domains. In this way, it facilitates the examination of the dominant groups of words in order to understand thematic ideas.
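The idea of gathering words under domains can be illustrated with a toy example. The lexicon below is entirely hypothetical and contains only a handful of words; it is not the semantic lexicon used by Wmatrix, and the domain labels are invented for this sketch.

```python
from collections import Counter, defaultdict

# Hypothetical toy lexicon mapping words to domain labels (an assumption
# for illustration, far smaller than any real semantic tagger's lexicon).
TOY_LEXICON = {
    "dead": "Death", "death": "Death", "bones": "Death",
    "water": "Water", "rain": "Water", "river": "Water",
    "dry": "Dryness", "rock": "Dryness", "dust": "Dryness",
}

def domain_profile(tokens, lexicon=TOY_LEXICON):
    """Count tokens per domain and record which words fall under each domain."""
    counts = Counter()
    members = defaultdict(set)
    for tok in tokens:
        domain = lexicon.get(tok.lower())
        if domain is not None:
            counts[domain] += 1
            members[domain].add(tok.lower())
    return counts, members

tokens = "Here is no water but only rock Rock and no water and the sandy road".split()
counts, members = domain_profile(tokens)
for domain, freq in counts.most_common():
    print(domain, freq, sorted(members[domain]))
```

Once every word carries a domain label, domain frequencies can be compared with a reference corpus in the same way as word frequencies, which is how key domains, as opposed to key words, are obtained.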

Screenshot (4) Key Semantic Domains in The Waste Land
Screenshot (4) shows the most significant key semantic domains in the poem. Domains displayed in a larger font are more significant than those displayed in a smaller font. Key domains help to give readers an idea of the main points and ideas proposed by the poet.
The most prominent semantic domains in the poem are:

Conclusion
Analyzing a literary text manually (qualitatively) has noticeable significance, but analyzing a text by integrating both qualitative and quantitative approaches leads to more significant and accurate results. The figures and calculations retrieved by computer tools enable researchers to focus on details that can be missed in manual analysis. It is, thus, crucial to merge both methodologies to reach an accurate and objective interpretation of any literary text.