1976. Stuttgart - Theory and Practice in Analysing Discourse.

By C. George SANDULESCU, Stockholm.

(Paper given by C. George Sandulescu at the Stuttgart Congress of Applied Linguistics in 1976, and published in the Proceedings of the Fourth International Congress of Applied Linguistics, pages 349 to 365.)



                                    Language is a labyrinth of paths.
                                    You approach from one side and know your way about;
                                    you approach the same place from another side
                                    and no longer know your way about.

                                                    Wittgenstein, Philosophical Investigations, 203

1. Introductory. Trying to establish points of comparison -- that is, similarity and dissimilarity -- between two recent approaches to the study of language is an enterprise fraught with the dangers of both oversimplification and rash generalization. But in the hope that the advantages of such a correlation study across both geographical and terminological boundaries will far outweigh the risks, the following critical, and even polemical, remarks are considered necessary at this early stage of development of the subdiscipline.

            The tremendous recent interest in language units larger than the sentence is just another 'explosion' in the wake of the peaceful 'population', 'information', and 'book' explosions we have all witnessed in the last 30 years as follow-ups of the 'revolutions' of 50 or 250 years ago. Linguistics as a rigorously scientific discipline is therefore called upon to cope with what we are here calling the 'text/discourse explosion'. Given the almost instantaneous outburst of attention, the major pitfall has been investigatory parallelism, rather than concerted effort; and it is this very attention paid to units larger than the sentence that has triggered substantial research emphasis on pragmatics.

            2. The issue of terminology.  Terminological discrepancy is directly derived from (a) investigatory parallelism, and (b) deliberate disregard of similar (or even identical) research results obtained elsewhere (usually beyond national boundaries).

Consequently, the first question to be asked, before passing on to illustrations, is how important the issue of terminology is. In the opinion of Roman Jakobson (1974, 1975), an opinion which we fully share, the question of terminology is highly important in present-day linguistics (for example, the French traits pertinents is a mistranslation of the original distinctive features).

PROPOSITION 1:  What is called Discourse in English-mediated research is called Text on the Continent (Germany, Russia, Scandinavia).

            Terminological discrepancy has far-reaching theoretical implications: even a cursory look at any recent dictionary of linguistics would support this view. By way of illustration, here are the title words of 43 different entries under Text in the latest German dictionary (as against only one in the Dictionnaire de linguistique, Larousse 1973):

            (Lewandowski 1975 : 733ff)

  1.         TEXT

  2.         TextANALYSE

  3.         TextANFANG

  4.         TextÄUSSERUNG

  5.         TextBASIS

  6.         TextBEGRENZUNG

  7.         TextEINHEIT

  8.         TEXTEM

  9.         TextENDE

  10.         TextERWARTUNG

  11.         TextERZEUGUNG

  12.         TextEXEMPLAR

  13.         TextFORMULAR

  14.         TextFUNKTION

  15.         TextGEMEINSCHAFT

  16.         TextGRAMMATIK

  17.         TextHAFTIGKEIT

  18.         TextINTERPRETATION

  19.         TextISOTOPIE

  20.         TextKOHÄRENZ

  21.         TextKONNEXION

  22.         TextKONSTITUENTEN

  23.         TextKONSTITUTION

  24.         TextKRITIK

  25.         TextLINGUISTIK

  26.         TextPARTITUR

  27.         TextPHORIK

  28.         TextPRAGMATIK

  29.         TextREFERENZ

  30.         TextREPRODUKTION

  31.         TextREZEPTION

  32.         TextSEMANTIK

  33.         TextSORTEN

  34.         TextSTRUKTUR

  35.         TextTHEMA

  36.         TextTHEORIE

  37.         TextTIEFENSTRUKTUR

  38.         TextTYPOLOGIE

  39.         TEXTUALITÄT

  40.         textuelle BEDEUTUNG

  41.         TextVERARBEITUNG

  42.         TextVERSTEHEN

  43.         TextVERWEIS.

            Considering such terminological discrepancy as a point of dissimilarity at the meta-level, but as ultimately instancing similarity at the level of the object language (the language segments under examination being identical as to length and extent),  a minute examination of the existing books, articles, research reports, and conference papers classifiable under the key words Text / Discourse shows the following clearcut polarization of focus of research.

            3. Two fundamental directions of research. The essence of the difference resides in overemphasis on the theoretical model in one case, and a plethora of field data in the other case. For operational as well as alliterative reasons, let us pitch Bielefeld against Birmingham (though not so much Konstanz against California), and survey research performed by the groups there, particularly as crystallized in the shape of two recent books -- Sinclair & Coulthard (1975) and Petöfi & Rieser, eds. (1973).  One of the crucial statements to be made here, perhaps vulgarly derived from the fact that the two directions of research do not quote each other at all, is the following:

PROPOSITION 2: Whereas Text Linguistics (TL) is definitely a model-oriented approach to language units larger than the sentence, Discourse Analysis (DA) evinces many of the characteristics of a data-centred approach to much the same language units.

PROPOSITION 3: A data-centred approach to discourse means in the last analysis that a significant research result is obtained as an outcome of a study of a collection of instances, especially if these have been classified. (An excellent illustration of that statement is provided by Sinclair & Coulthard 1975).

PROPOSITION 4: A model-oriented approach to discourse takes as a research result only a generalization, which can be evaluated against rival generalizations, whether based upon many or upon few instances. (An excellent illustration of that statement is provided by Petöfi and Rieser, eds. 1973).

        Alternative (or complementary) ways of proving the correctness of the above statements are provided both via quantification of data in already published material and the qualitative assessment of the nature of such data (cf Section 5).

4. The status of certain theoretical constructs. The theoretical construct text grammar, which characterizes most research on the Continent, postulates (rather than demonstrates) that texts evince grammar in much the same way in which sentences evince grammar.

PROPOSITION 5: Whereas the discourse analyst has as his major goal the detection and description of discourse structure, the fundamental concern of the text linguist is "the formal set-up of an empirically adequate text grammar" (Petöfi in Petöfi & Rieser eds. 1973 : 1).

        This happens to be a very important point of difference between TL and DA in that the meta-category labelled grammar is assigned completely different areas of operation. Here first is one representative point of view regarding the status of this theoretical construct:

 ... "sentence" is regarded as the highest unit of grammar. Paragraphs have no grammatical structure; they consist of a series of sentences of any type and in any order.   [...]    there are no grammatical constraints [...] at the level of discourse. (Sinclair & Coulthard 1975 : 20)

        On the other hand, and pitched clearly against that statement, the text linguist makes it a profession of faith to declare that --

The set-up of the text grammar is based on the hypothesis that text grammars can and should be regarded as generalized (and expanded) sentence grammars. (Petöfi in Petöfi & Rieser eds. 1973 : 8).

        Another highly controversial issue is that of directionality (cf eg Eliasson 1973); TL operates on the firm assumption that texts evince a deep structure in addition to their  linear and manifest surface structure:

... the sequence of operations [...] is not unidirectional, it leads, on the one hand, from deep structures to the semantic representation and, on the other hand, from deep structures to surface structures. This model accounts for the possible readings of the deep structure, but it does not explain the syntactical ambiguity of the linear manifestation nor does it allow to enumerate the admissible paraphrases of the semantic representation. (Petöfi in: Petöfi & Rieser eds. 1973 : 9)

        DA expresses doubts, for the time being at least, about the profitability of postulating a deep structure in delineating the systematic patterning of discourse: certain investigations completely replace deep structure by semantic interpretation, others advance the suggestion that it is far too early to make pronouncements on matters of such capital importance, which have so far received insufficient empirical investigation.

PROPOSITION 6: The question of directionality (deep to surface versus surface to deep) is central in TL, and almost non-existent in DA.

5. The nature of the data. Having seen that everything distinguishes and differentiates DA from TL except the actual language span under investigation, it would seem that the two approaches deal with the same kind of data. 

PROPOSITION 7: Though the data happens to be identical, it is only the theoretical models imposed upon it that are widely different.

        That might have been true within the restricted frame of reference of sentence linguistics, but it is not at all valid within the far wider and considerably more complex frame of reference of discourse / text: it is the very nature of discourse data that provides the Wittgensteinian labyrinth.

        The history of linguistics is replete with instances of sudden changes of attitude as to what actually and ultimately is to be considered language data (cf Sandulescu 1975a : 128). For purposes of simplification again, we are going to divide discourse data into two major categories -- namely, we distinguish clearly between discourse data which is spontaneous and authentic (that is, not expressly produced for didactic or research purposes), or what we prefer to call (a) token data, and, at the other end of the scale, (b) type data. This distinction is fundamentally based on the type / token dichotomy, symmetrically paralleled by the emic / etic filtration. Category (a) ultimately refers to researcher-external data, whereas category (b) covers researcher-internal data, clearly including the 'intuitive' variety; in other words --

PROPOSITION 8: Type data is competence data, whereas token data is performance data.

        This kind of explanation, however, would at once lead us to a discussion of the merits and demerits of one theoretical model over another; the theory is summarized, quite imperfectly, by the following dichotomies: (Saussure:) Langue / Parole; (Hjelmslev:) System / Process; (Chomsky:) Competence / Performance; to which one might parallel (Herdan:) Type / Token; and (Pike:) Emic / Etic.

        Dressler (1972 : 13) makes the extremely important and highly relevant point that a texteme is an emic text. Such a way of solving the issue would be quite misleading in that it shifts the focus of attention from the actual nature of the data to the ultimate essence of the theoretical model resorted to. This binary distinction is in consequence not only an aspect of the model, but also affects the very nature of the data. It is a notorious way of doing away with 'redundant' discourse material such as silence fillers (the ultimate role of more or less deliberate interjections, hesitations and slip-of-the-tongue phenomena, and variable stretches of either silence or discontinuity within the semantic interpretation of discourse structure has not even been touched upon by current research; analysing the structure of spoken discourse, it is clearly noticed that many such phenomena do go down into the semantic interpretation and / or deep structure).

        This whole discussion revives the questions:

  1. Is a competence model in any way superior to a performance model ?

  2. Does the addition of a pragmatic component to the model in any way modify the competence / performance balance ? 

        This is indeed an acute dilemma for the future of both TL and DA; thus the following three alternatives present themselves: 

PROPOSITION 9: The imposition of pragmatic information upon discourse structure automatically modifies, in the opinion of some, the nature of the theoretical model, turning it into a radically performance model; the latter, it is conceded, is not at all inferior to, but rather on a par with, the so familiar competence model of generative grammar. 

PROPOSITION 10: The emergence of a pragmatic component within the model carries with it the placement of competence and performance at the two extremes of a cline (or scale) separated (or united) by an infinity of intermediate stages variously called: communicative competence (Hymes); textual competence (Van Dijk); rhetorical competence (Corder); abstract performance; performative competence; Performanzkompetenz (Harig & Kurz); performance with zero competence (Gaeng); grammatical competence (Campbell & Wales); meta-competence; pragmatic competence (Wunderlich, Habermas), etc.

PROPOSITION 11: A clearcut distinction between competence and performance along the lines suggested by Chomsky (1965 : passim), in the steps of Saussure (without his ever formally acknowledging that!), is not exactly useful for the specific investigation of discourse structure and is liable to lead to crisis rather than to the straightforward solution of undecided issues; in consequence, it had better be dropped.

        By way of illustration of this very last view here is how one particular research project formulates the decision-making processes involved in the selection of both field methods and research strategies:

...it was not possible to take over the distinction between competence and performance in any terms similar to its characteristic use in syntax. (Sinclair & Coulthard 1975 : 120)

6. Data authenticity: binarity or continuum ? The nature of linguistic data has not exactly been a favourite topic of discussion among linguists, particularly ever since Chomsky branded the notion of corpus. Avoiding it, however, we have distinguished between type data and token data, as two distinct categories of discourse material, both of them equally profitable (though hierarchically differentiated) particularly at the present stage of research into the nature of discourse, when we are still fumbling in our quest for the major theoretical constructs of the subdiscipline. Type data is but seldom given real data status in that it never evinces authenticity (and/or spontaneousness) and hardly ever meets data quality control requirements (cf Naroll 1962). In addition to 'intuitive' data, this category also includes the 'normative' data, as provided by, say, standard national dictionaries, and which represents the slightly more objectified material that is vital for a wide range of computational studies. Thus, it may be quite profitable to operate with type data at a level of description and investigation situated below the sentence. However, --

PROPOSITION 12: It is an illusion of profitability to operate with type data in advanced research at a level beyond the sentence.

        For it is particularly at this point in the investigation that token data becomes a necessity. By token data we understand spontaneous and authentic, externally and objectively validated data, but not necessarily belonging to a closed corpus. It is token data that is the direct outcome of (a) field work, and (b) active observation in discourse analysis, conversational linguistics, psycholinguistics, sociolinguistics, cultural (or social) anthropology, etc. It is also token data that lies at the basis of the most solid case studies in these fields.

        Transcriptive procedures of recorded field data provide means to operate variation on the cline between type and token in analysing discourse: all transcription of discourse is ultimately a process of abstraction with a lot of information that is left out, existing with equal rights by the side of the information that is put in, a process quite analogous to the shift from performance to competence. In a very rigorous sense, the only faithful 'transcript' of a recording is the recording itself.

PROPOSITION 13: No token data is a hundred per cent token data on the ideal scale, as it has already undergone a process of abstraction of one kind or another, which has brought it away from its token essence, and pushed it to some extent down the cline towards type data.

        Very often this abstraction push may be quite negligible; but in the analysis of discourse, the presuppositional patterns brought about by, say, segments of silence (usually neglected in the transcripts) may operate considerable modifications upon the semantic interpretation of the segment. The investigation of discourse silence, though not so far forming the habitual hunting ground of the linguist, may gradually be forced upon him by the increased significance acquired by semantic and pragmatic parameters over the purely syntactic ones (cf Sandulescu 1975b).

7. Spoken versus written. There is still another topic that requires attention as part of our TL/DA comparison, namely the mode of discourse (cf Spencer & Gregory eds 1964 : 87ff).

PROPOSITION 14: Whereas TL proceeds on the Scale of Mode (or Medium) of discourse from [- SPOKEN ], and hopefully towards [+ SPOKEN ], DA (with very few exceptions) proceeds on the same Scale of Mode (or Medium) from [+ SPOKEN ], but hardly ever reaching [- SPOKEN ].

        In other words, there is distinct polarization on the mode dimension too, with TL almost exclusively focusing on the [- SPOKEN ] end of the scale, and DA evincing a tendency to focus on the [+ SPOKEN ] end of it. In spite of the open professions of faith, the focus of interest is rather static, and statistically speaking, evinces but little genuine tendency to move up or down the mode scale. One fleeting justification of the fact that TL carries a [- SPOKEN ] focus of research could well be that the investigation of 'written discourse' implicitly carries with it a higher level of abstraction and a considerably greater degree of logical organization; it is certainly closer to type data and lends itself more easily to giving a more pregnantly concrete image of what 'competence' really looks like.

8. Retrospects and prospects: sights and insights. The fundamental question which we have not yet managed to give a clear answer to is still the following: Are we faced with two distinct subdisciplines of linguistics, TL and DA, accidentally ignoring each other, or is it one and the same subdiscipline evolving along diverse lines in diverse research settings ? We tend to give a simple answer and say that it is one and the same subdiscipline dealing with language units larger than the sentence, be they spoken or written. A survey of the points of permanent difference and the points of possible rapprochement shows that the terminological gap is there partly because discourse has no suitable and adequate equivalent in several European languages, and text thereby provides a too facile and easy replacement, partly because hardly any language could match German string compounds...

        The theoretical attitudes, too, may well remain divergent: one aiming at lofty theory, the other quite content to stay in the lower spheres of practicality and practicability, and cater primarily for the further training of the average classroom teacher of the national language. But an important point of convergence is the fact that TL is quite aware of the acute need to expand -- both quantitatively and qualitatively -- the range of the data that is being taken in; sooner or later, basically on account of pressure from outside, it will manage to incorporate, we hope, an acceptable spectrum of discourse data. There is still a question, however, that the present paper cannot hope to give an answer to, namely -- To what extent is there a chance of sensible and sustained dialogue between the two approaches ? The extent to which TL and DA will go on ignoring each other is a problem the solution of which depends ultimately on the amount of pressure put by societal research tasks upon the subdiscipline in question. The other worrying point regarding TL is that it tends to deprive conventional linguistics of its fundamental functions: "Ausgangspunkt einer Phänomenologie des linguistischen Objekts ist die Texthaftigkeit des originären sprachlichen Zeichens" [The starting point of a phenomenology of the linguistic object is the textuality of the original linguistic sign], says P. Hartmann (1971 : 12), and S. J. Schmidt (1973 : 10) adds: "Diese Entwicklung hat sich über einige bedeutsame Stadien in der Forschungsgeschichte der Linguistik vollzogen, die unter dem Schlagwort zusammengefasst werden können: 'Von der Satzgrammatik zur Textgrammatik' " [This development has taken place through several significant stages in the research history of linguistics, which can be summed up under the slogan: 'From sentence grammar to text grammar']. All this boils down to the following postulate:

PROPOSITION 15: TL is a more comprehensive approach to language study than conventional linguistics, both more suitable and more complete; as such, it is bound to replace conventional linguistics by ousting and eliminating narrow sentential preoccupations.

        DA approaches, on the other hand, seem to be far more modest on this particular point in that they express a clear awareness of distinct goals and specific targets, and consequently pledge non-interference in the internal affairs of sentential linguistics, with the newly emerging subdiscipline of DA modestly and peacefully coexisting alongside its more venerable and far more respectable counterpart.

9. Pragmatics -- the gate-crasher. Immoderately high hopes have in the last few months been placed on pragmatics, proof thereof being the very name of the Congress section in which this paper has been given (it must be said, by way of digression, that its actual name -- Pragmalinguistics -- is a misnomer derived from obvious theoretical confusion; what is probably meant is 'linguistic pragmatics', as opposed to the pragmatics of non-verbal communication). But by pragmatics, different approaches understand different things: a good illustration is the question sometimes asked, ludicrous as it is, whether pragmatic presuppositions belong to semantics at all. It is only too often forgotten (Chomsky himself having initiated this voluntary oblivion...) that language is but another system of signs; and all semiotic systems, language too therefore, are characterized by three distinct components -- a syntactic component, dealing with item sequentialization (both below and above sentence level), a semantic component, dealing with problems of meaning, and a pragmatic component, dealing with all the 'inter-personal' aspects of the communication act and the communication situation (cf the interpersonal function of language discussed by Halliday (1973 : 105ff), alongside the 'ideational' and 'textual' functions). As such, all genuine linguistics is pragmatics-oriented:

PROPOSITION 16: To the extent linguistics does not evince a pragmatic component clearly, it is bound to remain pseudo-linguistics.

        One of the fundamental tasks of pragmatics in assisting discourse analysis is to establish a consistent situational typology on the basis of the following factors: role interplay, role relationship and size, role structure (quantity and quality), channel range, channel switch, repertoire range, code availability, turn taking, turn rules, turn claiming, turn yielding, turn suppressing, turn opening, turn closing etc. Most micro-sociolinguistic investigations can thus be bracketed under pragmatics. One further task of pragmatics in order to promote the structure of discourse mapping is to provide a consistent and harmonious integration of the verbal into the non-verbal, and of the non-verbal into the verbal (which any Chomsky-based approach would label as patent heresy...):

PROPOSITION 17: Both TL and DA have so far lamentably failed to provide a solid bridge between the verbal and the non-verbal, this being the outcome of the total disregard of the concrete semiotic parameters of the communication act.

        To go on speaking about the English used by teachers and pupils (or doctors and patients, for that matter) without a minimally rigorous analysis of the pragmatic factors characterizing the particular communication situation, may and does look to TL both subscientific and amateurish, and boils down to a random collection of situations reminiscent of conversation manuals. To go on talking, however, about vague, abstruse, and unnecessarily abstract theoretical constructs (cf Lewandowski 1975 : 733-69, for a relatively representative sample) with insufficient empirical substantiation sounds both futile and impractical, when confronted with the more concrete DA research targets. The present critical survey of the literature related to both TL and DA emphasizes that what is needed in order to bridge gaps like the one separating lofty TL from pedestrian DA is solid and extensive descriptions of the various pragmatic parameters of the respective communication acts and situations as well as an outline of the specific features of role structures.

10. Conclusions. This paper has been an attempt to bridge gaps.

10.1    Both TL and DA, as distinct approaches within the same subdiscipline, suffer acutely, but in different ways, from what Roman Jakobson (1974, 1975) has called 'glottocentrism' in that they evince a total inability to cope adequately, suitably and harmoniously with extra-linguistic information.

10.2    DA is a data-bound approach, deriving its theoretical constructs directly or indirectly from previously collected token data; TL is an almost exclusively model-centred  approach in that it fits the data to a theoretical model, which ultimately is an adapted extension of sentence linguistics.

10.3    TL views text as an abstract object (cf Petöfi 1973 : 3); in the other approach, 'discourse is produced in real time' (cf Sinclair 1975 : 34).

10.4    Both approaches are characterized by theoretical inconsistencies of various types, ultimately caused by the absence of a background model wider than linguistics (as this discipline has so far been understood). This leads to the incapacity to accommodate  'non-glottocentric' types of information.

10.5    The data-centrists happen to be generally permissive and tolerant towards other approaches; the model-centrists are fundamentally exclusivistic (e.g. Rieser 1975). This statement has ultimately to do with the sociology of linguists, that is, who reads whom, and for what purposes; the philosopher of science Thomas Kuhn (1962) would modify this statement to cover also who quotes whom and to what ends.

10.6    One approach has the conviction that it caters for highbrow linguistic tastes, whereas the other approach is quite aware of the fact that it aims primarily at the lowbrow.

10.7    In terms of future research -- that is, long-term perspectives as well as short-term ones --, we could possibly distinguish two tendencies (or trends) in DA: (a) a linguistics- and language-based trend, and (b) a semiotics-based approach with powerful emphasis on verbal/non-verbal (cf S.W.P.D.A., No.1 (September 1974), p. 17).

10.8    In the foregoing discussion, we have presented TL and DA as homogeneous approaches, which very obviously they are not. Though we are quite aware of certain oversimplifications, it must be emphasized that it is part of the endeavour to extract the homogeneous features out of relatively heterogeneous and diverse material.

