Document Type : Research Article
Authors
1 PhD Candidate , Department of English, Faculty of Persian Literature and Foreign Languages, University of Tabriz, Tabriz, Iran
2 Professor, Department of English, Faculty of Persian Literature and Foreign Languages, University of Tabriz, Tabriz, Iran
3 Assistant Professor, Department of English Language and Literature, faculty of Literature and Humanities, Urmia University, Urmia, Iran
Introduction
Researchers in the areas of applied linguistics and instructed second language acquisition have always been interested in measuring second language performance. Brumfit (1979) proposed accuracy and fluency as two important aspects of language use. Skehan (1998) added complexity as a third aspect, and thereby the triad of complexity, accuracy, and fluency (CAF) was introduced as the three fundamental dimensions that characterize second language performance. These three components have proved useful measures of second language performance (Skehan, 2009b).
Ellis (2003) offers the following working definitions for the three dimensions. Complexity refers to the extent to which the language produced by the learners is elaborate and varied. It is divided into syntactic and lexical complexity. Accuracy is defined as the extent to which the language produced by the learner conforms with target language norms. Fluency refers to the extent to which the language produced by the learner manifests pausing, hesitation, or reformulation. Skehan (2009b) characterizes successful task-based performance as containing “more advanced language, leading to complexity; a concern to avoid errors, leading to higher accuracy if this is achieved; and the capacity to produce speech at normal rate and without interruption, resulting in greater fluency” (p. 510).
In terms of the cognitive underpinnings of CAF, complexity and accuracy are associated with the current state of the learner’s L2 knowledge representation and restructuring, while fluency is related to control and automatization of L2 knowledge (Housen, Kuiken, & Vedder, 2012; Skehan, 2009b). Two competing models have been proposed to account for the complexity, accuracy, and fluency of L2 learners’ production in task performance. The Trade-off Hypothesis (Skehan, 1998, 2003) argues that humans have a limited processing capacity and that attending to one dimension of language production may take attention away from the others. According to the Trade-off Hypothesis, raised performance in one dimension may be achieved at the expense of performance in other dimensions. This competition shows itself most prominently in the prioritization of meaning (complexity) over form (accuracy) in tasks that are cognitively more demanding. In contrast, the Cognition Hypothesis (Robinson, 2001, 2003, 2005) argues that human attention resources are multiple and that speakers have the capacity to handle different demands on their attention simultaneously. As a result, complexity and accuracy can go together. Testing these two rival models has proven difficult, in part due to the lack of conceptual and operational clarity of the dependent variables (Housen et al., 2012). Therefore, the results of empirical studies so far have not been consistent and do not unequivocally support either of the two models (Robinson, 2011; Robinson & Gilabert, 2007; Skehan, 2009b).
In the past two decades, investigating the effects of such independent variables as task complexity (e.g., Frear & Bitchener, 2015; Kuiken & Vedder, 2008), task type (e.g., Olinghouse & Wilson, 2013; Yoon & Polio, 2017), task repetition (e.g., Bygate, 2001; Lynch & McLean, 2001; Thai & Boers, 2016), and planning (e.g., Ellis & Yuan, 2005; Yuan & Ellis, 2003) on the complexity, accuracy, and fluency of second language learners’ linguistic performance on pedagogical tasks has been a thriving area of research. In the Iranian context, too, several researchers have tried to investigate the effect of manipulating cognitive task complexity on L2 learners’ performance (e.g., Ahmadian & Tavakoli, 2011; Ahmadian, Tavakoli, & Dastjerdi, 2015; Birjandi & Alipour, 2010). Regarding the effect of task type on L2 learners’ performance on tasks, Skehan (2009b) states that earlier research within a CAF framework confirmed generalizations such as the following:
One shortcoming of the research done so far on task performance is that it has focused mostly on the syntactic aspect of complexity, with very few studies investigating the lexical aspect of this performance area. Skehan (2009b) states that lexis has been strikingly absent in task research and that it is vital to incorporate some measures of lexis into task performance. The three dimensions of complexity, accuracy, and fluency thus need to be supplemented by measures of lexical performance. The range of measures also needs to be widened to cover this additional area. Most of the studies conducted in the area of task performance have used only lexical diversity as the measure of lexical performance. We also need to consider how different measures of lexical performance correlate and how the lexical measures relate to other measures – whether, for example, they relate to syntactic complexity, accuracy, or neither (Skehan, 2009a). It is important to investigate the relationship between lexical complexity and syntactic complexity since one can debate whether it is better to consider lexis as a separate area, or whether it is sufficient to include it within complexity, so that structural complexity and lexical complexity would be considered to be different aspects of the same performance area (Skehan, 2009b).
Review of the Literature
Linguistic Complexity
The complexity component of the CAF triad is divided into lexical and syntactic complexity. Lexical complexity is a multidimensional feature of language use which encompasses diversity, sophistication, and density (Wolfe-Quintero, Inagaki, & Kim, 1998; Read, 2000). Research into lexical measures also makes a distinction between text-internal and text-external measures (Daller, Van Hout, & Treffers-Daller, 2003). The text itself is enough to calculate text-internal measures, while text-external measures require some sort of reference material, which is usually based on word frequency lists. Lexical diversity is an example of a text-internal measure and is typically measured through some sort of type-token ratio (TTR). A serious problem with TTR measures is that they are affected by text length or sample size, so a correction has to be made (Malvern & Richards, 2002). A generally accepted measure of lexical diversity is D (Malvern & Richards, 2002; Richards & Malvern, 2007), which is calculated by the VOCD sub-routine within Computerized Language Analysis (CLAN) (MacWhinney, 2000). For the present, D seems to be the best measure of lexical diversity (Jarvis, 2002; McCarthy & Jarvis, 2007). One may ask what such a measure measures. “D provides an index of the extent to which the speaker avoids the recycling of the same set of words. If a text has a lower D, it suggests that the person producing the (spoken or written) text is more reliant on a set of words to which he or she returns often.” (Skehan, 2009a, p. 108).
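The text-length problem with TTR can be illustrated with a minimal sketch. The tokenizer and the sample sentence below are illustrative assumptions and are not taken from the study. The sketch computes only the raw type-token ratio, not D, which relies on the curve-fitting procedure implemented in VOCD.

```python
def tokenize(text):
    # Naive whitespace tokenizer (an assumption for illustration):
    # lowercases each word and strips surrounding punctuation.
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return [w for w in words if w]


def type_token_ratio(tokens):
    """Return the number of distinct word forms divided by the number of tokens."""
    if not tokens:
        raise ValueError("cannot compute TTR for an empty text")
    return len(set(tokens)) / len(tokens)


# Hypothetical sample: the same vocabulary repeated three times.
short_sample = tokenize("The cat sat on the mat.")
long_sample = short_sample * 3

ttr_short = type_token_ratio(short_sample)  # 5 types / 6 tokens
ttr_long = type_token_ratio(long_sample)    # 5 types / 18 tokens
print(round(ttr_short, 3), round(ttr_long, 3))
```

Because the longer sample recycles the same five word types, its TTR falls even though the vocabulary is unchanged, which is the kind of length effect that a corrected measure is meant to address.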
In contrast, measures of what is called lexical sophistication (Read, 2000) take frequency lists from corpus analysis and then compute how many words defined as difficult are used in a text, with difficulty being defined on the basis of lower frequencies. Laufer and Nation’s (1999) Lexical Frequency Profile is the most well-known measure of this sort. The profile provides information on the number of words in a text drawn from the 1000 word levels, the number drawn from the 2000 word levels, and so on. It enables a judgement to be made regarding the extent to which very frequent words are relied upon less. An alternative measure is P_Lex developed by Meara and Bell (2001), which uses a mathematical modelling procedure. It divides a text into ten-word chunks and computes the number of infrequent words in each ten-word chunk.
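The chunking step described for P_Lex can be sketched as follows. This covers only the division into ten-word chunks and the count of infrequent words per chunk; the mathematical modelling that yields the final score is not reproduced. The frequent-word set is a tiny hypothetical stand-in for a real frequency list, and dropping a trailing incomplete chunk is an assumption of this sketch.

```python
def infrequent_counts_per_chunk(tokens, frequent_words, chunk_size=10):
    """Count the words outside `frequent_words` in each complete chunk."""
    counts = []
    # Only complete chunks are scored; a trailing partial chunk is dropped
    # (an assumption of this sketch).
    for start in range(0, len(tokens) - chunk_size + 1, chunk_size):
        chunk = tokens[start:start + chunk_size]
        counts.append(sum(1 for w in chunk if w not in frequent_words))
    return counts


# Hypothetical stand-in for a frequency list.
FREQUENT = {"the", "a", "man", "saw", "dog", "in", "park",
            "and", "it", "was", "very", "big"}

sample = ("the man saw a dog in the park and it "
          "was very big and the dog saw a ferocious badger").split()

chunk_counts = infrequent_counts_per_chunk(sample, FREQUENT)
print(chunk_counts)
```

In this 20-word sample the first chunk contains no infrequent words and the second contains two ("ferocious" and "badger").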
Lexical density is defined as the ratio of lexical words (or content words) to the total number of words (Ure, 1971). Lexical words include nouns, adjectives, verbs, and adverbs; they give a text its meaning and provide information regarding what the text is about. Other kinds of words such as articles, prepositions, and conjunctions are more grammatical in nature and give little or no information about what a text is about. These non-lexical words are called function words. Lexical density is simply a measure of how informative a text is. Spoken texts tend to have a lower lexical density than written ones (Halliday, 1985).
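As a minimal sketch of this ratio, the example below treats every token that is not in a small function-word list as a lexical word. The list and the sample sentence are hypothetical; a real analysis would rely on part-of-speech tagging rather than a hand-made list.

```python
# Hypothetical, deliberately small list of articles, prepositions, and conjunctions.
FUNCTION_WORDS = {"the", "a", "an", "of", "in", "on", "to", "with",
                  "and", "but", "or"}


def lexical_density(tokens, function_words=FUNCTION_WORDS):
    """Return the proportion of tokens that are lexical (content) words."""
    if not tokens:
        raise ValueError("cannot compute lexical density for an empty text")
    lexical = [w for w in tokens if w not in function_words]
    return len(lexical) / len(tokens)


sample = "the student wrote a long essay on the history of language".split()
density = lexical_density(sample)  # 6 lexical words out of 11 tokens
print(round(density, 3))
```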
The other aspect of complexity, syntactic complexity, is defined as the range of forms and the degree of sophistication that surface in language production (Ortega, 2003). A wide range of measures has been proposed to cover different subcomponents of syntactic complexity, which include length of production, sentence complexity, subordination, coordination, and the use of particular grammatical structures (Bulté & Housen, 2012).
Empirical Studies on the Relationship between Lexical and Syntactic Complexity
Using data from six studies, Skehan (2009a) conducted a meta-analysis on the relationship between lexical diversity and lexical sophistication and the relationship between these two aspects of lexical complexity and other aspects of performance, such as syntactic complexity, accuracy, and fluency. The six studies used a range of task types and task characteristics, falling into one of three categories: personal information exchange; narratives, either based on picture stories or on a video; and decision-making, where students were required to make decisions. The six studies used to form the basis for the meta-analysis were Foster and Skehan (1996, 2013), Skehan and Foster (1997, 1999, 2015), and Foster (2001). The studies used D as the measure of lexical diversity and lambda as the measure of lexical sophistication. Syntactic complexity was operationalized as the mean number of clauses per ASU, which is an index of subordination.
In a longitudinal case study, Kalantari and Gholami (2017) explored the development of lexical complexity over a period of six months in essays written by five intermediate to advanced Iranian EFL learners. They also investigated the correlations among lexical complexity indices. The results indicated that there was a positive correlation between lexical density and lexical sophistication. Lexical diversity, however, did not correlate significantly with either lexical density or lexical sophistication.
Regarding the relationship between D and lambda, Skehan (2009a) states that “the basic conclusion is unavoidable – the level of relationship between these two measures is very low at best, and more probably, non-existent” (p. 115). He concludes that lexical diversity and lexical sophistication are independent of one another. This conclusion applies to native speakers and non-native speakers alike, and across the three different task types.
As for the relationship of lexical measures to syntactic complexity, the patterns of relationships differed for native and non-native speakers. For non-native speakers, the relationship between lambda and syntactic complexity was mainly negative. This shows that, for non-native speakers, less frequent words are associated with lower syntactic complexity. “More varied lexis seems to cause problems for non-native speakers and provokes more errors while not driving forward complexity” (Skehan, 2009a, p. 116). For native speakers, the relationship between lambda and syntactic complexity was positive. Less frequent words seem to push native speakers to use more complex language, and native speakers seem to be able to handle the consequences of lemma retrieval without disruption (Skehan, 2009a). Finally, in the majority of cases D correlated negatively with syntactic complexity for non-native speakers. In contrast, no correlation was found between D and syntactic complexity in the performance of native speakers.
The present study aims to further explore the lexical aspect of task performance through investigating the relationship between different aspects of lexical complexity and how they are related to syntactic complexity in the speech monologues of Iranian EFL learners. We chose to explore speech and not writing because the two major hypotheses on task performance proposed by Skehan (1998, 2003) and Robinson (2001, 2003, 2005) are primarily developed to explain oral production and not writing performance. Also, we want our study to shed light on models of speaking and not writing. Monologues, and not dialogues, were elicited from the participants because mode affects task performance and because we were only interested in exploring the relationship between the variables of the study across different task types and not across different modes. The two measures of lexical complexity used in the current study are lexical diversity and lexical sophistication. Syntactic complexity is measured through mean number of clauses per ASU as an index of subordination. Specifically, the current study is an attempt to answer the following two research questions:
1. What is the relationship between lexical diversity and lexical sophistication?
2. What is the relationship between lexical complexity and syntactic complexity?
Methodology
Participants
The participants in the study were 35 high-intermediate learners of English as a foreign language in the adults’ department at a private language institute from two intact classes who participated in the study voluntarily. They ranged in age from 18 to 32 (mean = 21.97, SD = 3.75). Twenty-two (62.9%) participants were male and 13 (37.1%) were female. The participants had studied English for 12 semesters in the institute’s regular classes. They studied English two sessions a week, each session lasting two hours. They had also studied English as a school subject two hours a week for six years in the Iranian national education system. None of the participants had ever lived or stayed in an English-speaking country. All the participants in the study signed an informed written consent form.
Tasks
Three tasks were used to elicit spontaneous speech monologues from the participants. In the argumentation task, the participants were asked to respond to the question of whether money can make people happy. The description task required the participants to describe someone they enjoyed spending time with. In the narration task, the participants were first asked to inspect a series of cartoon pictures with no text and then to narrate a story based on the pictures.
Measures
Syntactic Complexity: Syntactic complexity was operationalized as the mean number of clauses per Analysis of Speech Unit (ASU), which is an index of subordination. The web-based L2 syntactic complexity analyzer developed by Professor Xiaofei Lu (Lu, 2011) at the Pennsylvania State University, available at www.aihayyang.com/software/l2sca, was used to measure the mean number of clauses per ASU.
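The arithmetic behind this index can be sketched as follows, assuming that each ASU in a transcript has already been coded for the number of clauses it contains. The clause counts are hypothetical, and the sketch does not reproduce the analyzer itself.

```python
def mean_clauses_per_asu(clause_counts):
    """Return the mean number of clauses per ASU for one transcript."""
    if not clause_counts:
        raise ValueError("at least one ASU is required")
    return sum(clause_counts) / len(clause_counts)


# Hypothetical transcript: five ASUs containing 1, 2, 1, 3, and 1 clauses.
sample_counts = [1, 2, 1, 3, 1]
subordination_index = mean_clauses_per_asu(sample_counts)  # 8 clauses / 5 ASUs
print(subordination_index)
```

A value of 1.0 would mean that no ASU contains more than one clause; values above 1.0 indicate increasing use of subordination.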
Lexical Diversity: D was used as a measure of lexical diversity. The VOCD subprogram available at www.textinspector.com was used to calculate D.
Lexical Sophistication: Lexical sophistication was operationalized as the percentage of words beyond the 2000 most frequent words based on Corpus of Contemporary American English (COCA) frequency lists. The lexical tools available at www.textinspector.com were used to calculate the number of types beyond the 2000 most frequent words. The number was then divided by the total number of types to obtain the lexical sophistication measure.
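The computation described above can be sketched as follows. The frequency list here is a tiny hypothetical stand-in for the 2000 most frequent COCA words, and the sample sentence is invented. The sketch reproduces only the final ratio of types, not the lexical tools themselves.

```python
def lexical_sophistication(tokens, most_frequent_words):
    """Return the share of word types that fall outside the frequency list."""
    types = set(tokens)
    if not types:
        raise ValueError("cannot compute lexical sophistication for an empty text")
    beyond = {t for t in types if t not in most_frequent_words}
    return len(beyond) / len(types)


# Hypothetical stand-in for the 2000 most frequent words.
TOP_WORDS = {"the", "dog", "chased", "across"}

sample = "the dog chased the squirrel across the meadow".split()
sophistication = lexical_sophistication(sample, TOP_WORDS)  # 2 of 6 types
print(round(sophistication, 3))
```

Counting types rather than tokens means that a repeated frequent word such as "the" contributes only once to the denominator.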
Procedure
Each participant was interviewed individually. The participants’ responses were recorded using a ZOOM H4 digital voice recorder with a connected microphone placed at a distance of five centimeters from the speaker’s mouth. The three tasks were presented to the participants in a counterbalanced order. For each task, the participant was given 30 seconds to plan his/her response, during which time they were not allowed to take notes. The recorded performances were then transcribed as Word documents. The Analysis of Speech Unit (ASU) defined by Foster, Tonkyn, and Wigglesworth (2000) as “a single speaker’s utterance consisting of an independent clause or sub-clausal unit, together with any subordinate clause(s) associated with either” (p. 365) was used as the unit of analysis. The transcriptions were pruned by removing false starts, repetitions, and self-corrections. The pruned transcriptions were then coded for the three measures that were used to operationalize the different aspects of the two dependent variables in the study.
Research Design and Data Analysis
This research project has a correlational design. The three measures whose correlations were analyzed were syntactic complexity, lexical diversity, and lexical sophistication. The data collected in the present study consist of nine subsets. For each of the three tasks, the transcriptions were coded for the three measures (lexical diversity, lexical sophistication, and mean number of clauses per ASU), so that each task yielded three subsets of data, making nine subsets altogether. Kolmogorov-Smirnov tests were run to test whether the data subsets were normally distributed. Since six out of the nine subsets of data were not normally distributed, it was decided to use the non-parametric Spearman correlation test to investigate the relationship between the variables. The significance level was set at 0.05 for all the statistical analyses run in the study.
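For readers unfamiliar with the Spearman test, the coefficient is the Pearson correlation computed on the ranks of the two variables, with tied values receiving the average of their ranks. The following is a minimal standard-library sketch of that computation. The scores are hypothetical, the function names are introduced here for illustration, and the sketch covers neither the Kolmogorov-Smirnov test nor the significance values reported in the study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for five speakers (not data from the study).
diversity = [40.1, 55.3, 36.7, 61.0, 48.2]
sophistication = [0.10, 0.15, 0.12, 0.22, 0.14]
rho = spearman_rho(diversity, sophistication)
print(round(rho, 2))
```

Because the coefficient depends only on rank order, it does not require the normally distributed scores that a Pearson correlation on the raw values would assume.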
Results
The mean scores for lexical diversity, lexical sophistication, and syntactic complexity for the three tasks and the overall average scores for these dependent variables are shown in Table 1. As can be seen, the highest mean lexical diversity score was obtained for the argumentation task, and the highest mean lexical sophistication score was obtained for the narration task. The argumentation task also yielded the highest mean score for syntactic complexity.
Table 1. Descriptive Statistics for the Mean Scores for Lexical Diversity, Lexical Sophistication, and Syntactic Complexity for the Three Tasks and the Overall Average Scores
Measure | argumentation | description | narration | average
Lexical diversity | 54.44 (11.59) | 54.03 (12.47) | 36.66 (9.29) | 44.38 (8.59)
Lexical sophistication | 0.12 (0.07) | 0.13 (0.08) | 0.19 (0.09) | 0.15 (0.07)
Syntactic complexity | 1.61 (0.04) | 1.41 (0.04) | 1.27 (0.02) | 1.43 (0.11)
Standard deviations are given in parentheses.
The Relationship between Lexical Diversity and Lexical Sophistication
The first research question concerned the relationship between the two aspects of lexical complexity, that is, lexical diversity and lexical sophistication. A Spearman correlation test was conducted to investigate the relationship between the two variables. The results are presented in Table 2. As can be seen, there is a significant positive correlation between the overall average scores for lexical diversity and lexical sophistication (rho = 0.58). However, the correlation pattern varies across the three tasks. The correlation is strong and significant in the narration task (rho = 0.70), while the weaker positive correlations in the argumentation task (rho = 0.32) and the description task (rho = 0.33) are not significant.
Table 2. Spearman Correlation Coefficients between Lexical Diversity and Lexical Sophistication for the Three Tasks and the Average Scores
Lexical diversity and lexical sophistication | argumentation | description | narration | average
Spearman’s rho | 0.32 | 0.33 | 0.700** | 0.58**
Sig. (2-tailed) | 0.059 | 0.056 | 0.000 | 0.000
N | 35 | 35 | 35 | 35
*Correlation is significant at 0.05 level (2-tailed); **Correlation is significant at 0.01 level (2-tailed)
The Relationship between Aspects of Lexical Complexity and Syntactic Complexity
The second research question concerned the relationship between the two aspects of lexical complexity and syntactic complexity. The results of the Spearman correlation test conducted to investigate the relationship between each aspect of lexical complexity and syntactic complexity are shown in Table 3.
As can be seen, the positive correlation between the overall average scores for the diversity aspect of lexical complexity and syntactic complexity is not significant. However, the correlation pattern varies across the three tasks. The positive correlation between the two measures is significant in the argumentation task and the narration task. In the description task, there is a negative correlation between the two measures, which is not significant. Overall, then, there is no significant correlation between lexical diversity and syntactic complexity, and the variation across the three tasks makes generalization difficult.
The overall scores for the other aspect of lexical complexity, i.e., lexical sophistication, and syntactic complexity do not correlate significantly either. Again, the correlation pattern varies greatly across the three tasks. There is a significant positive correlation between the two measures in the narration task, while the positive correlation between them is not significant in the description task. The negative correlation between the two measures in the argumentation task is not significant. Overall, there is no significant correlation between lexical sophistication and syntactic complexity, and the correlation pattern is not consistent across the three tasks.
Table 3. Spearman Correlation Coefficients between Aspects of Lexical Complexity and Syntactic Complexity for the Three Tasks and the Average Scores
Measure | Statistic | Syntactic complexity: argumentation | Syntactic complexity: description | Syntactic complexity: narration | Syntactic complexity: average
Lexical diversity | Spearman’s rho | 0.39* | -0.021 | 0.47** | 0.193
Lexical diversity | Sig. (2-tailed) | 0.019 | 0.906 | 0.004 | 0.268
Lexical diversity | N | 35 | 35 | 35 | 35
Lexical sophistication | Spearman’s rho | -0.068 | 0.166 | 0.390* | 0.158
Lexical sophistication | Sig. (2-tailed) | 0.699 | 0.339 | 0.020 | 0.366
Lexical sophistication | N | 35 | 35 | 35 | 35
*Correlation is significant at 0.05 level (2-tailed); **Correlation is significant at 0.01 level (2-tailed)
Discussion
We investigated the relationship between lexical diversity and lexical sophistication as two different aspects of lexical complexity and also how these two aspects relate to syntactic complexity. The non-parametric Spearman correlation test was used to measure the correlations between each pair of the three dependent variables in the study. D was used as a measure of lexical diversity, and lexical sophistication was defined as the percentage of words beyond the 2000 most frequent words based on COCA frequency lists. Syntactic complexity was defined as the mean number of clauses per ASU as an index of subordination, which is the most commonly used measure of syntactic complexity in the literature.
Regarding the first research question, the results of the correlation test run to investigate the relationship between lexical diversity and lexical sophistication showed that there was a positive correlation between the average scores for lexical diversity and lexical sophistication. However, the pattern of results was not consistent across the three tasks. Despite the strong positive correlation between the two aspects of lexical complexity in the narration task, the positive correlations between the two measures were not significant in the argumentation and description tasks. This suggests that lexical diversity and lexical sophistication are independent of each other, at least in the case of the argumentation and description tasks. In other words, the ability to avoid the recycling of words and to mobilize a wider range of words is not related to the ability to use less frequent words. For these two tasks, the results are in line with what was reported in Skehan’s (2009a) meta-analysis and the results obtained by Kalantari and Gholami (2017), where no significant relationship was found between lexical diversity and lexical sophistication.
As for the second research question, which concerned the relationship between the two aspects of lexical complexity and syntactic complexity, the pattern of results was not very clear. Neither of the two lexical measures correlated significantly with syntactic complexity in the case of the average scores for the three tasks. However, the positive correlation between lexical diversity and syntactic complexity was significant in the argumentation and narration tasks. The negative correlation between the two measures for the description task was not significant. These results taken as a whole are not consistent with what Skehan (2009a) reported in his meta-analysis of a database consisting of the six studies mentioned earlier. He found negative correlations between lexical diversity and syntactic complexity for non-native speakers in the majority of cases, while native speakers showed no correlation between lexical diversity and syntactic complexity.
The pattern of results for the relationship between lexical sophistication and syntactic complexity differed across the three tasks. There was a non-significant negative correlation between the two measures in the argumentation task. The positive correlation between the two measures was not significant for the description task but was significant for the narration task. These results again are not consistent with what Skehan (2009a) reported about the relationship between lexical sophistication and syntactic complexity for non-native speakers. In his data, the two measures correlated positively for native speakers, whereas the relationship was negative for non-native speakers. He argues that, for non-native speakers, using less frequent words seems to cause problems and provoke more errors, while it does not seem to drive syntactic complexity. “There seems, in other words, to be something of a toll for those who mobilize less frequent lexical items, in that the syntactic implications of such words derail, rather than build, syntax” (Skehan, 2009a, p. 116).
The inconsistencies between the findings of the present study and those reported in the meta-analysis by Skehan (2009a) could be attributed to the fact that the participants in the present study were relatively high in proficiency. The participants in the six studies comprising the dataset for the meta-analysis were intermediate and low-intermediate learners of English. It might well be the case that, as proficiency increases, the performance of second language learners approximates that of native speakers. If so, the positive correlation between lexical sophistication and syntactic complexity observed in the narration task in the present study could reflect the participants’ relatively high proficiency level.
Drawing on Levelt’s speaking model (1989), with its three major stages in speech production of conceptualization, formulation, and articulation, Skehan (2009a) argues that, for non-native speakers, the lexical choices implied by the preverbal message create difficulty for the formulator and disrupt syntactic planning and that “lexis does not drive syntax in the same way as with native speakers” (p. 117). As was mentioned earlier, non-native performance in terms of the relationship between lexis and syntax may become more nativelike as proficiency increases.
Conclusion
The picture that emerges from the results of the present study is that the two aspects of lexical complexity, namely lexical diversity and lexical sophistication, are not related to one another and that the two aspects of lexical complexity correlate differently with syntactic complexity. The general conclusion is that the relationship between lexis and syntax varies across different task types. Given the lack of correlation between the two aspects of lexical complexity and the weak correlation between the lexical and syntactic aspects of complexity, one of the implications of the present study is that more than one measure of linguistic complexity should be used to analyze and assess L2 learners’ performance. Also, different tasks and activities should be used in the classroom to develop the different aspects of linguistic complexity, which is an important part of the general notion of L2 proficiency and performance. One of the limitations of the present study is that we did not use participants across different proficiency levels. It could be illuminating to do research with higher proficiency levels, especially with advanced learners of English, to explore whether higher levels of proficiency are associated with a greater correspondence between lexis and syntax. Another limitation of this study is that only one measure of syntactic complexity, namely mean number of clauses per ASU, which is an index of subordination, was used to explore the relationship between lexis and syntax. Future studies can use measures that target sentential and phrasal levels of syntactic complexity, too. Also, future studies can investigate the relationship between the two aspects of linguistic complexity across different modes (monologic versus dialogic) and different mediums (spoken versus written).