Classification of EFL Students: EFL Teachers’ Criteria and a Case Study

Document Type: Original Article

Authors

Shahid Chamran University of Ahvaz

Abstract

Language learners have frequently been classified according to individual difference variables such as aptitude, personality, cognitive style, and motivation. However, a language teacher’s view seems to have been missing from such classifications. This exploratory research investigated whether and by which criteria Iranian EFL teachers classify their students. Based on preliminary interviews with 29 high-expertise Iranian EFL teachers, 21 criteria were identified and included in a questionnaire that was completed by 175 Iranian EFL teachers. The respondents almost unanimously agreed that they did classify their students according to their understanding of the character type, behavior patterns, and achievement patterns of their students. Then they rated the 21 criteria on a scale from 0 to 4 according to how important each classification criterion was for them. Factor analysis of questionnaire responses revealed six major classification criteria. Subsequently, in a case study, 26 EFL students in a typical Iranian high school class were asked to rate their classmates according to the six major criteria. Only five of the criteria were found to predict English achievement and Grade Point Average (GPA). A cluster analysis of the students’ peer ratings using the five criteria generated three clusters. An ANOVA revealed that the three clusters were accurately differentiated not only on the clustering criteria but also on the two non-clustering variables: EFL Achievement and GPA.

Keywords


Introduction

Language teachers often find themselves making classifications among their students and viewing them in terms of the characteristics of particular categories or types. Dörnyei (2014), for instance, acknowledges that when language teachers look at a classroom, they soon “notice typical learner behaviors” and “recognize learner types” and they are most probably able to categorize the learners’ performance under “typical forms of achievement” (p. 85). The acts of noticing, recognition, and categorization that Dörnyei talks about are very similar processes; they are to know what kind something is, and knowing what kind an object is enables us to know “what inferences we can make about it and what generalizations apply to it as a member of that kind” (Kendig, 2016, p. 1). When we categorize an object, we assign it the properties shared by other category members and thereby we can save time and intellectual resources. In fact, categorization is a fundamental process by which we understand, learn, make decisions, and interact with our environment (Harnad, 2005).

Classification and categorization, which are often (and throughout this paper) taken to be synonymous terms (Hjørland, 2017), seem to be frequently employed by teachers as they use labels to talk about their students (e.g. studious, talkative, disrespectful). For instance, labels such as “the attention-seeking”, “the unprepared”, or “the game players” have been used to capture challenging or disruptive behavior of college students in terms of student types (McKeachie & Svinicki, 2014; Seeman, 2010). Another conceptual categorization was suggested by Good and Power (1976), who believed their hypothesized fivefold student typology was readily identifiable by teachers: “success”, “social”, “dependent”, “alienated”, and “phantom” students. However, Richards and Lockhart (1996) downplay such classification systems as being arbitrary and only useful for emphasizing that “individual students may favor different interactional styles” (p. 146). More empirically-based typologies have been developed based on self-report measures of college student behaviors, attitudes, expectations, values, self-concept, or engagement in college activities (Aliaga, Kotamraju, & Stone, 2012; Astin, 1993; Cheong & Ong, 2014; Hu & McCormick, 2012; Kuh, Hu, & Vesper, 2000; Luan, Zhao, & Hayek, 2009).

With the existence of student types established and validated by educational researchers, one might wonder whether particular student types could be identified in language teaching contexts. Applied linguists have often classified language learners according to individual difference (ID) factors such as aptitude, personality, cognitive style, and motivation (Dörnyei & Skehan, 2003; Dörnyei, 2005). Individual learner characteristics have recently been conceptualized as complex constellations of interdependent variables that are in constant interaction with each other and the environment (Dewaele, 2013; Dörnyei & Ryan, 2015). This position is in line with the recent turn in second language acquisition (SLA) research toward a complex dynamic systems approach (see e.g., Dörnyei, MacIntyre, & Henry, 2015; Ellis & Larsen-Freeman, 2006; Larsen-Freeman & Cameron, 2008). Looking from this perspective, what could be particularly interesting for language teaching practitioners is to establish what combination of individual learner characteristics may contribute to effective L2 learning.

From a complex dynamic systems viewpoint, Dörnyei (2014) suggests that the interconnectedness of variables within one individual can affect the evolution of any single variable, thereby giving rise to a limited number of typical patterns of learner behavior. These patterns can be intuitively identifiable by experienced language teachers, and quite presumably they may be reflected in the way language teachers perceive student behavior patterns and categorize their students. What seems to be a useful research agenda then is empirically investigating whether and by which criteria student categorization is realized in the actual language teaching practice. After all, the classification of learner types by psychometric measures or self-report instruments, as mentioned above, may well be different from what language teachers experience in the classroom.

Furthermore, individual differences researchers have long tried to address the issue of pedagogic interventions (Biedroń & Pawlak, 2016; Gregersen & MacIntyre, 2014). Identifying the gaps between ID research findings and language teachers’ actual classroom practice could greatly assist language teaching researchers in pinpointing what combination of learner attributes may be either facilitative or debilitative to successful SLA, and in formulating practical pedagogical implications that help language teachers cater to various learner psychological profiles.

Much to our dismay, little research seems to have been conducted in this regard. Several simple and exact phrase searches on Google and Google Scholar with many different combinations of keywords such as “student/learner type”, “typology of EFL/ESL learners”, “teachers’ perceptions of EFL/ESL student types”, or “EFL/ESL teachers’ criteria for student categorization” returned almost no research papers specifically exploring a language teacher’s view of student classification. It seemed to the authors that there is a real phenomenon in the actual world of language teaching that has yet to be explored by systematic research. Therefore, to address this lacuna, the following research questions were posed in the Iranian context as an EFL example:

a. Do Iranian EFL teachers classify their students into distinct categories?

b. What criteria do they use for the classifications they make?

c. To what extent may the classifications made according to the identified classification criteria reflect actual differences among EFL students in a typical EFL class?

d. To what extent can the identified set of classification criteria be used to predict achievement in a typical EFL class?

 

Method

In view of the lack of an empirical framework to build upon, the most feasible research design was an exploratory study to try to better understand the problem, gain insights and familiarity, and develop preliminary ideas about the issue (Neuman, 2014; Labaree, 2017).

There are two parts to this investigation: a survey and a case study. In the survey study, a group of Iranian EFL teachers were interviewed to investigate whether and by which criteria they classify their students into categories. Then a questionnaire was administered to a larger sample of Iranian EFL teachers to verify how closely the identified categorization criteria accord with the common practice of Iranian EFL teachers.

Then a case study with a class of EFL students was carried out to investigate the applicability and utility of the identified classification criteria in making classifications that reflected actual variations among the students. Dörnyei (2007) and Labaree (2017) argue that case study is a useful research tool for testing whether a theory or model actually applies to phenomena in the real world.

 

Participants

Initially, a group of 29 Iranian EFL teachers (19 men, 10 women) participated in preliminary interviews. They were all well-versed teachers with more than 16 years of teaching experience, and they were identified as high-expertise teachers by the heads of English departments in the district where they taught, in Mashhad, Iran. Since heads of English departments are typically in direct contact with most EFL teachers in a district as part of their job description (The Comprehensive Directive for Educational Departments, 1998, p. 11), they were presumed to have firsthand information on the EFL teachers’ expertise.

Another group of participants were 175 EFL teachers in Iran, who filled out an online questionnaire. These EFL teachers were either members of various Iranian professional ELT groups formed on Telegram Messenger, or EFL teachers working in various high schools or private English schools in Iran. For consideration of availability, a quota sampling strategy was adopted (Ary, Jacobs, & Sorensen, 2010; Brown, 2013) with a relatively equal number of teachers from both major school types (i.e. public high schools and private English schools), from both sexes, and from different levels of experience.

For the case study, a typical class of 26 male EFL learners was selected. They were tenth-grade students of humanities studying in a typical state-run public high school in Mashhad, Iran.

 

Research Instruments and Procedures

The survey study consisted of preliminary short interviews and an online questionnaire survey.

 

The Interviews

The 29 EFL teachers were invited to short unstructured interviews to examine whether and by which criteria they classified their students in categories. The interviews were conducted in Persian, the participants’ native language. The teachers were all asked an opening question similar to this one: “When you enter a classroom to teach a group of EFL students whom you have taught for some time, say a month, do you often find groups of students with similar characteristics, and thus regard your students as more or less belonging to distinct categories?” Where the answer was positive, the teachers were asked what criteria they used for their categorizations. The interviews were digitally recorded and listened to several times and any criterion mentioned was written down.

 

The Questionnaire

Based on the results of the interview analyses, a questionnaire was developed. To ensure accurate responses and a higher response rate, research experts advise keeping the questionnaire short and only involving core concerns (Coombe & Davidson, 2015; Nielsen, 2004). To that end, similar criteria were combined into single composite variables such as being friendly/sociable/outgoing/energetic, or being touchy/impulsive/vulnerable/emotionally unstable. The intention was to avoid the fatigue effect (Dörnyei, 2010) by preventing the list of items from getting too long and causing monotony or boredom. Eventually, 21 criteria were included in the questionnaire (see Appendix 1 for an English translation).

The questionnaire was developed in Persian, as the use of the participants’ native language is believed to positively affect the quality of the obtained data (Dörnyei & Csizér, 2012). The questionnaire was published online using Google Forms. The link to the questionnaire along with a short introduction (similar to a cover letter) was posted in various professional groups of Iranian EFL teachers on the Telegram app. Alternatively, the questionnaire was orally introduced either by the first author or by pre-instructed associates, in a number of high schools or private English schools in Mashhad, Shiraz, Sari, and Abadan, Iran. After an oral presentation, the link to the questionnaire was given to those teachers who were willing to participate.

 

The Case Study

Initially the student participants were asked to rate their classmates according to the six major criteria identified in the factor analysis (see Section 3.1.1 below). Each student was provided with a chart (given in Appendix 2), where the six criteria were indicated on the horizontal axis and on the vertical axis there was a scale from 0 to 100 percent. Students were instructed to choose those of their classmates that they felt they knew well enough, assign them each a symbol of their own choice (e.g. shapes, or alphabet letters) and use that symbol to rate each individual on the chart. In order to enhance comprehension, the highest contributing items to each factor were given in parentheses to clarify the dimensions of the criterion.

To increase accuracy and mitigate the effect of probable initial misunderstanding of the criteria or instructions, about two weeks after the first rating session, each student was provided with an identical rating chart and a list of the individuals whom they had previously rated, and they were asked to rate the same individuals again. They were, however, allowed to additionally rate other classmates if they felt so inclined. During both rating sessions, the first author was present and clarified any ambiguities.

The average of the two rating values for each individual on each criterion was computed from the two rating charts collected from each student. The number of students that rated each individual varied depending on how many students felt they knew that individual well enough to rate him. However, no individual was rated by fewer than 5 classmates, and the mean number of raters per individual was 9.88, meaning that, on average, every individual was rated by about 10 classmates.

Finally, the mean ratings on each criterion for each individual were computed separately. These mean rating values are provided in Appendix 3 to allow for replication or scrutiny by interested researchers.

 

Results

The Questionnaire Results

Table 1 shows demographic information regarding gender, experience, and school type. The non-significance of the Chi square statistic for each variable indicates that the observed differences among the frequency counts are most probably only due to chance, and the questionnaire findings are not influenced by the respondents’ gender, experience, or school type.

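Table 1. Descriptive statistics

| Variable | Category | F | % | χ2 | N | df | Sig. |
|---|---|---|---|---|---|---|---|
| Gender | Female | 94 | 54.3 | 1.301 | 173 | 1 | .254 |
| | Male | 79 | 45.7 | | | | |
| Experience | 10 years or less | 47 | 28.3 | 2.325 | 166 | 2 | .313 |
| | 11 to 20 years | 63 | 38.0 | | | | |
| | 21 years or more | 56 | 33.7 | | | | |
| School Type | Public high schools | 92 | 53.5 | .837 | 172 | 1 | .360 |
| | Private English schools | 80 | 46.5 | | | | |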

F = frequency, % = percent, χ2 = Pearson Chi square statistic, N = valid number of respondents (excluding missing values), df = degrees of freedom, Sig. = significance value.
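The χ2 values in Table 1 can be reproduced from the reported frequency counts. The sketch below assumes a one-sample (goodness-of-fit) Pearson chi-square test against equal expected frequencies; the article does not name the exact procedure, so this is an inference from the numbers rather than a description of the authors' software. The function names are illustrative only, and the p-value helper relies on the closed forms that exist for df = 1 and df = 2.

```python
import math


def chi_square_equal_expected(counts):
    """Pearson chi-square statistic against equal expected frequencies."""
    n = sum(counts)
    expected = n / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)


def chi_square_p_value(statistic, df):
    """Upper-tail p-value; this sketch covers df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(statistic / 2))
    if df == 2:
        return math.exp(-statistic / 2)
    raise ValueError("this sketch only handles df = 1 or df = 2")


# Frequency counts copied from Table 1.
table_1 = {
    "Gender": [94, 79],
    "Experience": [47, 63, 56],
    "School Type": [92, 80],
}

results = {}
for variable, counts in table_1.items():
    stat = chi_square_equal_expected(counts)
    df = len(counts) - 1
    p_value = chi_square_p_value(stat, df)
    results[variable] = (stat, df, p_value)
    print(f"{variable}: chi2 = {stat:.3f}, df = {df}, p = {p_value:.3f}")
```

Under this assumption the statistics agree with the table to three decimals, which suggests that the test compared each variable's categories against an even split.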

 

Table 2 provides frequencies for the three questions in Part I of the questionnaire. As is observed, all three questions received a definitive positive response. All respondents, except for three, agreed that they normally arrived at some general understanding of the character type, behavior patterns and achievement patterns of their students. More than 95 percent said they did classify their students into specific categories. And more than 80 percent reported that, based on their perceptions about what category each student represents, they made predictions about the student’s probable level of achievement at the end of the course. As far as the present sample can be considered representative of the larger population of Iranian EFL teachers, the figures found in this section seem to provide a positive answer to our first research question and suggest that Iranian EFL teachers do classify their students in distinct categories.

 

 

Table 2. Frequency of answers to the three questions in Part I of the questionnaire

| Question | Answer | F | % | χ2 | N | df | Sig. |
|---|---|---|---|---|---|---|---|
| 4. As an EFL teacher, do you obtain a general understanding of the type of personality and behavior or achievement patterns of your students after a while into the course? | Yes | 172 | 98.3 | 163.206 | 175 | 1 | .000* |
| | No | 3 | 1.7 | | | | |
| 5. With this general understanding in mind, do you often categorize your students into specific types? | Yes | 167 | 95.4 | 147.126 | 174 | 1 | .000* |
| | No | 7 | 4.0 | | | | |
| 6. When you perceive a student as belonging in a particular category, do you normally make predictions, in your mind, about how successfully they might finish the course? | Yes | 143 | 81.7 | 72.092 | 174 | 1 | .000* |
| | No | 31 | 17.7 | | | | |

* Significant at p < .05 level.

 

Table 3. Means of the criteria in Part II of the questionnaire

| Criterion | N | Mean | SD |
|---|---|---|---|
| Perseverance/Studiousness | 164 | 3.62 | .639 |
| Attentiveness | 168 | 3.55 | .664 |
| Active Participation & Engagement | 167 | 3.46 | .766 |
| Interest (in learning English) | 166 | 3.42 | .788 |
| Proficiency (in English) | 163 | 3.39 | .878 |
| Politeness & Respectfulness | 164 | 3.37 | .807 |
| Preparedness | 168 | 3.36 | .806 |
| Motivation (for English learning) | 166 | 3.33 | .833 |
| Proper Classroom Behavior | 167 | 3.29 | .880 |
| Speaking Ability (in English) | 169 | 3.08 | 1.069 |
| Good Rapport with the Teacher | 168 | 2.98 | 1.041 |
| Self-confidence/Self-esteem | 168 | 2.98 | 1.089 |
| Being Friendly/Sociable/Outgoing/Energetic | 168 | 2.92 | 1.038 |
| Good Pronunciation | 169 | 2.89 | 1.077 |
| Effective Social Skills | 168 | 2.83 | 1.100 |
| Intelligence | 168 | 2.61 | 1.100 |
| Anxiety | 167 | 2.49 | 1.182 |
| Aggressiveness | 168 | 2.32 | 1.373 |
| Irritability/Vulnerability/Impulsivity/Emotional Instability | 168 | 2.28 | 1.105 |
| Playfulness | 168 | 2.18 | 1.144 |
| Good looks/Appropriate Appearance | 168 | 2.17 | 1.281 |
| Cronbach Alpha coefficient = .868 | 153 | - | |

SD = standard deviation

 

Table 3 presents the means and standard deviations for all the criteria in Part II of the questionnaire. The means have been arranged in descending order to allow better comparison. Clearly, the most important criteria for the EFL teachers were Perseverance/Studiousness, Attentiveness, and Active Participation & Engagement, followed by Interest and Proficiency. The least important criteria for the sample were Anxiety, Aggressiveness, Irritability/Vulnerability/Impulsivity/Emotional Instability, Playfulness, and Good Looks & Appropriate Appearance.

The internal consistency reliability of Part II of the questionnaire was assessed by the Cronbach Alpha coefficient (Dörnyei, 2010) and is presented in the last row in Table 3. The obtained value of .868 is well above the .70 threshold recommended by research methodologists and applied statisticians (Dörnyei & Csizér, 2012; Field, 2013; Hair, Black, Babin, & Anderson, 2010).
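For readers unfamiliar with the coefficient, a minimal sketch of how Cronbach's alpha is computed may help: alpha = k / (k - 1) × (1 - sum of item variances / variance of the total score), where k is the number of items. The ratings below are hypothetical toy data, not the questionnaire responses, and the function name is illustrative. The sketch presumes complete cases only, which is consistent with the N of 153 shown in the last row of Table 3.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])                       # number of items
    items = list(zip(*responses))               # transpose: one tuple per item
    item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variances / total_variance)


# Hypothetical ratings on a 0-4 scale (5 respondents x 4 items).
# These are NOT the study's data; they only exercise the formula.
toy_responses = [
    [4, 4, 3, 4],
    [3, 3, 3, 2],
    [2, 2, 1, 2],
    [4, 3, 4, 4],
    [1, 2, 1, 1],
]

alpha = cronbach_alpha(toy_responses)
print(f"alpha = {alpha:.3f}")
```

Sample variances (n - 1 in the denominator) are used for both the items and the total score; the ratio is unaffected as long as the same convention is used in both places.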

 

Exploratory Factor Analysis

In order to investigate whether the 21 criteria are reducible to a set of more inclusive and manageable criteria, an exploratory factor analysis with principal components factoring was executed. At the outset, cases with missing values were deleted listwise in a conservative strategy, and therefore the valid number of cases that remained in the analysis was 153. Although truncated, this sample size is still considerably larger than the minimum of 100 respondents that is recommended by some experts (Dörnyei, 2007; O’Rourke & Hatcher, 2013). All communalities were above the .60 threshold, with the exception of Aggressiveness, whose communality was only negligibly short (.597), and the mean communality (.711) was above the .70 threshold (cf. MacCallum, Widaman, Zhang, & Hong, 1999). Therefore, our sample size seemed to be adequate for the exploratory factor analysis.

 

Figure 1. The scree plot

Six factors had eigenvalues greater than one (“Kaiser’s rule”), but this may be an underestimation according to Jolliffe (2002). On the other hand, the scree plot was rather ambiguous, with inflexions that would justify retaining either 5 or 6 factors (Figure 1). Therefore, choosing the six-factor solution seemed like a middle ground option. Given that the 6 factors jointly explained 71.1% of the total variance, more than the stringent 70% threshold recommended by Stevens (2009), it was decided to retain 6 factors for interpretation.

To improve interpretability, the factor structure was transformed using Varimax rotation. Table 4 presents the factor loadings after rotation. As a rule of thumb, Tabachnick and Fidell (2013) recommend that only variables with loadings of .32 and above be interpreted, while Pituch and Stevens (2016) suggest that with relatively small sample sizes, it is sensible to set a more stringent threshold of .50, a strategy that seems more relevant to our relatively small sample size. Therefore, in Table 4, for ease of reading and interpretation, loadings below .30 have been suppressed, and loadings above .50 are shown in bold.

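Table 4. Rotated component matrix

| Criterion | Component 1 | Component 2 | Component 3 | Component 4 | Component 5 | Component 6 |
|---|---|---|---|---|---|---|
| Preparedness | .855 | | | | | |
| Active Participation & Engagement | .797 | | | | | |
| Attentiveness | .754 | | .313 | | | |
| Perseverance/Studiousness | .668 | | | | | |
| Anxiety | | .840 | | | | |
| Irritability/Vulnerability/Impulsivity/Emotional Instability | | .828 | | | | |
| Aggressiveness | | .765 | | | | |
| Playfulness | | .725 | | | | |
| Good Looks & Appropriate Appearance | | | .756 | .350 | | |
| Good Rapport with the Teacher | | | .729 | .358 | | |
| Politeness & Respectfulness | | | .658 | | | |
| Proper Classroom Behavior | .436 | | .649 | | | |
| Self-confidence & Self-esteem | .361 | | | .770 | | |
| Being Friendly/Sociable/Outgoing/Energetic | | | .313 | .756 | | |
| Effective Social Skills | | | .403 | .695 | | |
| Intelligence | | | | .535 | .451 | |
| Speaking Ability (in English) | | | | | .774 | |
| Proficiency (in English) | | | | | .728 | |
| Good Pronunciation | | | | | .687 | |
| Interest (in learning English) | | | | | | .830 |
| Motivation (for English learning) | .353 | | | | | .741 |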

Extraction method: principal components.

Rotation method: Varimax with Kaiser Normalization. Rotation converged in 11 iterations.

As is observable from Table 4, the variables with the highest loadings on factor 1 are Preparedness, Active Participation & Engagement, Attentiveness, and Perseverance/Studiousness. These items taken together denote diligence, self-discipline, and a sense of responsibility. One cannot but compare these qualities to conscientiousness as a major component of the Big Five model of personality, where it has been described as reflecting persistence, perseverance, dutifulness, reliability, organization, and serious engagement in goal-directed endeavors (Cervone & Pervin, 2013; Costa & McCrae, 1992; McCrae & Costa, 1997, 2008). On account of the striking similarity between dimensions of factor 1 and several facets of the trait conscientiousness as used in personality psychology, we named this factor “Conscientiousness” (C).

The variables that cluster around factor 2 are Anxiety, Irritability/ Vulnerability/ Impulsivity/ Emotional Instability, Aggressiveness, and Playfulness. Since the first three variables are related to affective domains, and Playfulness and Aggressiveness can be considered behavioral dispositions, we preferred to name this factor “Affectively-Induced Behavior” (AIB).

Factor 3 is best represented by Good Looks & Appropriate Appearance, Good Rapport with the Teacher, Politeness & Respectfulness, and Proper Classroom Behavior. The smaller contribution (.403) from Effective Social Skills is above .40, and is considered substantial and deserving a role in interpretation (Field, 2013). Since all these are related to how an individual behaves toward, relates to, and communicates with other people and especially with the teacher, within the social setting of the classroom or the school, we christened this factor “Appropriate Student Conduct” (ASC).

Factor 4 seems a bit more difficult to interpret as the contributing variables do not seem to revolve around a single concept. The variables Self-confidence & Self-esteem, Being Friendly/ Sociable/ Outgoing/ Energetic, Effective Social Skills, and Intelligence have the highest loadings on this factor. The two variables Being Friendly/ Sociable/ Outgoing/ Energetic, and Effective Social Skills seem to be related to an individual’s social behavior, relations, and interactions. In order to achieve a more representative name, we chose “Sociability, Self-confidence & Intelligence” (SSI) to name this factor. After all, not everybody’s name is attractive.

Factor 5 is best represented by the three proficiency-related items, Speaking Ability, Proficiency, and Good Pronunciation. When there are fewer than four variables loading on a factor, suspicions about factor reliability arise. The average of the four largest loadings on this factor is .66, which is above the .60 threshold recommended by Stevens (2009, p. 333), and therefore the factor can be considered reliable.

Although one may be tempted to opt for verbal proficiency for factor 5 because of the influence of Speaking Ability and Good Pronunciation, it may be a more justifiable position to take into account the comparatively minor, but still substantial, influence from Intelligence (.451) and view the factor as representing a more encompassing criterion. Therefore, the more inclusive term “English Proficiency” (EP) seemed to better fit the factor.

Finally, factor 6 is merely represented by two items: Interest and Motivation. Tabachnick and Fidell (2013) suggest that a factor with two variables should be considered reliable only when the variables are highly correlated with each other, but relatively uncorrelated with other variables. With information from the correlation matrix (not reported here), it seems that this requirement is fulfilled: Motivation shows the highest correlation with Interest (.641), with its correlation with the other variables in the range of or below .40. The same condition applies with Interest, where its highest correlation with any variable, other than Motivation, is .45 with Perseverance/Studiousness. Therefore, it seems justified to consider the sixth factor as a reliable factor.

The high correlation between Motivation and Interest, and their ending up on one factor is unsurprising as interest is considered a subset of motivation (Ainley, 2012). Interest has been referred to in the literature as a “unique motivational variable” (Hidi, 2006) whose function is to motivate the individual to explore the environment, to learn about it, and to develop a repertoire of knowledge, skills, and experience. It is therefore considered a major source of intrinsic motivation for learning (Hidi, 2000; Silvia, 2006, 2012; Ryan & Deci, 2017). Besides interest, however, there are a myriad of other emotional, cognitive, social, and physiological factors that affect motivation (Bernstein, Penner, Clarke-Stewart, & Roy, 2012). Therefore, in order to take account of the fact that motivation and interest are two different but related entities, we chose to include both terms in the name for factor 6: “Interest & Motivation” (IM).

Using the item means from Table 3, we may at this point be able to figure out an index of importance for the six composite classification criteria. If we compute the average of the item means for the major items contributing to each factor (i.e. items with factor loadings above .50), we can obtain an approximate measure of the importance of the composite criterion representing that factor. By the application of such a rough index, it was revealed that Conscientiousness was the most important criterion for the sample EFL teachers with the highest average item mean of 3.50, and it was followed by Interest & Motivation (3.38), English Proficiency (3.12), Appropriate Student Conduct (2.95), Sociability, Self-confidence & Intelligence (2.84), and finally Affectively-Induced Behavior (2.32).
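The index described above is simple arithmetic on values already reported, so it can be checked directly. The sketch below copies the item means from Table 3 and groups them by the factor on which each item loads above .50 in Table 4; the variable names are illustrative.

```python
# Item means copied from Table 3, grouped by the factor on which each item
# loads above .50 in Table 4.
item_means = {
    "Conscientiousness": [3.36, 3.46, 3.55, 3.62],
    "Interest & Motivation": [3.42, 3.33],
    "English Proficiency": [3.08, 3.39, 2.89],
    "Appropriate Student Conduct": [2.17, 2.98, 3.37, 3.29],
    "Sociability, Self-confidence & Intelligence": [2.98, 2.92, 2.83, 2.61],
    "Affectively-Induced Behavior": [2.49, 2.28, 2.32, 2.18],
}

# Average item mean per composite criterion.
importance = {name: sum(means) / len(means) for name, means in item_means.items()}

# Rank the criteria from most to least important.
ranking = sorted(importance, key=importance.get, reverse=True)
for name in ranking:
    print(f"{name}: {importance[name]:.4f}")
```

The unrounded averages round to the figures quoted in the text and yield the same ordering of the six criteria.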

 

The Case Study

The case study consisted of a correlation analysis and a cluster analysis on the peer rating data.

 

Correlations

Table 5 presents correlation coefficients between pairs of variables obtained from the peer ratings, but correlations with two additional variables have also been reported. The variable labeled as EFL Achievement (EFLA) is the average of the students’ scores from the set of four English achievement written exams that were administered by the school, observing formal exam procedures. The exams had been administered with two-month intervals during the same academic year, and they had been scored by the class’s EFL teacher.

The second additional variable is GPA, the average of all annual grade points of a student obtained from 15 subject matters including: religious studies, Arabic, Persian literature, literary techniques, writing (in Persian), English, math and statistics, physical education, defense preparation, media literacy, history, sociology, geography, economics, and logic (Table of Subject Matters and Weekly Teaching Hours for the Second Secondary Education Program, 2016).

From Table 5, it is clear that AIB is negatively correlated with every other variable, and most strongly with C, ASC, and EP (r = -.478, r = -.461, and r = -.394, respectively). It is also negatively, though non-significantly, correlated with GPA and EFLA (r = -.231 and r = -.137, respectively).

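Table 5. Correlation coefficients for peer rating variables, EFLA, and GPA

| | C | SSI | AIB | ASC | EP | IM | EFLA | GPA |
|---|---|---|---|---|---|---|---|---|
| C | 1 | | | | | | | |
| SSI | .772** | 1 | | | | | | |
| AIB | -.478* | -.111 | 1 | | | | | |
| ASC | .855** | .864** | -.461* | 1 | | | | |
| EP | .818** | .692** | -.394* | .737** | 1 | | | |
| IM | .751** | .684** | -.330 | .709** | .940** | 1 | | |
| EFLA | .701** | .622** | -.137 | .541** | .768** | .733** | 1 | |
| GPA | .844** | .751** | -.231 | .683** | .649** | .586** | .748** | 1 |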

C = Conscientiousness, IM = Interest & Motivation, EP = English Proficiency, ASC = Appropriate Student Conduct, SSI = Sociability, Self-confidence & Intelligence, AIB = Affectively-Induced Behavior, EFLA = EFL Achievement, and GPA = Grade Point Average.

** Significant at the 0.01 level (2-tailed).

* Significant at the 0.05 level (2-tailed).

Apart from AIB, all the other variables have high inter-correlations that are all significant at the p < .01 level. Most noticeably, Conscientiousness is found to be the strongest predictor of GPA (r = .844), but four other variables, SSI (r = .751), ASC (r = .683), EP (r = .649), and IM (r = .586), also demonstrate strong claims to predicting GPA.

The strongest predictor of achievement in English was, predictably, English Proficiency (r = .768), but it was closely followed by IM (r = .733) and C (r = .701), while SSI (r = .622) and ASC (r = .541) were fairly accurate predictors. These findings taken together are clear evidence for the predictive utility of the mentioned set of five classification criteria.
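The significance flags in Table 5 can be checked against the usual t test for a Pearson correlation, t = r × sqrt(n - 2) / sqrt(1 - r²). The sketch below assumes that every coefficient is based on all 26 students (the article does not state the n per coefficient) and hard-codes the two-tailed critical t values for df = 24 from a standard t table. The function names are illustrative.

```python
import math

N_STUDENTS = 26          # class size reported for the case study (assumed n)
# Two-tailed critical t values for df = 24, taken from a standard t table.
T_CRIT_05 = 2.064
T_CRIT_01 = 2.797


def t_from_r(r, n):
    """t statistic for testing a Pearson correlation against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


def stars(r, n=N_STUDENTS):
    """Return '**', '*', or '' in the style of the Table 5 footnotes."""
    t = abs(t_from_r(r, n))
    if t > T_CRIT_01:
        return "**"
    if t > T_CRIT_05:
        return "*"
    return ""


# Correlations of AIB with the other variables, copied from Table 5.
aib_correlations = {
    "C": -0.478, "SSI": -0.111, "ASC": -0.461, "EP": -0.394,
    "IM": -0.330, "EFLA": -0.137, "GPA": -0.231,
}

flags = {name: stars(r) for name, r in aib_correlations.items()}
for name, r in aib_correlations.items():
    print(f"AIB x {name}: r = {r:+.3f} {flags[name]}")
```

With n = 26, the smallest coefficient that reaches the .05 level is roughly .39 in absolute value, which is why r = -.394 is flagged in the table while r = -.330 is not.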

 

The Cluster Analysis

In order to investigate how well the set of six composite criteria obtained from the factor analysis could function in classifying the students, the mean ratings given in Appendix 3 were entered into a cluster analysis.

 

Clustering Procedures

Sarstedt and Mooi (2014) and Hahs-Vaughn (2017) suggest that variables with a correlation coefficient in excess of .90 should not be entered into the same cluster analysis. Since EP and IM were highly correlated (r = .940, see Table 5), it was decided to omit IM from the analysis.
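The screening rule can be expressed in a few lines. The sketch below copies the correlations among the six peer-rated criteria from Table 5 and lists every pair whose absolute correlation exceeds .90; the dictionary layout and names are illustrative.

```python
THRESHOLD = 0.90

# Lower-triangle correlations among the six peer-rated criteria (Table 5).
correlations = {
    ("SSI", "C"): 0.772,
    ("AIB", "C"): -0.478, ("AIB", "SSI"): -0.111,
    ("ASC", "C"): 0.855, ("ASC", "SSI"): 0.864, ("ASC", "AIB"): -0.461,
    ("EP", "C"): 0.818, ("EP", "SSI"): 0.692, ("EP", "AIB"): -0.394,
    ("EP", "ASC"): 0.737,
    ("IM", "C"): 0.751, ("IM", "SSI"): 0.684, ("IM", "AIB"): -0.330,
    ("IM", "ASC"): 0.709, ("IM", "EP"): 0.940,
}

# Pairs that should not enter the same cluster analysis under the .90 rule.
too_high = [pair for pair, r in correlations.items() if abs(r) > THRESHOLD]
print(too_high)
```

Only the IM and EP pair exceeds the threshold, so dropping either of the two variables satisfies the rule.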

Since there were a limited number of observations in the dataset, a hierarchical agglomerative technique was adopted initially (Everitt et al., 2011; Sarstedt & Mooi, 2014). As the similarity measure for our five continuous variables, the squared Euclidean distance was chosen, which is the most commonly used measure in cluster analytic studies (Garson, 2014).

The next step was to choose an appropriate clustering algorithm. Everitt et al. (2011) contend that the average linkage method is relatively robust and takes account of the cluster structure. They also refer to empirical research from various fields and conclude that Ward’s method, complete linkage (i.e., furthest neighbor), and average linkage generate more interpretable results than other algorithms. It was decided to run the analysis with all three of these methods so that the stability of the results could be checked.
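A stability check of this kind might be sketched as follows. The sketch assumes NumPy and SciPy are available and runs on randomly generated, hypothetical ratings, since the actual peer ratings (Appendix 3) are not reproduced here. It is not the software or the procedure used in the study. Note that SciPy’s Ward linkage works on the raw observations (Euclidean geometry), whereas complete and average linkage are given squared Euclidean distances.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)

# Hypothetical data: 26 students x 5 criteria, drawn around three made-up centers.
centers, sizes = (30.0, 55.0, 78.0), (12, 8, 6)
X = np.vstack([rng.normal(c, 3.0, size=(n, 5)) for c, n in zip(centers, sizes)])


def canonical(labels):
    """Renumber cluster ids by order of first appearance so partitions can be compared."""
    seen = {}
    return [seen.setdefault(int(lab), len(seen) + 1) for lab in labels]


sq_dist = pdist(X, metric="sqeuclidean")  # condensed squared Euclidean distances
solutions = {
    "ward": linkage(X, method="ward"),
    "complete": linkage(sq_dist, method="complete"),
    "average": linkage(sq_dist, method="average"),
}

# Cut each tree into three clusters and put the labels in a comparable form.
partitions = {
    name: canonical(fcluster(Z, t=3, criterion="maxclust"))
    for name, Z in solutions.items()
}
stable = len({tuple(p) for p in partitions.values()}) == 1
print("Identical memberships across algorithms:", stable)
```

Because cluster numbers are arbitrary, the partitions are renumbered by first appearance before they are compared.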

Since both Ward’s method and complete linkage are sensitive to outliers (Hahs-Vaughn, 2017), an extreme-value analysis using the z-score distribution (Tabachnick & Fidell, 2013) was conducted separately on each variable to detect outliers. No standard scores below -2.5 or above +2.5 were detected (the most stringent threshold suggested by Hair et al., 2010, for small samples of fewer than 80 observations). Therefore, the influence of outliers seemed to be of little concern for this analysis.
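For illustration, a z-score screen of this kind could look like the sketch below. The ratings are hypothetical values for a single criterion, and the use of the sample standard deviation is an assumption about how the standard scores were computed.

```python
from statistics import mean, stdev


def extreme_values(scores, cutoff=2.5):
    """Return (index, z) for every score whose standardized value exceeds +/- cutoff."""
    m, s = mean(scores), stdev(scores)  # sample SD (assumption)
    z_scores = [(x - m) / s for x in scores]
    return [(i, z) for i, z in enumerate(z_scores) if abs(z) > cutoff]


# Hypothetical mean peer ratings for one criterion (26 students); not the study's data.
ratings = [28, 31, 35, 40, 42, 45, 47, 50, 52, 55, 57, 60, 62,
           33, 38, 44, 49, 53, 58, 64, 68, 72, 75, 78, 81, 30]

outliers = extreme_values(ratings)
print(outliers)  # an empty list means no value crosses the +/-2.5 threshold
```

The same function would be applied to each clustering variable in turn.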

 

Figure 2. Dendrogram for Ward’s method

 

All three clustering algorithms generated identical results. As an example, Figure 2 illustrates the dendrogram for the hierarchical clustering using Ward’s method. By visual examination, three major clusters are clearly discernible toward the leftmost side of the diagram. Cluster members did not change across the three algorithms, indicating a stable clustering. The cluster affiliations are provided in Table 6 (second column from the left). Members of each cluster have been designated with numbers 1 through 3. There are 12 members in Cluster 1, 8 members in Cluster 2, and 6 members in Cluster 3.

 

Table 6. Cluster memberships

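| Name | Hier-EP with AIB | Hier-EP no AIB | K-EP no AIB | Hier-IM no AIB | Teacher’s 4-group |
|---|---|---|---|---|---|
| Mohsen | 1 | 1 | 1 | 1 | 1 |
| Hossein | 1 | 1 | 1 | 1 | 2 |
| Jalil | 1 | 1 | 1 | 1 | 4 |
| Pouya | 1 | 1 | 1 | 1 | 1 |
| Mobin | 1 | 1 | 1 | 1 | 1 |
| Bahman | 1 | 1 | 1 | 1 | 4 |
| Aarash | 1 | 1 | 1 | 1 | 2 |
| Ahmad | 1 | 1 | 1 | 1 | 1 |
| Jafar | 1 | 1 | 1 | 1 | 2 |
| Ali | 1 | 1 | 1 | 1 | 2 |
| Mahdi | 1 | 1 | 1 | 1 | 4 |
| Omid | 1 | 1 | 1 | 1 | 4 |
| Navid | 2 | 2 | 2 | 2 | 1 |
| Hasan | 2 | 2 | 2 | 2 | 1 |
| Mojib | 2 | 2 | 2 | 2 | 4 |
| Iman | 2 | 2 | 2 | 1 | 2 |
| Nima | 2 | 2 | 2 | 2 | 3 |
| Emaad | 2 | 2 | 2 | 2 | 1 |
| Kia | 2 | 2 | 2 | 2 | 3 |
| Kourosh | 2 | 2 | 2 | 2 | 2 |
| Reza | 3 | 3 | 3 | 3 | 3 |
| Naser | 3 | 3 | 3 | 3 | 3 |
| Farid | 3 | 3 | 3 | 3 | 3 |
| Behzaad | 3 | 3 | 3 | 3 | 3 |
| Saam | 3 | 3 | 3 | 3 | 4 |
| Raamin | 3 | 3 | 3 | 3 | 3 |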

Hier-EP with AIB: Clusters found by hierarchical procedures, with EP, C, SSI, ASC, and AIB.

Hier-EP no AIB: Clusters found by hierarchical procedures, with EP, C, SSI, and ASC.

K-EP no AIB: Clusters found by the k-means procedure, with EP, C, SSI, and ASC.

Hier-IM no AIB: Clusters found by hierarchical procedures using complete linkage, average linkage, and centroid clustering, while excluding AIB.

Teacher’s 4-group: The four-group classification generated by the class’s EFL teacher (see Section 3.2.2.2).

 

Cluster Validation

External criteria can help substantiate the emerging clusters (Milligan, 1996). In a criterion validation attempt, the EFL teacher of the class (male, aged 55, 30 years of EFL teaching experience) was asked to classify the students based on the six composite criteria. Since the most important criterion for the teacher was English proficiency, he first made a five-category classification according to English Proficiency, then made a few adjustments in the categories according to the other five criteria, and eventually decided that a four-group classification truly represented his understanding of his students. The teacher’s classification is presented in Table 6 (the first column from the right). A visual comparison of the teacher’s four-group classification with the hierarchical clustering (the second column from the left) reveals 15 misclassifications (57.69% of the sample), which indicates rather poor agreement between the two classifications.
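The misclassification count can be checked directly from Table 6. The short sketch below compares the two columns row by row, treating the group numbers as directly comparable, as the visual comparison does. The variable names are illustrative.

```python
# Columns of Table 6, in row order (Mohsen ... Raamin).
hierarchical = [1] * 12 + [2] * 8 + [3] * 6
teacher = [1, 2, 4, 1, 1, 4, 2, 1, 2, 2, 4, 4,   # students in Cluster 1
           1, 1, 4, 2, 3, 1, 3, 2,               # students in Cluster 2
           3, 3, 3, 3, 4, 3]                     # students in Cluster 3

# A student counts as misclassified when the two group numbers differ.
mismatches = sum(h != t for h, t in zip(hierarchical, teacher))
percentage = round(100 * mismatches / len(hierarchical), 2)
print(mismatches, percentage)
```

Most of the disagreement comes from Clusters 1 and 2; only one member of Cluster 3 (Saam) is placed differently by the teacher.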

As another validation strategy, means of the three identified clusters (the clusters given under “Hier-EP with AIB” in Table 6) were subjected to a one-way analysis of variance (ANOVA). Table 7 presents the means of each cluster on all the variables, along with the ANOVA F-ratios.

Since the clusters have been computed so as to be maximally different, Csizér and Jamieson (2013) caution that interpreting the ANOVA results as significant differences between the groups is inappropriate. However, EFLA and GPA were not involved in the cluster analysis. Therefore, the significance of their F-ratios (F = 10.685 and F = 16.346, respectively) at the p < .05 level indicates the existence of actual and meaningful differences among the three clusters. Significant mean differences on these two non-clustering variables cannot be considered a statistical artifact of the particular clustering procedures. Therefore, this finding adds further validity to the obtained cluster solution.

Table 7. Cluster means and ANOVA results

 

| | Cluster 1 (N = 12) | Cluster 2 (N = 8) | Cluster 3 (N = 6) | F | Sig. |
|---|---|---|---|---|---|
| C | 33.41 | 53.10 | 78.15 | 35.328 | .000* |
| SSI | 42.22 | 62.28 | 75.93 | 61.250 | .000* |
| AIB | 36.89 | 41.62 | 25.85 | 2.675 | .090 |
| ASC | 46.81 | 60.07 | 78.26 | 48.376 | .000* |
| EP | 26.28 | 36.45 | 69.24 | 43.715 | .000* |
| IM | 30.78 | 40.79 | 69.61 | 23.666 | .000* |
| EFLA | 49.45 | 64.40 | 80.73 | 10.685 | .001* |
| GPA | 70.56 | 83.59 | 90.78 | 16.346 | .000* |

C = Conscientiousness, IM = Interest & Motivation, EP = English Proficiency, ASC = Appropriate Student Conduct, SSI = Sociability, Self-confidence & Intelligence, AIB = Affectively-Induced Behavior, EFLA = EFL Achievement, and GPA = Grade Point Average.

* Significant at p < .05 level.
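As a rough consistency check, the F-ratios for the two non-clustering variables can be approximately reconstructed from the cluster means and sizes in Table 7. The sketch below rests on one assumption: that the within-group mean square can be recovered from the post hoc standard errors reported in Table 8, using SE² = MSE × (1/n_i + 1/n_j). The helper names are ours, and small deviations from the reported values are expected because the means are rounded.

```python
def mse_from_se(std_error, n_i, n_j):
    """Within-group mean square implied by a pairwise standard error (assumption)."""
    return std_error ** 2 / (1 / n_i + 1 / n_j)


def f_ratio(means, sizes, mse):
    """One-way ANOVA F computed from group means, group sizes, and the error mean square."""
    n_total = sum(sizes)
    grand_mean = sum(m * n for m, n in zip(means, sizes)) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for m, n in zip(means, sizes))
    ms_between = ss_between / (len(means) - 1)
    return ms_between / mse


SIZES = [12, 8, 6]  # Clusters 1, 2, 3

# Means from Table 7; standard errors for the Cluster 1 vs 2 contrast from Table 8.
efla_f = f_ratio([49.45, 64.40, 80.73], SIZES, mse_from_se(6.26496, 12, 8))
gpa_f = f_ratio([70.56, 83.59, 90.78], SIZES, mse_from_se(3.43648, 12, 8))
print(round(efla_f, 2), round(gpa_f, 2))  # close to the reported 10.685 and 16.346
```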

 

Perhaps the most conspicuous piece of information in Table 7 is the non-significance of the F-ratio for AIB (F = 2.675, p = .090). This indicates that the variable could hardly distinguish between members of the three clusters. Following this finding, another round of hierarchical clustering, with the same options as before, was carried out without AIB. The obtained clusters are presented in Table 6 (under Hier-EP no AIB). No change was observed in the number of clusters or cluster affiliations, indicating that AIB contributed so little to differentiating the clusters that it could be dispensed with and the same clustering still obtained. Therefore, it was decided to exclude AIB from the cluster analysis.

A number of experts recommend the use of a k-means clustering after a hierarchical procedure to validate the clusterings (Garson, 2014; Hahs-Vaughn, 2017; Sarstedt & Mooi, 2014). Therefore, a k-means clustering analysis was carried out with the same variables (excluding AIB) and with a k value of 3 for the number of clusters. The results did not change. Cluster memberships are presented in Table 6 (under K-EP no AIB). This lends further validity to the three-cluster solution generated by the hierarchical procedure.
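One practical detail when comparing a k-means solution with a hierarchical one is that cluster numbers are arbitrary, so the two label sets must be aligned before they are compared. A minimal sketch of such an alignment follows. The raw k-means labels are hypothetical, and the helper assumes that both solutions have the same number of clusters.

```python
from itertools import permutations


def align_labels(reference, candidate):
    """Relabel `candidate` so that it agrees with `reference` as often as possible."""
    ref_ids = sorted(set(reference))
    cand_ids = sorted(set(candidate))
    best_map, best_hits = None, -1
    # Try every one-to-one assignment of candidate ids to reference ids.
    for perm in permutations(ref_ids, len(cand_ids)):
        mapping = dict(zip(cand_ids, perm))
        hits = sum(mapping[c] == r for r, c in zip(reference, candidate))
        if hits > best_hits:
            best_map, best_hits = mapping, hits
    return [best_map[c] for c in candidate]


hierarchical = [1] * 12 + [2] * 8 + [3] * 6   # Hier-EP no AIB column of Table 6
kmeans_raw = [2] * 12 + [0] * 8 + [1] * 6     # hypothetical raw k-means output

aligned = align_labels(hierarchical, kmeans_raw)
disagreements = sum(a != h for a, h in zip(aligned, hierarchical))
print(disagreements)
```

With three clusters only six assignments need to be tried, so the exhaustive search is inexpensive.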

For further exploration and validation, the same stages of hierarchical and k-means analysis with the same options were repeated with Interest & Motivation (instead of English Proficiency) along with the other three variables (Conscientiousness; Sociability, Self-confidence & Intelligence; and Appropriate Student Conduct), while excluding Affectively-Induced Behavior (AIB). The cluster memberships are presented in Table 6 (under Hier-IM no AIB). Identical results were obtained with Ward’s method (not reported in Table 6), and there was only one misclassification (Iman) with complete linkage, average linkage, and centroid clustering. This indicated that the clustering solution was robust enough to remain stable across three different hierarchical clustering algorithms and a k-means analysis, with or without Affectively-Induced Behavior, and with either English Proficiency or Interest & Motivation included as a clustering variable.

As a rule of thumb, Sarstedt and Mooi (2014) and Hahs-Vaughn (2017) caution that the sample size for cluster analysis should not be smaller than 2^m, where m is the number of variables. This means that, for example, with 4 variables the sample size should be 16 or larger, and with 5 variables there should be at least 32 observations. Our sample of 26 students meets this threshold for the four-variable analyses. Although it falls short of 32 for the five-variable analysis, using either four or five variables in various configurations led to identical cluster solutions; the sample therefore seemed to be adequate.
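The rule of thumb reduces to simple arithmetic, sketched below for the two configurations used in this study. The function name is illustrative.

```python
def min_sample_size(n_variables):
    """Minimum sample size under the 2^m rule of thumb (m = number of clustering variables)."""
    return 2 ** n_variables


SAMPLE_SIZE = 26  # students in the case study class
checks = {m: SAMPLE_SIZE >= min_sample_size(m) for m in (4, 5)}
for m, meets_rule in checks.items():
    print(m, min_sample_size(m), meets_rule)
```

The check is satisfied for four clustering variables but not for five, which is why the stability of the solution across configurations matters for the adequacy argument.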

In order to investigate how well the three clusters were differentiated, Scheffé and Tukey post hoc analyses were performed on EFLA and GPA. The analyses generated identical results; therefore, only the Scheffé results are reported in Table 8.

 

 

 

Table 8. Scheffé results between means of the 3 clusters on EFLA and GPA

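| Dependent Variable | (I) Hierarchical 3-cluster solution | (J) Hierarchical 3-cluster solution | Mean Difference (I-J) | Std. Error | Sig. |
|---|---|---|---|---|---|
| EFLA | 1 | 2 | -14.94750 | 6.26496 | .079 |
| EFLA | 1 | 3 | -31.27604* | 6.86292 | .001 |
| EFLA | 2 | 1 | 14.94750 | 6.26496 | .079 |
| EFLA | 2 | 3 | -16.32854 | 7.41280 | .111 |
| EFLA | 3 | 1 | 31.27604* | 6.86292 | .001 |
| EFLA | 3 | 2 | 16.32854 | 7.41280 | .111 |
| GPA | 1 | 2 | -13.02500* | 3.43648 | .004 |
| GPA | 1 | 3 | -20.21250* | 3.76448 | .000 |
| GPA | 2 | 1 | 13.02500* | 3.43648 | .004 |
| GPA | 2 | 3 | -7.18750 | 4.06610 | .231 |
| GPA | 3 | 1 | 20.21250* | 3.76448 | .000 |
| GPA | 3 | 2 | 7.18750 | 4.06610 | .231 |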

* The mean difference is significant at the 0.05 level.
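For readers who wish to see how the Scheffé significance values relate to the mean differences and standard errors in Table 8, the following sketch recomputes them. It assumes the usual Scheffé statistic for a pairwise contrast, F = (difference / SE)² / (k - 1), referred to an F distribution with k - 1 and N - k degrees of freedom. With three clusters the numerator has 2 degrees of freedom, for which the upper tail of the F distribution has a simple closed form, so no statistics library is needed. The function is illustrative and is not the procedure of any particular package.

```python
def scheffe_p(mean_diff, std_error, n_groups=3, n_total=26):
    """Scheffé p-value for one pairwise contrast, from a mean difference and its SE."""
    df_between = n_groups - 1
    df_within = n_total - n_groups
    if df_between != 2:
        raise ValueError("the closed-form tail below holds only for 2 numerator df")
    f_stat = (mean_diff / std_error) ** 2 / df_between
    # Upper tail of F(2, df_within): P(F > f) = (1 + 2f / df_within) ** (-df_within / 2)
    return (1 + 2 * f_stat / df_within) ** (-df_within / 2)


# (dependent variable, contrast, mean difference, SE, reported Sig.) from Table 8
TABLE8 = [
    ("EFLA", "1 vs 2", -14.94750, 6.26496, 0.079),
    ("EFLA", "1 vs 3", -31.27604, 6.86292, 0.001),
    ("EFLA", "2 vs 3", -16.32854, 7.41280, 0.111),
    ("GPA", "1 vs 2", -13.02500, 3.43648, 0.004),
    ("GPA", "1 vs 3", -20.21250, 3.76448, 0.000),
    ("GPA", "2 vs 3", -7.18750, 4.06610, 0.231),
]

computed = {(dv, pair): scheffe_p(diff, se) for dv, pair, diff, se, _ in TABLE8}
for dv, pair, _, _, reported in TABLE8:
    print(dv, pair, round(computed[(dv, pair)], 3), reported)
```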

 

With EFLA, only the means for Clusters 1 and 3 were significantly different, whereas with GPA, Cluster 1 was significantly different from both the other two. But there was no significant difference between Clusters 2 and 3 on either variable.

A rational index can be easily calculated from Table 8, where it is observed that the three-cluster solution has correctly predicted 6 out of 12 possible mean differences, equal to 50%. Since the cluster solution is generated so as to maximize differences among the clusters and yet the three-cluster solution predicts only half of the mean differences between the clusters on the two achievement-related non-clustering variables, one might wonder whether more accurate predictions could be achieved with the teacher’s four-group classification.

To investigate the issue, the means of the four groups identified by the teacher’s classification on the eight variables listed in Table 7 were subjected to another one-way ANOVA. All F-ratios came out significant except for that of AIB (as was found with the three-cluster solution). Two subsequent post hoc analyses (Scheffé and Tukey) were performed and revealed identical results (see Appendix 4, where only the Scheffé results are reported). With the teacher’s four-group classification, only Group 3 was somewhat differentiated from the other three. Groups 1, 2, and 4 were not differentiated on the two non-clustering variables, indicating that the teacher’s four-group classification is in effect a two-group classification, with Groups 1, 2, and 4 in one group and Group 3 in another.

A similar rational measure of the discriminatory power of the teacher’s classification was calculated, and it was found that his classification correctly predicted 10 out of 24 possible mean differences, equal to 41.67%.
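Both indices are simple proportions of significant pairwise differences. The sketch below recomputes the index for the three-cluster solution from the Sig. column of Table 8. For the teacher’s classification, only the reported counts (10 of 24) are used, since the underlying values appear in Appendix 4. The function name is ours.

```python
def rational_index(p_values, alpha=0.05):
    """Return (number significant, number of comparisons, percentage significant)."""
    hits = sum(p < alpha for p in p_values)
    return hits, len(p_values), round(100 * hits / len(p_values), 2)


# Sig. column of Table 8, in row order (EFLA rows first, then GPA rows).
table8_sig = [.079, .001, .079, .111, .001, .111,
              .004, .000, .004, .231, .000, .231]

three_cluster_index = rational_index(table8_sig)
teacher_percentage = round(100 * 10 / 24, 2)  # counts as reported for Appendix 4
print(three_cluster_index, teacher_percentage)
```

Each pairwise difference appears twice in a post hoc table (I-J and J-I), so both the numerator and the denominator count every contrast in both directions.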

To sum up, the three-cluster solution based on peer ratings seemed to somewhat accurately predict the actual achievement-related variations among the groups of students. It also appeared to enjoy a higher discriminatory power than the four-group classification generated by the class’s EFL teacher. Consequently, the three-cluster classification was adopted for further interpretation.

 

Cluster Interpretation

To facilitate comparison, Figure 3 presents the cluster means from Table 7 again, without the means for AIB.

 

 

Figure 3. Cluster means

 

As is evident from the figure, members of Cluster 3 have the highest means on all the variables, while members of Cluster 1 appear to sit on the exact opposite end of the cline. Cluster 2 members are right in the middle of the two extremes on every variable. This uniform pattern of variation among the three groups across all the variables may be attributed to the high inter-correlations found between the variables (see Table 5).

 

Discussion

The Iranian EFL teachers participating in the questionnaire survey almost unanimously agreed that they did classify their students according to their understanding of the character type, behavior patterns, and achievement patterns of their students. This is to be expected as classification of students in distinct categories liberates teachers from the time-consuming process of considering the characteristics of each individual student separately. Practical classroom limitations often prevent teachers from taking students’ individual differences into account. Biedroń and Pawlak (2016) enumerate a number of such limitations: “the lack of time, the need to achieve curricular goals, focus on exam requirements, the additional burden of preparing extra materials that would cater to individual learner profiles, the unfeasibility of individualization in large classes, or simply resistance and lack of interest on the part of students” (p. 413).

Classification of students in categories may be a reflection of a more general inclination in human beings to describe people by referring to their personality traits (e.g. relaxed, reserved, competitive, generous). Theories of personality, including the Big Five model, have been formulated based on trait differences among individuals (Cervone & Pervin, 2013; Ewen, 2010). In everyday life, we normally use trait terms (e.g. diligent, reliable, caring) to describe the kind of behavior that can be expected from a person most of the time, for instance when we introduce a friend or write a recommendation letter (Schultz & Schultz, 2013).

A similar concept from personality psychology that may help to explain our findings is personality type (Maltby, Day, & Macaskill, 2017). Just as a person’s personality type is a summary of how the person stands simultaneously on a range of personality traits (Funder, 2013; Piedmont, 1998), it may be argued by analogy that the way one stands on a constellation of cognitive, motivational, and behavioral learner characteristics can be viewed as the individual’s learner type. If this speculation is confirmed, the uniform pattern of covariation displayed by the conglomerate of language learner characteristics across the three clusters could be an indication of the existence of types of EFL learners. Skehan (1991) alluded to a similar position when he argued that through cluster analysis we may be able to find learner types in terms of “configurations of ability” that contribute to L2 learning. Likewise, Csizér and Dörnyei (2005) interpreted their identified “learner motivational profiles” as instances of learner types.

Further research is thus required to analyze a more comprehensive list of learner characteristics obtained from specifically-designed instruments to investigate the phenomenological reality of EFL learner types. And such a pursuit would not be unrealistic, since educational researchers have already identified and empirically validated typologies of college students (Astin, 1993; Kuh et al., 2000; Luo & Jamieson-Drake, 2005; Luan et al., 2009). Moreover, in psychology, a discipline not too far away from applied linguistics, personality type measures like Myers-Briggs Type Indicator (MBTI, 2017) are widely employed by practitioners to capture the diverse range of variations in human characteristics in just a few types.

It may be argued that the uniform variation of the cluster means across all the variables could be due to some other underlying umbrella factor, such as language learning aptitude or intelligence, that was not controlled for in this study. Larger factor analytic studies specifically designed to investigate such a relationship are, of course, in order. But even if such studies can verify this contention, an adequate theory is required to explain how a diverse set of apparently independent factors (i.e. conscientiousness; sociability, self-confidence and intelligence; interest and motivation; and appropriate student conduct) are found to correlate with one another so strongly as to cause such a dramatic variation across groups of language learners. Advances in SLA research from a complex dynamic systems perspective may be particularly illuminating in this respect, as researchers try to explain how interactions among various attributes within the individual give rise to a complex whole whose properties are different from the sum of its components’ properties (Dörnyei, MacIntyre, & Henry, 2015; Larsen-Freeman & Cameron, 2008; Verspoor, de Bot, & Lowie, 2011).

Although correlation coefficients for the criterion Conscientiousness in this study were obtained from student peer ratings, and not from established Big Five self-report measures such as Revised NEO Personality Inventory (Costa & McCrae, 1992; Piedmont, 1998) or International Personality Item Pool (IPIP, 2017; Goldberg et al., 2006), the fact that Conscientiousness was found in this study to be the strongest predictor of GPA is consistent with research results that found the Big Five conscientiousness to predict scholastic/academic achievement (Dumfart & Neubauer, 2016; Noftle & Robins, 2007; O’Connor & Paunonen, 2007), and to do so independently of intelligence (Poropat, 2009). With such strong predictive power, it is not surprising that Conscientiousness was found to be the most important criterion for the EFL teachers in this study.

The second most important criterion for the EFL teachers was Interest & Motivation, and the students’ peer ratings on this variable were fairly highly correlated (r = .586) with their GPA. This finding is also consistent with research that has found interest to predict academic achievement (Hidi & Harackiewicz, 2000; Schiefele, Krapp, & Winteler, 1992) and motivation to be closely associated with learning gains (Deci & Ryan, 1985; Wigfield, Cambria, & Eccles, 2012) and particularly L2 learning achievement (Masgoret & Gardner, 2003).

The least important criterion for the survey respondents was Affectively-Induced Behavior. This composite criterion is composed of Anxiety, Aggressiveness, Playfulness, and Irritability/Vulnerability/Impulsivity/Emotional Instability. At least one of these criteria, anxiety, has often been reported to negatively correlate with language learning (Horwitz, 2001; Matsuda & Gobel, 2004). However, from Table 3 it is clear that even Anxiety was not among the most important criteria for the teacher respondents.

Further, Affectively-Induced Behavior was the only variable that could not differentiate among the three clusters identified in the case study class. Moreover, the cluster solution remained stable with or without this variable. These findings taken together indicate that the validity of using this criterion for classification purposes is questionable. The unsatisfactory results with this variable could be attributed to the fact that this composite criterion is composed of a set of rather remotely connected or even contrastive criteria (as in the case of Anxiety vs. Playfulness). Consequently, it might have been confusing for the students to rate their peers according to all four constituent criteria at the same time.

However, the fact that, besides Conscientiousness and Interest & Motivation, the three other variables (Appropriate Student Conduct; English Proficiency; and Sociability, Self-confidence & Intelligence) were (fairly) strongly associated with both GPA and EFL Achievement is in itself compelling evidence of the predictive utility of this set of five major classification criteria. Further, the same set of five criteria could deliver a three-cluster classification whose groups demonstrated actual and meaningful differences in English achievement and general academic performance, and were more accurately differentiated when compared with the teacher’s four-group classification. This further substantiates the validity and usefulness of the set of five criteria for classification purposes.

 

Conclusions, Limitations, and Future Directions

It was revealed in this study that EFL teachers do normally classify their students based on individual learner attributes. The learner characteristics that were most widely employed by the EFL teachers as criteria to categorize their students were identified. Some of these criteria, such as anxiety, intelligence, motivation, interest, sociability, self-esteem, and self-confidence, are ID factors (Dewaele, 2013; Dörnyei, 2005; Dörnyei & Ryan, 2015). However, the others do not seem to fit within an ID framework; rather, they are more linguistic, social, or pedagogic in nature (e.g., English proficiency, social skills, proper classroom behavior, politeness, respectfulness, rapport, good looks and appearance, preparedness, active participation and engagement, perseverance, and attentiveness). There thus seems to be a wide gap between the set of learner characteristics that EFL teachers find instrumental for classification purposes and the learner attributes that have been the focus of most applied linguistics research. This finding, coupled with our case study results, seems to indicate that even if actual EFL learner types exist, much more than a set of ID variables is required to clearly distinguish them.

The non-probability sampling strategy that was adopted in the survey study casts serious doubts on representativeness and questions the generalizability of the findings. However, the findings with Part I of the questionnaire survey were largely unanimous within our sizeable quota sample and it is highly unlikely that research studies with a more representative sample could find dramatically different results. But findings with Part II of the questionnaire are particularly prone to generalizability concerns. Specifically, the six classification criteria identified from the factor analysis of Part II responses could vary with larger, more representative samples.

However, our research provides a good foundation for future studies on EFL/ESL teachers’ classification criteria by initially identifying a set of 21 criteria from the preliminary interviews. Future research could add other classification criteria to this list to provide richer grounds for factor analytic investigations. In particular, studies in an ESL context could offer invaluable insights in this regard.

The factor analysis results revealed a set of more general composite classification criteria, which subsequently proved effective in predicting language learning achievement. In retrospect, we believe, had we done more research in the field to identify more criteria, or had we not combined all too many of the initial criteria obtained from the interviews, we might have been able to achieve more interpretable factors or more efficient composite classification criteria.

This study extends our understanding of the actual practice of language teachers. It takes the first step in documenting how language teachers actually classify their students and view them as belonging to distinct categories, and it pinpoints a few of the most important criteria that EFL teachers use to categorize students. Continued research along this path could bridge the gap between actual language teaching practice and applied linguistics research findings, could help practitioners develop feasible pedagogical intervention strategies, and could assist language teachers in responding to various learner psychological profiles by adjusting their instructional practices.

References

Ainley, M. (2012). Students’ interest and engagement in classroom activities. In S. L. Christenson, A. L. Reschly, & C. Wylie (Eds.), Handbook of research on student engagement (pp. 283–302). New York, NY: Springer.

Aliaga, O. A., Kotamraju, P., & Stone, J. R., III. (2012). A typology for understanding the career and technical education credit-taking experience of high school students. Louisville, KY: National Research Center for Career and Technical Education, University of Louisville.

Ary, D., Jacobs, L. C., & Sorensen, C. (2010). Introduction to research in education (8th ed.). Belmont, CA: Wadsworth, Cengage Learning.

Astin, A. W. (1993). An empirical typology of college students. Journal of College Student Development, 34, 36-46.

Bernstein, D. A., Penner, L. A., Clarke-Stewart, A., & Roy, E. J. (2012). Psychology (9th ed.). Belmont, CA: Wadsworth, Cengage Learning.

Biedroń, A., & Pawlak, M. (2016). The interface between research on individual difference variables and teaching practice: The case of cognitive factors and personality. Studies in Second Language Learning and Teaching, 6(3), 395-422. http://dx.doi.org/10.14746/sllt.2016.6.3.3

Brown, J. D. (2013). Sampling: Quantitative methods. In C. A. Chapelle (Ed.), The encyclopedia of applied linguistics (pp. 1-6). Chichester, West Sussex, U.K.: Wiley-Blackwell.

Cervone, D., & Pervin, L. A. (2013). Personality: Theory and research (12th ed.). New York: Wiley.

Cheong, K., & Ong, B. (2014). Pre-college profiles of first year students: A typology. Procedia - Social and Behavioral Sciences, 123, 450–460. https://doi.org/10.1016/j.sbspro.2014.01.1444

Coombe, C., & Davidson, P. (2015). Constructing questionnaires. In J. D. Brown & C. Coombe (Eds.), The Cambridge guide to research in language teaching and learning (pp. 217-223). Cambridge, U.K.: Cambridge University Press.

Costa, P. T., Jr., & McCrae, R. R. (1992). NEO PI-R professional manual. Odessa, FL: Psychological Assessment Resources, Inc.

Csizér, K., & Dörnyei, Z. (2005). Language learners’ motivational profiles and their motivated learning behavior. Language Learning, 55(4), 613–659.

Csizér, K., & Jamieson, J. (2013). Cluster analysis. In C. A. Chapelle (Ed.), The encyclopedia of applied linguistics. Blackwell Publishing. https://doi.org/10.1002/9781405198431.wbeal0138

Deci, E. L., & Ryan, R. M. (1985). Intrinsic motivation and self-determination in human behavior. New York: Plenum Press.

Dewaele, J.-M. (2013). Learner internal psychological factors. In J. Herschensohn, & M. Young-Scholten (Eds.), The Cambridge handbook of second language acquisition (pp. 159-179). Cambridge: Cambridge University Press.

Dörnyei, Z. (2005). The psychology of the language learner: Individual differences in second language acquisition. Mahwah, NJ: Lawrence Erlbaum.

Dörnyei, Z. (2007). Research methods in applied linguistics: Quantitative, qualitative and mixed methodologies. Oxford: Oxford University Press.

Dörnyei, Z. (2010). Questionnaires in second language research: Construction, administration, and processing (2nd ed.). New York and London: Routledge.

Dörnyei, Z. (2014). Researching complex dynamic systems: “Retrodictive qualitative modelling” in the language classroom. Language Teaching, 47(1), 80–91. https://doi.org/10.1017/S0261444811000516

Dörnyei, Z., & Csizér, K. (2012). How to design and analyze surveys in second language acquisition research. In A. Mackey & S. Gass (Eds.), Research methods in second language acquisition: A practical guide (pp. 74-94). Oxford, U.K.: Wiley-Blackwell.

Dörnyei, Z., MacIntyre, P. D., & Henry, A. (2015). Motivational dynamics in language learning. Bristol: Multilingual Matters.

Dörnyei, Z., & Ryan, S. (2015). The psychology of the language learner revisited. New York, NY: Routledge.

Dörnyei, Z., & Skehan, P. (2003). Individual differences in second language learning. In C. J. Doughty & M. H. Long (Eds.), The handbook of second language acquisition (pp. 589–630). Oxford: Blackwell.

Dumfart, B., & Neubauer, A. C. (2016). Conscientiousness is the most powerful noncognitive predictor of school achievement in adolescents. Journal of Individual Differences, 37(1), 8-15. https://doi.org/10.1027/1614-0001/a000182

Ellis, N. C., & Larsen-Freeman, D. (2006). Language emergence: Implications for applied linguistics. Introduction to the special issue. Applied Linguistics, 27(4), 558–589.

Everitt, B. S., Landau, S., Leese, M., & Stahl, D. (2011). Cluster analysis (5th ed.). Chichester, U.K.: John Wiley & Sons.

Ewen, R. B. (2010). An introduction to theories of personality (7th ed.). New York, NY: Psychology Press.

Field, A. (2013). Discovering statistics using IBM SPSS statistics: And sex and drugs and rock 'n' roll (4th ed.). London, U.K., and Thousand Oaks, CA: Sage.

Funder, D. C. (2013). The personality puzzle (6th ed.). New York, NY, and London, U.K.: W. W. Norton & Company.

Garson, G. D. (2014). Cluster analysis. Asheboro, NC: Statistical Associates Publishing.

Goldberg, L. R., Johnson, J. A., Eber, H. W., Hogan, R., Ashton, M. C., Cloninger, C. R., & Gough, H. G. (2006). The International Personality Item Pool and the future of public-domain personality measures. Journal of Research in Personality, 40, 84-96.

Good, T. L., & Power, C. (1976). Designing successful classroom environments for different types of students. Journal of Curriculum Studies, 8, 1-16.

Google Forms (n.d.). Retrieved 26 June, 2017 at: https://www.google.com/forms/about/

Gregersen, T., & MacIntyre, P. D. (2014). Capitalizing on language learners’ individuality: From premise to practice. Bristol: Multilingual Matters.

Hahs-Vaughn, D. L. (2017). Applied multivariate statistical concepts. New York, NY, and Abingdon, U.K.: Routledge.

Hair, J. F., Jr., Black, W. C., Babin, B. J., & Anderson, R. E. (2010). Multivariate data analysis (7th ed.). Upper Saddle River, NJ: Prentice Hall.

Harnad, S. (2005). To cognize is to categorize: Cognition is categorization. In H. Cohen & C. Lefebvre (Eds.), Handbook of categorization in cognitive science (pp. 19-43). Oxford, U.K.: Elsevier.

Hidi, S. (2000). An interest researcher's perspective: The effects of extrinsic and intrinsic factors on motivation. In C. Sansone & J. M. Harackiewicz (Eds.), Intrinsic and extrinsic motivation (pp. 309–339). San Diego: Academic Press.

Hidi, S. (2006). Interest: A unique motivational variable. Educational Research Review, 1(2), 69–82. https://doi.org/10.1016/j.edurev.2006.09.001

Hidi, S., & Harackiewicz, J. M. (2000). Motivating the academically unmotivated: A critical issue for the 21st century. Review of Educational Research, 70, 151–179. https://doi.org/10.3102/00346543070002151

Hjørland, B. (2017). Classification. Knowledge Organization, 44(2), 97-128. Retrieved online on 20 June, 2017 at: http://www.isko.org/cyclo/classification. In PDF format, retrieved online on 20 June, 2017 at: http://www.isko.org/ko442toc.pdf

Horwitz, E. K. (2001). Language anxiety and achievement. Annual Review of Applied Linguistics, 21, 112–126. https://doi.org/10.1017/S0267190501000071

Hu, S., & McCormick, A. C. (2012). An engagement-based student typology and its relationship to college outcomes. Research in Higher Education, 53(7), 738–754. https://doi.org/10.1007/s11162-012-9254-7

International Personality Item Pool (2017, February 22). Retrieved 22 July, 2017 from: http://ipip.ori.org/

Jolliffe, I. T. (2002). Principal component analysis (2nd ed.). New York, NY, and Berlin, Germany: Springer-Verlag.

Kendig, C. (2016). Activities of kinding in scientific practice. In C. Kendig (Ed.), Natural kinds and classification in scientific practice (pp. 1-13). Abingdon, U.K., and New York, NY: Routledge. 

Kuh, G. D., Hu, S., & Vesper, N. (2000). "They shall be known by what they do": An activities-based typology of college students. Journal of College Student Development, 41, 228-244.

Labaree, R. V. (2017, June 14). Organizing your social sciences research paper: Types of research designs. On the University of Southern California website. Retrieved 28 June, 2017 at: http://libguides.usc.edu/writingguide/researchdesigns

Larsen-Freeman, D., & Cameron, L. (2008). Complex systems and applied linguistics. Oxford: Oxford University Press.

Luan, J., Zhao, C., & Hayek, J. (2009). Using a data mining approach to develop a student engagement-based institutional typology. IR Applications, 18. Tallahassee, FL: Association for Institutional Research. Retrieved 3 August, 2017 at: http://files.eric.ed.gov/fulltext/ED504332.pdf

Luo, J., & Jamieson-Drake, D. (2005). Linking student precollege characteristics to college development outcomes: The search for a meaningful way to inform institutional practice and policy. IR Applications, 7. Tallahassee, FL: Association for Institutional Research. Retrieved 3 August, 2017 at: http://files.eric.ed.gov/fulltext/ED504375.pdf

MacCallum, R. C., Widaman, K. F., Zhang, S., & Hong, S. (1999). Sample size in factor analysis. Psychological Methods, 4(1), 84-99. https://doi.org/10.1037/1082-989X.4.1.84

Maltby, J., Day, L., & Macaskill, A. (2017). Personality, individual differences and intelligence (4th ed.). Harlow, U.K.: Pearson Education.

Masgoret, A.-M., & Gardner, R. C. (2003). Attitudes, motivation, and second language learning: A meta-analysis of studies conducted by Gardner and his associates. Language Learning, 53(Supplement 1), 167–210.

Matsuda, S., & Gobel, P. (2004). Anxiety and predictors of performance in the foreign language classroom. System, 32, 21-36. https://doi.org/10.1016/j.system.2003.08.002

McCrae, R. R., & Costa, P. T. (1997). Personality trait structure as a human universal. American Psychologist, 52, 509–516. https://doi.org/10.1037//0003-066X.52.5.509

McCrae, R. R., & Costa, P. T., Jr. (2008). The five-factor theory of personality. In O. P. John, R. W. Robins, & L. A. Pervin (Eds.), Handbook of personality: Theory and research (3rd ed., pp. 159–181). New York: Guilford Press.

McKeachie, W. J., & Svinicki, M. (2014). McKeachie’s teaching tips: Strategies, research, and theory for college and university teachers (14th ed.). Belmont, CA: Wadsworth, Cengage Learning.

Milligan, G. (1996). Clustering validation: Results and implications for applied analyses. In P. Arabie, L. Hubert, & G. De Soete (Eds.), Clustering and classification (pp. 341–375). River Edge, NJ: World Scientific.

Myers-Briggs Type Indicator (2017). Retrieved 23 June, 2017 at: https://www.cpp.com/products/mbti/index.aspx

Neuman, W. L. (2014). Social research methods: Qualitative and quantitative approaches (7th ed.). Harlow, Essex, UK: Pearson Education.

Nielsen, J. (2004, February 2). Keep online surveys short. Retrieved 26 June, 2017 at: https://www.nngroup.com/articles/keep-online-surveys-short/

Noftle, E. E., & Robins, R. (2007). Personality predictors of academic outcomes: Big Five correlates of GPA and SAT scores. Journal of Personality and Social Psychology, 93, 116–130. https://doi.org/10.1037/0022-3514.93.1.116

O’Connor, M. C., & Paunonen, S. V. (2007). Big Five personality predictors of post-secondary academic performance. Personality and Individual Differences, 43, 971–990. https://doi.org/10.1016/j.paid.2007.03.017

O’Rourke, N., & Hatcher, L. (2013). A step-by-step approach to using SAS for factor analysis and structural equation modeling (2nd ed.). Cary, NC, U.S.: SAS Institute.

Piedmont, R. L. (1998). The Revised NEO Personality Inventory: Clinical and research applications. New York, NY: Springer Science + Business Media.

Pituch, K. A., & Stevens, J. P. (2016). Applied multivariate statistics for the social sciences: Analyses with SAS and IBM's SPSS (6th ed.). New York, NY, and Abingdon, U.K.: Routledge.

Poropat, A. E. (2009). A meta-analysis of the Five-Factor Model of personality and academic performance. Psychological Bulletin, 135, 322–338. https://doi.org/10.1037/a0014996

Richards, J. C., & Lockhart, C. (1996). Reflective teaching in second language classrooms. Cambridge: Cambridge University Press.

Ryan, R. M., & Deci, E. L. (2017). Self-determination theory: Basic psychological needs in motivation, development, and wellness. New York, NY: The Guilford Press.

Sarstedt, M., & Mooi, E. (2014). A concise guide to market research: The process, data, and methods using IBM SPSS statistics (2nd ed.). Berlin & Heidelberg, Germany: Springer-Verlag.

Schiefele, U., Krapp, A., & Winteler, A. (1992). Interest as a predictor of academic achievement: A meta-analysis of research. In K. A. Renninger, S. Hidi & A. Krapp (Eds.), The role of interest in learning and development (pp. 183–211). Hillsdale, NJ: Erlbaum.

Schultz, D., & Schultz, S. E. (2013). Theories of personality (10th ed.). Belmont, CA: Wadsworth, Cengage Learning.

Seeman, H. (2010). Preventing disruptive behavior in colleges: A campus and classroom management handbook for higher education. Plymouth, U.K.: Rowman & Littlefield Education.

Silvia, P. J. (2006). Exploring the psychology of interest. New York: Oxford University Press.

Silvia, P. J. (2012). Curiosity and motivation. In R. M. Ryan (Ed.), The Oxford handbook of human motivation (pp. 157-166). Oxford: Oxford University Press.

Skehan, P. (1991). Individual differences in second language learning. Studies in Second Language Acquisition, 13, 275–298.

Stevens, J. P. (2009). Applied multivariate statistics for the social sciences (5th ed.). New York, NY & Hove, U.K.: Routledge.

Tabachnick, B. G., & Fidell, L. S. (2013). Using multivariate statistics (6th ed.). New York, NY: Pearson Education.

Table of Subject Matters and Weekly Teaching Hours for the Second Secondary Education Program (2016, May 30). Retrieved 19 July, 2017 at: http://www.rrk.ir/Laws/ShowLaw.aspx?Code=10373

Telegram Messenger (n.d.). Retrieved 23 June, 2017 at: https://telegram.org

The Comprehensive Directive for Educational Departments (1998). In The Portal for the Office of Senior Secondary Education. Retrieved 23 June, 2017 at: http://10.30.170.46/portal/fileLoader.php?code=03e4f65c9ee8a69c8839c81461407483

Verspoor, M. H., de Bot, K., & Lowie, W. (2011). A dynamic approach to second language development: Methods and techniques. Amsterdam: John Benjamins.

Wigfield, A., Cambria, J., & Eccles, J. S. (2012). Motivation in education. In R. M. Ryan (Ed.), The Oxford handbook of human motivation (pp. 463-478). New York and Oxford: Oxford University Press. https://doi.org/10.1093/oxfordhb/9780195399820.013.0026