Document Type: Research Article
Authors
Languages and Linguistics Center, Sharif University of Technology, Tehran, Iran
Introduction
The challenges of learning English phrasal verbs (PhVs) have been underscored from time to time in the instructed second language acquisition (ISLA) literature (e.g., Ansarin & Javadi, 2024; Armstrong, 2004; Fallah et al., 2024). Yasuda (2010) characterized PhVs as “perennial sources of confusion” (p. 250), while Littlemore and Low (2006) called them a “recurring nightmare” (p. 17). As for their polysemy, Gardner and Davies (2007) concluded that the 100 most frequent PhVs potentially carry 559 different meanings in total, an average of about 5.6 per item. To complicate matters further for learners of English, native speakers coin new ones every now and then (Armstrong, 2004). PhVs are also characterized by their unpredictability: their meaning often cannot be inferred easily from their form.
A major distinction has long been drawn in cognitive psychology between explicit (conscious) and implicit (unconscious) learning mechanisms. The controversy over whether the implicit/explicit learning distinction should be regarded as a continuum or a dichotomy has also been extended to the instructional domain. Ellis (2015) argued that to establish the relative contribution of these two types of instruction, it is necessary to demonstrate not only which type is superior in helping learners, but also their relative effect on L2 development in general. Nevertheless, conflicting results about the pedagogical superiority and durability of the two approaches abound in the literature. Several review and meta-analytic studies to date, mostly focusing on grammar acquisition (e.g., DeKeyser & Juffs, 2005; Goo et al., 2015; Norris & Ortega, 2000; Spada & Tomita, 2010), have attested to the stronger immediate impact of explicit over implicit interventions. By the same token, Schmitt (2008), in his narrative review of instructed L2 vocabulary learning, concluded that the effects of explicit instruction outweigh those of implicit instruction. Still, in a more recent meta-analysis, Kang et al. (2019) found comparable effect sizes for implicit and explicit interventions on immediate posttests, but larger effect sizes for implicit instruction on delayed posttests.
It is worth noting that studies on implicit intervention vary in their degree of implicitness. Moreover, some experiments have not been sufficiently lengthy or sophisticated, and some have failed to measure the learners' delayed retention. It therefore remains to be seen whether the purported short-term superiority of explicit over implicit instruction holds for more moderate implicit frameworks such as task-based instruction (TBI), which combine incidental exposure with focus-on-form, as opposed to frameworks that preclude focus-on-form (e.g., extensive reading). Surprisingly, in the realm of PhVs, a subcategory of lexical forms with all the aforementioned idiosyncrasies, little has been done to juxtapose explicit and implicit intervention procedures. In light of EFL (English as a foreign language) learners' longstanding struggle with PhVs, this study examined the extent to which the PPP (Presentation, Practice, Production) model, as an established explicit framework, and TBI, as a so-called moderate implicit framework, can facilitate their immediate and delayed recall.
Since the majority of studies to date have approached PhV pedagogy solely through offline (i.e., post-testing) methods of measurement, we sought to capitalize on online (i.e., real-time/concurrent) psycholinguistic measures to obtain more objective data on the attentional and reaction-time behavior of learners who undergo distinct forms of instruction. Furthermore, integrating online psycholinguistic data elicitation techniques with an experimental study could offer more insight into the interplay of the two systems of learning. Previous studies on explicit and implicit pedagogies did little to document the underlying perceptual disparities that could occur in the wake of these divergent instructional procedures. Any potential outcome discrepancy can then be subjected to follow-up analyses and investigated in its own right.
Literature Review
Implicit and Explicit Instruction
Implicit instruction refers to the set of pedagogic activities or mechanisms that facilitate incidental learning (see Webb, 2020) and are designed to attract attention to the targeted linguistic forms. One point of note, however, is that unlike pure incidental/implicit learning, implicit instruction may expose students not only to communicative content but also to language-focused activities (e.g., Nation, 2007; 2023). In effect, what distinguishes implicit from explicit instruction is that the former tends to avoid metalinguistic explanation of the target features, whereas explicit instruction promotes intentional learning by overtly intervening in the learners' interlanguage system. Table 1 summarizes the main features of the explicit/implicit instructional dichotomy.
Table 1. Implicit vs. Explicit Instruction (adopted from Housen & Pierrard, 2005, p. 10)
| Implicit Instruction | Explicit Instruction |
|---|---|
| attracts attention to the target form | directs attention to the target form |
| is derived spontaneously (e.g., in an otherwise communication-oriented activity) | is predetermined and planned (e.g., as the main focus and goal of a teaching activity) |
| is unobtrusive (minimal interruption of communication of meaning) | is obtrusive (interruption of communicative meaning) |
| presents target forms in context | presents target forms in isolation |
| makes no use of metalanguage | uses metalinguistic terminology (e.g., rule explanation) |
| encourages free use of the target form | involves controlled practice of the target form |
In the lexical domain, incidental learning occurs as a by-product of meaning-focused tasks (Nation, 2023; Webb, 2020). Even though it is theoretically differentiated from intentional learning, some degree of intention is argued to exist in incidental learning. Hence, when students are engaged with meaning-focused L2 input, it is difficult to rule out some degree of consciousness and intention to learn (Boers, 2021; Webb, 2020). In spite of the growing universal significance of learning an additional language, the choice of optimal methods/techniques of instruction and their durability remain open to question. The following paragraphs discuss PPP and TBI, two prominent examples of explicit and implicit instructional frameworks, respectively.
PPP is a classical model of explicit L2 teaching and a dominant approach to teaching linguistic forms; its name stands for its three phases: presentation, practice, and production (Ellis, 2015). The PPP sequence is relatively easy to organize, is compatible with other techniques, and can be operationalized for large groups of students. In addition, the model follows clear and tangible objectives, a precise syllabus, and transparent criteria for evaluating its effectiveness. It involves controlled and free production activities, as seen in Audiolingualism and Communicative Language Teaching. It is proposed that PPP should take place cyclically rather than linearly; along this path, the teacher plays the role of an ‘informant’ in the first phase (presentation), a ‘conductor’ in the second phase (practice), and a ‘guide’ in the third phase (production), and may revert to any previous stage if need be (Ellis & Shintani, 2014).
TBI has been increasingly gaining ground in many contexts over the past two decades (Ellis, 2017). Attention to learner needs, simultaneous emphasis on communicative language during the learning process, and establishing a connection between language learning and real life are among its core tenets. Implicit instruction can range from no focus-on-form options, such as extensive viewing and extensive reading (Nunan & Richards, 2015), to form-focused options, such as input-enhancement, which adopt some levels of explicit manipulation (Ellis, 2015). As reflected in an extensive body of research, in TBI as a moderate type of implicit intervention, learners are invited to focus on meaning more emphatically than form; hence, they engage in a series of communicative feats in addition to doing form-based (i.e., explicit) activities (Ellis, 2015). Figure 1 illustrates where TBI stands.
Figure 1. A Closer Look at Implicit Instruction (Adopted from Ellis, 2015, p. 272)
According to Ellis (2015), two broad categories have been defined for meaning-centered implicit instruction: The first involves pedagogic frameworks such as 'extensive reading' and Krashen and Terrell's 'Natural Approach' (1983), in which there is technically no focus-on-form element. These frameworks adhere to the notion that L2 acquisition will occur automatically and effortlessly as a result of sustained incremental exposure to comprehensible input. The second type of implicit intervention includes frameworks such as 'enhanced input' and 'TBI'. As a more elaborate and perhaps classroom-friendly type of meaning-focused instruction, TBI invites primary attention to communicative meaning while assigning peripheral or periodic attention to form. In TBI, opportunities for incidental acquisition can be engineered through task selection and implementation. Ellis and Shintani (2014) compared the pedagogic features of TBI and PPP (see Table 2).
Table 2. Types of Instruction, Learning, and Methods
| | Syllabus | Type of instruction | Type of activities | Type of learning |
|---|---|---|---|---|
| PPP | Linguistic | PBI (production-based) | Focused | Intentional |
| TBI | Task-based | CBI or PBI (comprehension-based or production-based) | Primarily unfocused | Incidental |
(Adopted from Ellis & Shintani, 2014, p. 118)
Eye Tracking and Reaction Time
Several methodologies have been developed over the past few decades to delve into the cognitive processes underlying language acquisition. One method that has more recently gained traction is eye-movement recording, also known as eye-tracking. Eye-tracking is a suitable procedure to appraise focal and peripheral learner attention, which can have widespread applicability in (I)SLA research (e.g., Godfroid & Hui, 2025; Pellicer-Sánchez, 2020). Meanwhile, measuring and analyzing time can help researchers access more detailed information on the mental underpinnings of performing a task (Jiang, 2012). As scholars are interested in learning how the human mind functions, they aspire to find out how fast people perform a task or respond to a stimulus. Hence, an elicitation technique known as Reaction Time (RT) has been advanced. RT is often implemented in conjunction with other research techniques such as the Moving Window or eye-tracking. Custom-made and commercially available programs are outlets through which the time span of responses can be measured.
To the best of the researchers' knowledge, no substantive attempt has been made to date to measure the comparative effects of classroom-situated implicit and explicit instructional frameworks on PhV acquisition. Such a study may help to inform strategies for PhV pedagogy. Hence, this study set out to test the effectiveness of implicit versus explicit approaches to teaching PhVs and to gauge the learners' ensuing attentional and reaction-time behaviors, in an attempt to contribute to ISLA and the domain of cognitive psychology. In so doing, we proposed the following three research questions:
Methods
Participants and Setting
The main phase of this study followed a quasi-experimental pretest-posttest design in which 62 students from three intact classes (i.e., undergraduate general English courses) at Sharif University of Technology in Iran attended the intervention. They were EFL learners with the same native language (i.e., Farsi) majoring in various engineering fields. Their ages ranged from 19 to 22 (M = 19.92). Each class was randomly assigned to one condition: explicit (PPP), implicit (TBI), or control (see Table 3). The curriculum in these classes, as mandated by the university's policy, focused primarily on enhancing the students' academic reading skills and vocabulary competence. In all general English classes, an identical textbook (Inside Reading, by Oxford University Press, 2nd edition, 2013) was used along with other supplementary materials and worksheets on vocabulary and speaking activities.
Table 3. Participants' Demographic Information
| | Explicit condition | Implicit condition | Control condition |
|---|---|---|---|
| OPT¹ Mean (out of 60) | 34.5 | 35.5 | 33.4 |
| Proficiency level | B1: Intermediate | B1: Intermediate | B1: Intermediate |
| Number of students | 23 (5F, 18M) | 21 (8F, 13M) | 18 (5F, 13M)² |

¹ OPT = Oxford Placement Test
² F = Female, M = Male
Instruments and Materials
Separate detailed lesson plans were devised for the PPP and TBI groups. The instructional materials, including the target PhVs and the related example sentences and definitions, were adopted from McCarthy and O’Dell (2017), as well as other authentic online sources. The target PhVs we selected leaned towards the formal register, and their meanings were not easy to guess. The experiment also involved a placement test, a Vocabulary Knowledge Scale (VKS) pretest, and three posttests. Furthermore, to prevent a washback effect, students were informed that the test results would not negatively affect their final semester scores.
The Offline Instruments
Initially, a 60-item mock Oxford Placement Test (OPT) was administered to assess the students' general language proficiency; a majority of the scores were equivalent to the CEFR's B1 level. Outliers were excluded from the analyses. The OPT was followed by the 31-item VKS test to confirm the students' lack of familiarity with the target PhVs. VKS is a scalar self-report battery that ranges from one (no knowledge) to five (the ability to produce a semantically and syntactically accurate form of the target vocabulary in a sentence).
To minimize the possibility of subsequent recall, some non-PhV items were included in between as distractors, for which no score was allocated during rating. Moreover, we particularly needed to keep the participants of the implicit group in the dark about the content of the instruction.
Three rounds of posttests were administered in this study: immediate, delayed I, and delayed II (Appendix 1). The immediate posttests were given shortly after each instructional period at the end of each class, and the first delayed posttest was administered in a week's time. Delayed posttest II was administered one month after completing the treatments (see Table 4). The test items were intended to measure the learners' recall of the target PhVs. All pre- and posttest items were developed on the basis of the content material taught during the treatment scheme. To establish response time consistency across all posttests, one minute was allocated for each item in the test for all participants. To ensure the internal consistency of the items, Cronbach’s alpha was computed. The total reliability indices were at α=.70 (immediate posttest), α=.68 (delayed posttest I), and α=.62 (delayed posttest II). For considerations of validity, two experts in the domain of applied linguistics were consulted, who examined the items closely and, upon some revisions, confirmed their appropriateness.
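As an illustration of the internal-consistency index reported above, the following is a minimal sketch of how Cronbach's alpha can be computed from an item-by-participant score matrix. The function name `cronbach_alpha` and the `demo` matrix are hypothetical and serve only to show the calculation; they are not the study's data or the software the researchers used.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of participant rows (one score per item)."""
    k = len(scores[0])  # number of items
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Variance of each item across participants
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of the participants' total scores
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 0/1 item scores (5 participants x 4 items), not study data
demo = [
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(demo)
print(round(alpha, 2))
```

Because alpha depends on the ratio of summed item variances to total-score variance, using population or sample variances gives the same value as long as the choice is consistent.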
Table 4. Intervention Chronology
| Session | Activity |
|---|---|
| 1 | Briefing, VKS test as pre-test, teaching 5 PhVs and testing 3. (Also serving as a pilot session) |
| 2 | Teaching 8 PhVs / testing 5 |
| 3 | Teaching 14 PhVs / testing 7 |
| 4 | Teaching 15 PhVs / testing 8 |
| 5 | Teaching 10 PhVs / testing 5 |
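The per-session figures in Table 4 can be tallied with a short sketch such as the one below, which simply sums the taught and tested items. The variable names are illustrative; the numbers are transcribed from the table.

```python
# (session, PhVs taught, PhVs tested), transcribed from Table 4
sessions = [(1, 5, 3), (2, 8, 5), (3, 14, 7), (4, 15, 8), (5, 10, 5)]

taught = sum(n_taught for _, n_taught, _ in sessions)
tested = sum(n_tested for _, _, n_tested in sessions)
share_tested = tested / taught  # proportion of taught PhVs that were tested

print(taught, tested, f"{share_tested:.1%}")
```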
The Online Instruments
The eye-tracking device used in this study was a high-speed Tobii Pro, which can capture gaze data at 1200 Hz. With a 24-inch monitor, the desk-mounted eye tracker can tolerate head movement, providing natural and real-world data. In addition to hardware, the system is equipped with a piece of Windows-based complementary software called Tobii Studio 3.4.8 for designing tasks, recording data, and exporting the intended output in various forms.
In order to measure RT, an Android-based application named Easy Exam Client was developed by the researchers with the aid of an additional specialist. In fact, the dearth of literature on RT prompted us to take the initiative and develop an application that would suit our purposes. With a size of approximately 3 MB, the app was developed in Java using Android Studio 3.2. On its first page, participants entered their particulars. The next page provided a brief definition of PhVs and some supplementary information. The application was piloted with five students, who completed the test in approximately five minutes. Nevertheless, to reduce pressure on the learners, ten minutes were allotted to the test. The students were also told that the software would not shut down once the time limit had passed. During the test, the participants were not allowed to go back to previous questions. After a participant answered 15 four-option items and touched a See Result button, the answer sheet output appeared and was simultaneously saved on the server. On the Result page, next to each question, the accuracy (true or false) and the duration (in milliseconds) of each response could be viewed (see Appendix 3).
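To make the structure of such output concrete, the sketch below aggregates per-item records (accuracy plus response time in milliseconds) into time on correct answers, time on incorrect answers, and total time. The record layout, the names `ItemResponse` and `summarize`, and the assumption that the three time measures are sums over items are all illustrative; the internal data format of Easy Exam Client is not described in this article.

```python
from dataclasses import dataclass


@dataclass
class ItemResponse:
    item: int      # question number
    correct: bool  # whether the chosen option was correct
    rt_ms: int     # response time in milliseconds


def summarize(responses):
    """Aggregate one participant's item-level records (assumed layout)."""
    true_time = sum(r.rt_ms for r in responses if r.correct)
    false_time = sum(r.rt_ms for r in responses if not r.correct)
    return {
        "accuracy": sum(r.correct for r in responses) / len(responses),
        "true_time_ms": true_time,
        "false_time_ms": false_time,
        "total_time_ms": true_time + false_time,
    }


# Hypothetical records for three items, not study data
demo_responses = [
    ItemResponse(1, True, 9200),
    ItemResponse(2, False, 15400),
    ItemResponse(3, True, 7100),
]
summary = summarize(demo_responses)
print(summary)
```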
Procedure
The experiment began with introductory briefing sessions in which an outline of the entire research project was presented to the students. Then, the students were given consent forms to sign prior to the intervention procedure. Importantly, they were informed about the extra credit (one additional point) for participating in the project and about their right to withdraw from the experiment if they had any objections. Details of the offline and online treatments are given below.
The Offline Phase
After devising lesson plans for the experimental conditions (PPP & TBI), the researchers allocated one session to piloting, which included sample instruction and a posttest. A few weeks later, the treatment package began, in which the instructional frameworks were implemented. At the end of each session, an immediate posttest based on the instructed material was administered. Likewise, the delayed posttests were handed out at the outset of the following sessions (one week later). Of the 52 PhVs taught during the entire treatment phase, 28 (about 54%) were subjected to testing. Importantly, the rest of the practiced items in the lessons were used as distractor items in the posttests (see Table 4 for more details). The instructional materials were given to the students on handouts. These handouts were collected at the end of the period, shortly before the posttest, so that the students would not be able to practice the items outside the classroom. During each treatment session, the instructor carried out the tasks and activities on the basis of the established theoretical tenets of the respective approach (PPP versus TBI). In the former condition, PhVs were presented explicitly from the outset, with an emphasis on their definitions and usage. Then, various controlled practice activities characteristic of the PPP framework were conducted.
In the latter condition, the students were taught through exposure to numerous example sentences produced by the teacher. Use of metalanguage or explicit explanation of the nature of the target items was avoided on the part of the teacher. Students were not allowed to use the dictionary; rather, they relied on negotiation of meaning and resorted to the instructor for help. The instructor encouraged discovery learning and prompted students to draw on their inferencing skills to decipher the meaning of the PhVs. As such, a concerted effort was made to provide recasts where necessary, rather than explicit feedback (see Tables 5 & 6 for the detailed lesson plans and classroom activities).
The treatment phase ran for five 50-minute sessions in both experimental conditions (see Appendix 2a & 2b for classroom materials). A control group was also included as a third condition. Unlike the experimental conditions, the control students were engaged in their own regular curricular activities involving reading-based general English lessons. The reading activities consisted of passages and exercises that resembled preparation activities for high-stakes English examinations such as TOEFL, along with the techniques/strategies to be mastered for that purpose. Although the control subjects were not exposed to any pre-planned instructional methods involving PhVs, they were given the same tests as the experimental groups. The last delayed posttest was administered nearly four weeks after the completion of the treatment packages. The whole treatment period lasted nearly 40 days. Each group received one period of instruction each week.
Table 5. Lesson Plan for the TBI Condition
| Stage | Duration (min) | Teacher's role | Students' role | Comment |
|---|---|---|---|---|
| 1 | 10 | Warm up (pre-task): Teacher used gestures, mimes, and example-based sentences to contextualize the target items. In doing so, he had to plan in advance how he could convey the meaning of the target items such that meaning-based focus would precede attention to form. | Students were advised to pay careful attention in order to grasp the message conveyed by the teacher. They were told to prepare themselves for active participation in group work and for monitoring themselves. | Pre-task aimed at providing input, motivating learners and covertly presenting the topic of the task. |
| 2 | 20 | Task cycle: This phase consisted of planning and conducting the task and reporting the results. The teacher monitored the students, while encouraging them to make sense of the presented target words through negotiation and not to worry about mistakes. | Students carried out the task in pairs or small groups of three or four and planned on how to report on the task outcome. They had to report back to the whole class. Every group member was supposed to be prepared to report. They were not allowed to check the new words in a bilingual dictionary. | Sample tasks included: doing role plays, job interviews, information-gap tasks, problem-solving tasks, and jigsaw tasks; matching pictures with related PhVs; creating conversations using the target PhVs; writing emails; creating stories. |
| 3 | 20 | Language focus and feedback (post-task): The teacher verbalized and reviewed some marked errors of the students and summarized the problematic structures (including the PhVs) elicited during the task cycle. | Students paid close attention and took some notes. They were required to analyze themselves and adjust their shortcomings. | Students discussed and negotiated the topics, their problems and errors. Some language-focused tasks were also administered. Errors were treated mostly through the provision of recasts in order not to attract attention. |
Table 6. Lesson Plan for the PPP Condition
| Stage | Time (min) | Teacher's role | Students' role | Comment |
|---|---|---|---|---|
| 1 | 10 | Presentation: Teacher explained the definitions and usage of the PhVs and the other challenging words explicitly. He controlled the class and used texts or visual aids to overtly demonstrate the target situations. | Students were told to listen carefully in order to grasp the message and remember the related vocabulary and/or learn new ones. | All items and structures were directly explained, which was intended to help students remember the items they already knew and to activate their schemata. Metalanguage was frequently employed. |
| 2 | 20 | Practice: The teacher used and monitored activities (often controlled) such as drills in order to model the correct forms and provide positive evidence/feedback. | Students were to repeat and practice new PhVs in different sentences under teacher’s supervision. | Activities ranged from substitution drills and multiple-choice exercises, to sentence-matching and gap-and-cue exercises. |
| 3 | 20 | Production: The teacher did his best to manage the processing of activities and help students to perceive the materials accurately. Where necessary, body language was also used to elicit the production of the target items. | Students designed and carried out role-plays the way they desired, without being under pressure to use a particular linguistic construction. They used the newly learnt linguistic input in order to produce oral or written texts while engaged in activities. | Elicited activities included oral presentations and controlled discussions in the form of sentences and/or longer texts. |
The Online Phase (Eye-tracking)
One day after completion of the last treatment session, participants who had earlier consented to volunteer were scheduled to take part in the eye-tracking experiment. In total, 32 students (18 from the explicit group and 13 from the implicit group) attended the eye-tracking lab, where each individual was briefed with the necessary instructions. After all responses had been recorded, four areas of interest (AoI) were designated: the question stem (Q), options (O), the correct option (C), and the whole question (W) (see Figure 2).
Figure 2. Areas of Interest for a Sample Item
To examine the participants across the two conditions further through a psycholinguistic lens, seven eye movement measures were taken into account: time to first fixation, first fixation duration, total fixation duration, fixation count, visit duration, total visit duration, and visit count. We were particularly interested in ascertaining whether the individuals who had undergone an explicit mode of instruction would display any significantly different processing behavior from their counterparts in the implicit condition. To this end, we applied a wide range of eye-tracking metrics in order to record exhaustive sets of data (for definitions of these metrics, see, e.g., Montero Perez, 2019). Figures 3, 4, 5, and 6 further illustrate the metrics and stages associated with the implementation of the eye-tracking phase.
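As a rough illustration of how AoI-based measures of this kind are derived, the sketch below computes several of them from a time-ordered list of fixations. It rests on simplifying assumptions that are not taken from the study: each fixation is labelled with exactly one AoI (whereas the study's AoIs overlap), a visit is a run of consecutive fixations on the same AoI, and a visit lasts from the start of its first fixation to the end of its last. The function name `aoi_metrics` and the sample data are hypothetical, and the values exported by the eye-tracking software may be defined differently.

```python
from collections import defaultdict


def aoi_metrics(fixations):
    """fixations: time-ordered list of (aoi, start_ms, duration_ms) tuples."""
    metrics = defaultdict(lambda: {
        "time_to_first_fixation": None,
        "first_fixation_duration": None,
        "total_fixation_duration": 0,
        "fixation_count": 0,
        "visit_count": 0,
        "total_visit_duration": 0,
    })
    prev_aoi = None     # AoI of the previous fixation
    visit_start = None  # start time of the current visit
    prev_end = None     # end time of the previous fixation
    for aoi, start, duration in fixations:
        m = metrics[aoi]
        if m["fixation_count"] == 0:
            m["time_to_first_fixation"] = start
            m["first_fixation_duration"] = duration
        m["fixation_count"] += 1
        m["total_fixation_duration"] += duration
        if aoi != prev_aoi:
            # Gaze moved to a different AoI: close the previous visit
            if prev_aoi is not None:
                metrics[prev_aoi]["total_visit_duration"] += prev_end - visit_start
            m["visit_count"] += 1
            visit_start = start
        prev_aoi = aoi
        prev_end = start + duration
    if prev_aoi is not None:
        # Close the final visit
        metrics[prev_aoi]["total_visit_duration"] += prev_end - visit_start
    return dict(metrics)


# Hypothetical fixation sequence over the stem (Q), options (O), and correct option (C)
demo_fixations = [
    ("Q", 0, 200), ("Q", 230, 180), ("O", 450, 300), ("C", 800, 250), ("Q", 1100, 150),
]
result = aoi_metrics(demo_fixations)
print(result["Q"])
```

Under these assumptions, an average per-visit duration can be obtained by dividing `total_visit_duration` by `visit_count`.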
Figure 3. Eye-tracking Processes
Figure 4. Eye-tracking Experiment in Progress
Figure 5. Sample Graphical Output Representing a Heat-map
Figure 6. Sample Graphical Output of the Eye-tracker Representing the Gaze-sequence
Results
As for the first research question, repeated-measures ANOVA and multivariate analysis of variance (MANOVA) tests were used to identify the effectiveness of each method of instruction on the short- and long-term recall of PhVs. To assess the psycholinguistic consequences of instruction on the learners' eye behavior (Research Question 2) and RT (Research Question 3), a series of Mann-Whitney U tests and independent-samples t-tests were conducted, respectively. To explore the homogeneity of scores in terms of language proficiency (i.e., OPT), a one-way ANOVA was run, which revealed no significant difference between the three groups at the .05 level: F(2, 51) = 1.28, p = .28; TBI (M = 35.50, SD = 9.85), PPP (M = 35.18, SD = 8.53), and Control (M = 33.40, SD = 7.96). With regard to the pretest (i.e., VKS), another one-way ANOVA likewise found no statistical difference at the .05 level: F(2, 52) = .26, p = .76; TBI (M = 41.06, SD = 3.96), PPP (M = 42.91, SD = 5.77), and Control (M = 44.07, SD = 6.18). Hence, the groups were treated as homogeneous at the outset. Prior to analyzing the MANOVA outcome, the data were checked for normality (e.g., Kolmogorov-Smirnov = .09, p = .20, for the first delayed posttest; and Kolmogorov-Smirnov = .11, p = .15, for the second delayed posttest), equality of covariance matrices (Box's M = 8.55, p = .80), and linearity to see whether they satisfied the assumptions. Table 7 portrays how the groups performed on the posttests.
Table 7. Descriptive Statistics
| Posttest | Group | Mean | SD | N |
|---|---|---|---|---|
| Immediate | PPP | 17.26* | 2.82 | 23 |
| | TBI | 16.44 | 2.64 | 18 |
| | Control | 8.53 | 2.90 | 15 |
| Delayed I | PPP | 15.60 | 3.44 | 23 |
| | TBI | 15.50 | 3.91 | 18 |
| | Control | 9.09 | 5.55 | 15 |
| Delayed II | PPP | 13.45 | 3.88 | 23 |
| | TBI | 12.89 | 3.88 | 18 |
| | Control | 9.62 | 5.34 | 15 |

*Maximum score = 20
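The one-way ANOVA behind the baseline homogeneity checks (OPT and VKS) reported before Table 7 reduces to a ratio of between-group to within-group mean squares. The sketch below shows that computation on hypothetical scores; the function name `one_way_anova` and the sample data are illustrative, the study's raw scores are not reproduced here, and obtaining the p value would additionally require the F distribution (available in standard statistics packages).

```python
def one_way_anova(*groups):
    """Return (F, df_between, df_within) for independent groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: scores around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scores for three groups, not study data
f_stat, df_b, df_w = one_way_anova([1, 2, 3], [2, 3, 4], [3, 4, 5])
print(f_stat, df_b, df_w)
```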
From a within-groups perspective, a one-way repeated-measures ANOVA revealed that the PPP group, which gained the highest mean score at the immediate posttest, declined significantly at the first delayed posttest and again at the second; Wilks’ Lambda = .39, F = 16.58, p < .001, ηp² = .61. For the TBI group, however, the ANOVA located no difference between the immediate and the first delayed posttests. Yet, retention dropped significantly from the first to the second delayed posttest; Wilks’ Lambda = .30, F = 18.76, p < .001, ηp² = .70. As for the control group, no significant effect for time was found across the three rounds; Wilks’ Lambda = .70, F = 2.84, p = .09.
Regarding the between-groups findings, the immediate recall test revealed a statistical difference between the experimental conditions and the control condition with a very large overall effect size (F = 44.26, p < .001, ηp² = .65) (Cohen, 1988). However, no such difference was found between the two experimental conditions. The one-week delayed recall test revealed a similar pattern (F = 3.88, p < .05, ηp² = .11), although the effect size was not as large as that of the immediate posttest. The results for the one-month follow-up test revealed a statistical difference (F = 14.09, p < .001, ηp² = .37) between the scores of the PPP group and the control group, but not between the TBI and the control groups (p > .1). Based on the findings, both the PPP and TBI groups outperformed the control group on the immediate and the first delayed posttests; Wilks’ Lambda = .291, F = 69.81, p < .001, ηp² = .46. On the one-month delayed test, both conditions experienced a drop in scores; however, only the explicit condition managed to sustain significant gains over the control condition (p = .03). Figure 7 illustrates how the three groups performed across the three posttests.
Figure 7. Group Performances Across the Three Posttests
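The partial eta squared values reported alongside Wilks' Lambda above are consistent with the standard multivariate relation ηp² = 1 − Λ^(1/s), where s is the smaller of the number of dependent variables and the hypothesis degrees of freedom. The sketch below applies that relation; the choice of s = 1 for the within-group tests and s = 2 for the three-group MANOVA is an assumption made here for illustration, since the article does not state how the values were obtained.

```python
def eta_squared_from_wilks(wilks_lambda, s=1):
    """Multivariate partial eta squared from Wilks' Lambda (s is assumed)."""
    return 1 - wilks_lambda ** (1 / s)


# Within-group tests (assumed s = 1)
ppp_eta = eta_squared_from_wilks(0.39)
tbi_eta = eta_squared_from_wilks(0.30)
# Between-group MANOVA with three groups (assumed s = 2)
manova_eta = eta_squared_from_wilks(0.291, s=2)

print(round(ppp_eta, 2), round(tbi_eta, 2), round(manova_eta, 2))
```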
To probe the second research question, a series of Mann-Whitney U tests was run, as the data distribution at this stage was not sufficiently normal. The output revealed that there were no significant differences between the experimental conditions at the .05 level in terms of any of the specified metrics, suggesting that the two groups' eye-movement behavior was technically analogous (see Appendix 4).
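For readers unfamiliar with the test, the following minimal sketch shows how the Mann-Whitney U statistic is obtained from the pooled ranks of two independent samples, using average ranks for ties. The function name `mann_whitney_u` and the sample values are hypothetical; the sketch returns only the U statistics, and significance testing would normally be left to a statistics package.

```python
def mann_whitney_u(x, y):
    """Return (U, U1, U2) for two independent samples, with tie-averaged ranks."""
    # Pool the observations, remembering which sample each came from
    pooled = sorted((value, group) for group, sample in enumerate((x, y)) for value in sample)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over a run of tied values
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n1, n2 = len(x), len(y)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    return min(u1, u2), u1, u2


# Hypothetical fixation counts for two small groups, not study data
u, u1, u2 = mann_whitney_u([1, 2, 3], [4, 5, 6])
print(u, u1, u2)
```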
The third research question inquired whether the implicit (TBI) and explicit (PPP) instructional treatments led to varying RT behavior on the part of the learners. To address this inquiry, a series of independent samples t-tests were run, upon checking the data normality assumption (Kolmogorov-Smirnov=.139, p=.12, for the True Answers Time; Kolmogorov-Smirnov=.101, p=.20, for the False Answers Time; and Kolmogorov-Smirnov=.102, p=.20, for the Total Answers Time). The findings revealed that in terms of the three metrics, there was no significant difference in scores between the TBI and PPP groups (see Table 8).
Table 8. RT Descriptive and Inferential Statistics for the Experimental Conditions
| Metric | Group | N | Mean (Milliseconds) | SD | Z-score (Mean) | t | Sig. (2-tailed) |
|---|---|---|---|---|---|---|---|
| App True Time | Explicit | 17 | 132036.12 | 45417.62 | .09 | .70 | .49 |
| | Implicit | 13 | 119652.92 | 50726.83 | -.17 | | |
| App False Time | Explicit | 17 | 72725.00 | 47822.77 | -.01 | 1.20 | .24 |
| | Implicit | 13 | 52898.46 | 40846.65 | -.36 | | |
| App Total Time | Explicit | 17 | 205095.94 | 63784.41 | .05 | 1.45 | .16 |
| | Implicit | 13 | 172551.38 | 57382.29 | -.39 | | |
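The t values in Table 8 can be approximately recomputed from the reported means, standard deviations, and group sizes. The sketch below assumes the equal-variance (pooled) form of the independent samples t-test, which the article does not state explicitly; the function name `pooled_t` and the dictionary `table8` are hypothetical, and the figures are copied from Table 8.

```python
import math


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Student's t for two independent groups from summary statistics.

    Assumption: equal variances (pooled standard deviation), not Welch's test.
    Returns the t statistic and its degrees of freedom.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))   # standard error of the mean difference
    return (m1 - m2) / se, df


# (explicit mean, SD, N, implicit mean, SD, N), in milliseconds, from Table 8.
table8 = {
    "App True Time": (132036.12, 45417.62, 17, 119652.92, 50726.83, 13),
    "App False Time": (72725.00, 47822.77, 17, 52898.46, 40846.65, 13),
    "App Total Time": (205095.94, 63784.41, 17, 172551.38, 57382.29, 13),
}

results = {name: pooled_t(*row) for name, row in table8.items()}
for name, (t, df) in results.items():
    print(f"{name}: t({df}) = {t:.2f}")
```

Under the pooled-variance assumption, the computed statistics round to the t values reported in the table (.70, 1.20, and 1.45), each with 28 degrees of freedom.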
Discussion
The main goals of the study were to compare the effectiveness of explicit versus implicit instructional frameworks and to ascertain whether the experimental conditions would undergo divergent attentional and reactional experiences. The between-groups results suggested that both PPP and TBI significantly improved the students’ recall of the target PhVs. This improvement persisted for at least one week, as evidenced by the first delayed posttest. Yet, on the second delayed posttest, which measured the durability of the effects of instruction, only the explicit condition outperformed the control group; the implicit group failed to attain such superiority. Regarding within-group performances, the PPP group displayed high immediate gains but failed to maintain them in the medium (one-week) and longer terms (one-month follow-up). The TBI group, however, exhibited a more sustainable performance at the second posttest. Yet, at the final posttest, the outcome was less promising.
The overall within-group analysis suggests that more explicit treatments of PhVs (i.e., PPP) may pay more immediate dividends, whereas less explicit treatments (i.e., TBI) may induce higher degrees of retention (at least for a few more days). Meanwhile, the eye-tracking and RT experiments were largely consistent with the offline findings, as no short-term difference was discerned between the two groups. In the eye-tracking phase, we were expecting to see whether the selected members from the two experimental groups varied in terms of the measured fixation counts and visit durations as a result of dissimilar instructional treatments. Specifically, four AoI were chosen to demonstrate whether the two treatment types had any bearing on the participants' eye movements in terms of the seven specified metrics. Based on the literature (e.g., Ellis & Shintani, 2014), implicit instruction facilitates spontaneity and free use of the target forms. Thus, we assumed the TBI group might have recorded shorter eye fixations and visits while answering the questions. As such, any hypothetical advantage in favor of the TBI group would have implied that less explicit and contextualized focus on PhVs helped students arrive at the correct responses through shorter fixations, and that more implicit practices would lessen the processing time and the duration of visits on the distractor items. On the contrary, had the explicit group proved superior, this could have implied that the explicit-dominated activities embedded in the PPP framework would necessitate or lead to shorter processing time. Of course, none of these scenarios proved to be the case.
In retrospect, regardless of the type of instruction, most (quasi)experimental studies of PhVs to date have documented some sort of transient impact. However, a major attribute of the current study was to assess the durability of the interventions, a procedure neglected in many studies on instructional productivity (also see Yasuda, 2010). The delayed posttests were administered in line with Ellis and Shintani's (2014, p. 103) contention that “if explicit instruction is to be truly useful, its effect must be durable.” Our findings, particularly those obtained in the second delayed posttest, tend to support Norris and Ortega's (2000) conclusion. In their synthetic analysis of 49 unique experimental studies published over 18 years, support was gathered for the superiority of explicit over implicit instruction. The conclusion was reaffirmed in Spada and Tomita’s (2010) meta-analysis, where they found explicit instruction to be more effective for both simple and complex features.
The delayed results in our experiment provide additional empirical support, in part, for the more facilitative durable effects of establishing awareness, typical of PPP tenets, for adult learners from the outset of intervention (also see Leow, 2015). The findings here seem to go against what Kang et al. (2019) observed in their meta-analysis, where implicit instruction was found to produce greater effect sizes at the delayed posttests. In addition, we should note that such a favorable outcome for explicit techniques may not be expected to occur for young L2 learners. In a comparable study, for example, Shintani (2013) compared PPP and TBI in terms of the acquisition of vocabulary by young beginning-level learners. In spite of demonstrating the effectiveness of both approaches, she found TBI to fare better for young learners, which could be most ideally operationalized in terms of comprehension-based rather than production-based tasks.
Our eye-tracking and RT experiments suggested that there may be no substantial difference between the outcomes of implicit and explicit pedagogies, at least in the short and medium terms. The implicit group's longer-term atrophy in remembering the PhVs could arise from a number of reasons, which were also previously hypothesized by Laufer (2005). First, understanding the overall message of the text often hampers individuals from attending to unfamiliar words. Second, the dearth of direct attention and sufficient engagement with the new words may take its toll on retention (also see Godfroid & Hui, 2025).
Meanwhile, implicit instruction has been attributed to several theoretical positions, including focus-on-form (Long, 1991) and the Dual Mode Model (Skehan, 1998). The former points to the synergy or pedagogic balance between meaning and form presentation, while the latter is grounded on the assumption that learners experience difficulty when attending to form and meaning simultaneously due to their limited processing capacity. In this model, it is proposed that sustained exposure to meaning will activate learners' exemplar-based system while attention to form makes for accessing the rule-based system (also see Ellis, 2015).
On balance, for PhVs’ instruction, designing a hybrid model of pedagogy, composed of both explicit and implicit procedures, seems most helpful and conducive to balanced development. Ellis (2017) subscribes to a “modular curriculum” which allows for combining TBI and PPP. This alternative approach consists of unconnected modules within a course. Nevertheless, there are differences between the modular curriculum and the hybrid approach brought up earlier. First, the former approach builds on the weak interface position (Ellis, 2017), whereas the latter is premised on the strong interface position. Second, in the former approach, modules of teaching are separated and unconnected (Ellis, 2017), while in the hybrid approach, the teacher can switch between implicit and explicit options, even during a single instructional sequence (as in a hybrid car!). Our findings corroborate the usefulness of both incidental and intentional pedagogic procedures. According to Webb (2020), there is value in both methods, and they should not be viewed as being in competition. Specifically, Boers (2021) enumerated eight factors as major determinants of incidental acquisition of vocabulary, which ought to be taken into consideration as a way of narrowing down the scope of scrutiny in future research.
Considering the outcome of the present eye-tracking experiment, explicit and implicit pedagogies did not register any dissimilar eye-movement behavior. This finding further suggests that implicit and explicit pedagogies may not always differ dramatically in terms of attentional representations. Here, the findings do not accord with those of Indrarathne and Kormos (2017), who used eye-tracking to investigate Sri Lankan L2 learners’ attentional processing of a target syntactic construction in written L2 input across implicit and explicit conditions. In their study, the two explicit conditions yielded higher gains than the implicit conditions, which involved input flood and input enhancement treatments. Indrarathne and Kormos found explicit knowledge significantly outstripping implicit knowledge in terms of total fixation duration. Nevertheless, they did not administer any delayed tests to measure the long-term effects of the intervention.
On analyzing the data of the RT experiment, no statistically significant difference was discerned between the explicit and implicit conditions in terms of the metrics applied. However, a consistent numerical advantage (i.e., signs of agility) for the implicit group was evident, as these learners recorded shorter times on all three metrics. Had the sample been larger, the difference might have reached statistical significance. The findings here were, in part, compatible with Blais and Gonnerman (2013), who explored the performance of non-native English speakers through implicit and explicit measures of PhVs. Using the RT method, among others, they concluded that the properties of PhVs can be both implicitly and explicitly internalized by L2 speakers whose first language lacks the construction.
Conclusions
On the basis of the findings, a number of pedagogical implications can be inferred. Foremost among them is the realm of curriculum development. As hinted earlier, the lack of an organized curriculum in PhV pedagogy has caused considerable confusion for teachers and learners alike. Undoubtedly, the objectives of the course play a key role. Yet, if the students are supposed to retain PhVs over the long term with an appropriate level of automaticity, a compound syllabus consisting of both implicit (e.g., TBI) and explicit procedures (e.g., PPP) is recommended, perhaps a syllabus drawing on explicit instruction and punctuated by communicative tasks. To avoid arbitrariness in teaching PhVs, curriculum developers, material writers, and L2 practitioners are expected to act in concert, as coordination between research and practice is a matter of great urgency in this regard. Furthermore, the study clarified that for L2 learners, unconscious and implicit transfer of knowledge can be beneficial, but for a more durable impact, conscious awareness should additionally be put on the agenda for adult learners. The issue is more substantive in those EFL contexts where learners may have less frequent encounters with PhVs and are often exposed to single-word verbs during their conventional experiences at typical language learning centers. We also strove to address Ellis' (2017) call for conducting task-based lessons in their entirety, as most TBI studies to date have largely failed to tap the full impact of task phases (pre-task, main task, and post-task) in one study.
Another intricate consideration was put forward by Ellis (2015) on the variability and volatility of such theoretically-driven instructional frameworks. Both explicit and implicit varieties of instruction can take many different forms and have thus far met with various degrees of success/failure. It is thus one major responsibility of experimental classroom research in ISLA to appraise the pedagogic gains that accrue from different versions of explicit and implicit instruction, for any enlightenment in this respect can open up new and wide avenues for the vast community of L2 practitioners across the board who are looking to the findings of empirical research to resolve their ambiguities for decision-making (also see Nation, 2023). Therefore, one caveat, according to Boers (2021), is that the efficiency of explicit/implicit pathways for pedagogy is contingent upon sound practices and the quality of implementation. It should be noted that, on a one-to-one comparison, the PPP group did not directly outperform the TBI group; it outperformed only the control group. Thus, the resulting differences should be properly understood and reported in future research.
This study was subject to several limitations. Although Norris and Ortega (2000) considered treatments of over three hours as an acceptable norm, given the widespread occurrence of PhVs, lengthier treatment sessions can provide more solid results. This could additionally influence the possible impact of pedagogy in the long term. As hinted earlier, the aim of implicit instruction is not only exposing learners to the target forms, but also attracting their peripheral attention to those forms. The focal attention, however, is placed on communicative meaning and context of use (Ellis, 2015).
Furthermore, the classroom time constraints during the instructional experiments and the second delayed posttest did not allow the incorporation of production test items, which normally take more time to complete. Hence, the outcome of the study could only account for the receptive, rather than the productive, knowledge of PhVs. Time limitation was also an issue for managing and conducting various phases of this study. Since bringing students together for the RT experiment after the end of the semester was difficult, we applied a customized mobile app, which helped provide our participants with anytime-anyplace access. Hence, we avoided the available software, which often lacks such flexibility. One limitation of RT use is that, in spite of producing helpful concurrent data surrounding the notions of automaticity and speeded-up performance, it does little to measure awareness as a pivotal concept in cognitive psychology (see Leow, 2015).
Overall, in this study, we selected the target PhVs without any systematic sorting; however, future research can exploit classified items based on their semantic transparency or collocational properties. To conclude, PhVs’ pedagogy, as challenging as it is for many L2 learners, merits investigation through more innovative means.
Declaration of Conflicting Interests
The authors declare no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Appendices
Appendix 1: A sample posttest
Appendix 2(a): Classroom material for a sample PPP lesson
Part 1: Presentation

Part 2: Practice
1- She held forth on a variety of subjects all through lunch.
2- He heard a voice behind him and swung around in surprise.
3- She phoned him to ask him out for a drink.
4- You’ll have to speed up – they’re gaining on us!
5- A police officer flagged the car down.
6- The news has wiped nearly a third off the value of the company’s shares.
7- There was a growing concern that the war might spill over into the other countries.
8- His latest novel, already a bestseller, was dashed off in under three weeks.
9- She went through absolute hell during her divorce.
10- Respective governments have papered over many of the disagreements between the two countries.
11- Just before he left to board his plane she suddenly blurted out, ‘I love you’
12- They had failed in their attempt to head off the incoming missiles.
13- Can you flesh out your report with some figures?
14- They mocked up the newly designed car so that its shape could be studied in more detail.
15- She took a gulp of milk to wash the pills down.
Part 3: Production
Write down an email to your friend and explain a bad experience which you’ve had in a restaurant last night. Use the phrasal verbs from part 1 in your email. Then, read it for the class.
Appendix 2(b): Classroom material for a sample TBLT lesson
Part 1: Pre-task
Contextualization
1- She held forth on a variety of subjects all through lunch.
2- He heard a voice behind him and swung around in surprise.
3- She phoned him to ask him out for a drink.
4- You’ll have to speed up – they’re gaining on us!
5- A police officer flagged the car down.
6- The news has wiped nearly a third off the value of the company’s shares.
7- There was a growing concern that the war might spill over into the other countries.
8- His latest novel, already a bestseller, was dashed off in under three weeks.
9- She went through absolute hell during her divorce.
10- Respective governments have papered over many of the disagreements between the two countries.
11- Just before he left to board his plane she suddenly blurted out, ‘I love you’
12- They had failed in their attempt to head off the incoming missiles.
13- Can you flesh out your report with some figures?
14- They mocked up the newly designed car so that its shape could be studied in more detail.
15- She took a gulp of milk to wash the pills down.
Part 2: Main Task
Work in pairs and write down an imaginary conversation between a president and a journalist. Discuss the current political and economic issues and other important topics such as the upcoming election, using the phrasal verbs from Part 1. Then act out the dialog in front of the class.
Student A (the journalist) should try and ask challenging questions!
Student B (the president) should give smart answers!
Part 3: Language focus and feedback
Definition: A Phrasal verb is a verb that is made up of a main (proper) verb and a particle. Typically, their meaning is not obvious from the meaning of individual words themselves.
Appendix 3: Screenshots of the RT app
Appendix 4: Eye-tracking descriptive and inferential statistics
| Eye-tracking metrics in each AoI | Group | N | Mean (in seconds) | SD | Mann-Whitney U Value | Z-score | Sig. (2-tailed) |
|---|---|---|---|---|---|---|---|
| Time of First Fixation | explicit | 18 | 5.52 | 2.014 | 99.00 | -.72 | .47 |
| | implicit | 13 | 4.90 | 1.48 | | | |
| Time of First Fixation | explicit | 18 | 1.80 | 1.55 | 117.00 | .00 | 1.00 |
| | implicit | 13 | 1.63 | 1.14 | | | |
| Time of First Fixation | explicit | 18 | .48 | .50 | 74.00 | -1.72 | .09 |
| | implicit | 13 | .33 | .29 | | | |
| Time of First Fixation | explicit | 18 | .12 | .37 | 66.50 | -2.02 | .05 |
| | implicit | 13 | .02 | .025 | | | |
| First Fixation Duration | explicit | 18 | .31 | .09 | 110.00 | -.28 | .78 |
| | implicit | 13 | .30 | .07 | | | |
| First Fixation Duration O. tot | explicit | 18 | .23 | .06 | 107.00 | -.40 | .69 |
| | implicit | 13 | .22 | .04 | | | |
| First Fixation Duration Q. tot | explicit | 18 | .21 | .04 | 68.50 | -1.94 | .05 |
| | implicit | 13 | .23 | .03 | | | |
| First Fixation Duration W. tot | explicit | 18 | .20 | .04 | 108.00 | -.36 | .72 |
| | implicit | 13 | .20 | .02 | | | |
| Total Fixation Duration C. tot | explicit | 18 | 1.81 | .82 | 85.00 | -1.28 | .20 |
| | implicit | 13 | 1.57 | .46 | | | |
| Total Fixation Duration O. tot | explicit | 18 | 5.47 | 2.27 | 90.00 | -1.08 | .28 |
| | implicit | 13 | 4.93 | 1.41 | | | |
| Total Fixation Duration Q. tot | explicit | 18 | 7.11 | 2.75 | 70.00 | -1.88 | .06 |
| | implicit | 13 | 5.80 | 1.29 | | | |
| Total Fixation Duration W. tot | explicit | 18 | 13.45 | 4.78 | 82.00 | -1.40 | .16 |
| | implicit | 13 | 12.04 | 2.26 | | | |
| Fixation Count C. tot | explicit | 18 | 5.17 | 2.03 | 112.50 | -.18 | .86 |
| | implicit | 13 | 5.02 | 1.32 | | | |
| Fixation Count O. tot | explicit | 18 | 15.93 | 5.72 | 107.00 | -.40 | .69 |
| | implicit | 13 | 15.63 | 4.22 | | | |
| Fixation Count Q. tot | explicit | 18 | 29.49 | 10.79 | 84.00 | -1.32 | .19 |
| | implicit | 13 | 25.33 | 6.31 | | | |
| Fixation Count W. tot | explicit | 18 | 48.68 | 15.17 | 101.00 | -.64 | .52 |
| | implicit | 13 | 45.68 | 8.24 | | | |
| Visit Duration C. tot | explicit | 18 | .60 | .18 | 93.00 | -.96 | .34 |
| | implicit | 13 | .55 | .19 | | | |
| Visit Duration O. tot | explicit | 18 | 1.35 | .37 | 94.00 | -.92 | .36 |
| | implicit | 13 | 1.26 | .45 | | | |
| Visit Duration Q. tot | explicit | 18 | 2.38 | .77 | 77.00 | -1.60 | .11 |
| | implicit | 13 | 1.90 | .49 | | | |
| Visit Duration W. tot | explicit | 18 | 14.17 | 4.7 | 84.00 | -1.32 | .19 |
| | implicit | 13 | 12.59 | 2.51 | | | |
| Total Visit Duration | explicit | 18 | 1.89 | .83 | 88.00 | -1.16 | .25 |
| | implicit | 13 | 1.63 | .47 | | | |
| Total Visit Duration O. tot | explicit | 18 | 5.89 | 2.26 | 92.00 | -1.00 | .32 |
| | implicit | 13 | 5.31 | 1.56 | | | |
| Total Visit Duration | explicit | 18 | 8.10 | 2.95 | 74.00 | -1.72 | .08 |
| | implicit | 13 | 6.69 | 1.65 | | | |
| Total Visit Duration | explicit | 18 | 15.76 | 4.53 | 84.00 | -1.32 | .19 |
| | implicit | 13 | 13.97 | 2.80 | | | |
| Visit Count C. tot | explicit | 18 | 3.29 | 1.02 | 101.00 | -.64 | .52 |
| | implicit | 13 | 3.12 | .73 | | | |
| Visit Count O. tot | explicit | 18 | 4.74 | 1.26 | 107.00 | -.40 | .69 |
| | implicit | 13 | 4.61 | 1.19 | | | |
| Visit Count Q. tot | explicit | 18 | 3.84 | .84 | 110.00 | -.28 | .78 |
| | implicit | 13 | 4.06 | 1.06 | | | |
| Visit Count W. tot | explicit | 18 | 1.38 | .92 | 108.00 | -.37 | .71 |
| | implicit | 13 | 1.20 | .20 | | | |
* Total