Legal psychology is a field of research which seeks to bring evidence-based practice to the vital work of the criminal justice system. This research is increasingly being conducted and its findings applied around the world. However, legal systems and their processes vary greatly worldwide. In the current paper, we summarise discussions between legal psychology academics and criminal justice practitioners from Japan. Together, we examine how practices in the Japanese criminal justice system interact with the international evidence base for psychologically-informed ‘best practice’ approaches. Our discussion targets areas of popular study in legal psychology, focusing on concealed crime knowledge detection, line-up identification procedures, and investigative interviewing of witnesses, suspects, and victims. Each section features a description of current Japanese practice, followed by a review of the current state of the relevant academic legal psychological literature. We then connect this practice and research synergy to a reflection with suggestions for future research. Taken together, our paper acts as a conduit to incentivise more research and practice collaboration for Japanese and non-Japanese audiences and presents opportunities for collective international legal psychology.
Introduction
The aim of legal psychology is to produce evidence-based recommendations for improving the quality of justice systems. However, research in this area is dominated by American and Western European experimental laboratory studies. This research on perception, memory, and social processes has usefully influenced guidelines on topics such as how to evaluate the credibility of testimony or mitigate biases in legal decision-making around the world. However, we are not always aware of the upper limits on the utility of this literature in non-Western contexts, especially since applied research focusing on best practices in and around criminal investigations is mainly concerned with practices in Western societies and is not routinely evaluated against systems across the world.
Our work here is the result of extended interactions between Japanese practitioners and academics and Western-trained academics to identify mutual benefits. Our purpose is to highlight similarities and differences between Western and Japanese practices and to facilitate cross-cultural collaboration.
We do so by looking at procedures present in all countries, such as interviewing and line-ups, but also at techniques unique to the Japanese system, such as the Concealed Information Test. We start by summarizing investigative interviewing practices in Japan (取調べ - torishirabe), distinguishing between interviews with suspects, victims, and witnesses. Next, we discuss the Concealed Information Test, a technique for detecting memory of crime details in suspects that is used only in Japan, and Japanese line-up identification procedures (面割 - men-wari). We begin each section by describing Japanese practice, followed by an overview of the scientific literature pertinent to that process. Together, both parts can help scientists identify novel research problems and give practitioners insight into the scientific validation of modern practices. Finally, we propose future directions for scientific inquiry to promote collaboration between practitioners and scientists.
Investigative Interviewing (取調べ - torishirabe)
Japanese practice - suspects
The practice of interviewing suspects in Japan has recently undergone considerable reform. The Japanese National Police Agency (NPA) came under intense public scrutiny after miscarriages of justice were brought to light in the new millennium (Kamiya, 2010; Onishi, 2007). In all of these cases, innocent suspects were wrongfully convicted on the basis of coerced confessions (see Ito, 2012). In light of this, the NPA issued a policy with the aim of improving investigative interviewing practices. The policy stipulated that the timing and duration of interviews should be limited (e.g. no longer than 8 hours in duration, with no interviews occurring between 10 pm and 5 am), and that interviews should be supervised (Ito, 2012). In addition, the NPA established a committee to oversee additional improvements through the inclusion of evidence-based psychological approaches. In 2008, Japanese police forces began partially recording select investigative interviews on a trial basis—first in a handful of prefectures, then across the country. Full recording of investigative interviews for cases tried by a team of citizen (Saiban-In) and professional judges became mandatory in June of 2019, though cases tried in this manner are few in number (0.3% of all cases as of 2021; Japanese Ministry of Justice, 2022).
Perhaps the most significant step toward reform came in 2012, when the NPA launched a program to advance investigative interviewing methodology and published a basic training manual for investigative interviewing. The two-chapter manual includes information on the psychology of memory, false confessions, the Cognitive Interview, questioning styles, and how to avoid leading questions (Wachi & Watanabe, 2016). The manual was developed in collaboration with psychologists, and draws on approaches used in western nations, such as the PEACE model from England and Wales (PEACE being a mnemonic for the five stages of the model; Plan and Prepare, Engage & Explain, gather an Account, before bringing the interview to Closure, and then undertake Evaluation).
In 2013, the Research and Training Center for Investigative Interview and Interrogation Techniques was established in the National Police Academy. The Center provides training in investigative interviewing to police officers across Japan, using the manual mentioned earlier. The training takes place over an 8-day period and the participants (typically 10) are chief inspectors from prefectural police headquarters. Trainees are given the opportunity to engage in mock interviews and to critique real (video-recorded) interviews conducted in their prefectures. The trainees are then expected to communicate what they have learned to the front-line officers they supervise. The Center also offers training at regional police headquarters (Wachi & Watanabe, 2016). Masuda and Wachi (2018) examined mock interviews conducted by 45 trainees both before and after they received the training. They found that participants elicited more information from interviewees and were more likely to use open-ended questions after completing the training. These results support the efficacy of the training programme.
In current practice, suspects are first interviewed by the police. If the police decide that it is necessary to detain a suspect, the suspect can be held under arrest for up to 72 hours (Art. 205 [2], Code of Criminal Procedure 1948: CCP 1948). Suspects must be referred to a public prosecutor within 48 hours of arrest. The prosecutor then has 24 hours to request a judge to detain the suspect for an initial period of ten days (Art. 207 and 208, CCP 1948), and may request an extension of ten additional days if necessary (Art. 208 [2], CCP 1948). Suspects may therefore be detained for questioning for a total of 23 days. Prosecutors and police officers may interview them repeatedly during this period. There are no clear rules for the number and timing of the interviews, or with regard to role sharing between police and prosecutors. However, in cases where the police have already obtained a confession, a prosecutor will usually interview the suspect only to assess the credibility of the confession. If suspects present with mental health problems, they may be assessed by a psychiatrist while in detention. It is also possible for prosecutors to obtain a warrant to detain these suspects for a longer period (usually a few months) to allow for a more detailed psychiatric evaluation. Suspects may engage legal counsel during the detention period, but defence lawyers are not permitted to attend or observe the interviews.
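The 23-day figure follows from adding the statutory periods described above. The short Python sketch below is only an illustration of that arithmetic: the constant and function names are our own assumptions for this example, not terms from the Code of Criminal Procedure, and the durations are taken directly from the description in the text.

```python
from datetime import timedelta

# Durations as described in the text (CCP 1948). All names are illustrative.
POLICE_REFERRAL_LIMIT = timedelta(hours=48)     # arrest to referral to a prosecutor
PROSECUTOR_REQUEST_LIMIT = timedelta(hours=24)  # referral to detention request
INITIAL_DETENTION = timedelta(days=10)          # first judge-approved period
EXTENSION = timedelta(days=10)                  # optional additional period


def max_detention(with_extension: bool = True) -> timedelta:
    """Return the maximum time a suspect may be held for questioning."""
    arrest_phase = POLICE_REFERRAL_LIMIT + PROSECUTOR_REQUEST_LIMIT  # 72 hours
    total = arrest_phase + INITIAL_DETENTION
    if with_extension:
        total += EXTENSION
    return total


print(max_detention().days)                      # 23
print(max_detention(with_extension=False).days)  # 13
```

Under these assumptions, the 72-hour arrest phase accounts for 3 of the 23 days, and the two ten-day detention periods account for the remaining 20.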
The typical approach to questioning involves allowing a suspect to first make a full statement, which may contain information that contradicts evidence held by the police. The police officer (and/or prosecutor) may later challenge this statement by presenting the contradictory evidence. This self-incrimination strategy is similar to the Strategic Use of Evidence (SUE) technique. In the SUE technique, the interviewer first asks the suspect to provide a free recall of the event(s) in question. The interviewer then asks follow-up questions that address the evidence held against the suspect, but does not reveal this evidence until later (Hartwig et al., 2014). Research has found that truth-tellers tend to mention details related to the evidence in their free recall, whereas liars generally avoid doing so (Hartwig et al., 2005). Furthermore, the late disclosure of evidence by interviewers nearly doubles the likelihood that guilty suspects will make contradictory statements (Hartwig et al., 2014). Walsh & Bull (2015) also found that late (and indeed gradual) disclosure of evidence by investigators tended to prompt more detailed accounts from suspects, whereas early disclosure was significantly less effective in achieving this outcome. SUE techniques are also taught during the Japanese training (Wachi, 2021).
In Japan, there are several regulations in place to protect the rights of suspects during an interview. For example, police and prosecutors are not allowed to present false evidence for the purpose of eliciting a confession (Wachi et al., 2014). Furthermore, suspects have the right to remain silent, though interviewers may continue to question a silent suspect (Wachi et al., 2014), and silence may be interpreted as inculpatory in court (Ito, 2012). According to the Japanese code of criminal procedure, a suspect cannot be convicted of a crime if a confession is the only incriminating evidence (Ito, 2012; Wachi et al., 2014). As previously mentioned, the recording of interviews has been mandatory for cases that are to be tried by a combination of lay judges (Saiban-In) and professional judges since June 2019. Prior to this, interviews were recorded in some cases (e.g. when suspects denied allegations, remained silent or confessed, and for suspects with intellectual disabilities, developmental disabilities and mental disorders). However, despite the introduction of extended recording requirements, the overall number of interviews which were recorded as of 2021 remains low (only 1.4% of all cases; Japanese Ministry of Justice, 2022). The reason for this is simply that the cases for which recording is mandatory remain very limited (e.g. those tried by Saiban-In, those involving mentally disordered suspects).
Much of the recent research on police interviewing of suspects in Japan has been carried out by academics at the National Research Institute of Police Science in Japan (NRIPS). In one study, Wachi et al. (2014) surveyed 276 Japanese police officers about the techniques they used to interview suspects. The various interview techniques reported by participants yielded five factors: presentation of evidence, confrontation, rapport building, active listening, and discussion of the crime. The researchers identified four styles of interviewing based on different combinations of these factors: evidence-focused, confrontational, relationship-focused, and undifferentiated. Full confessions and disclosure of new information were most likely to occur when interviewers used the relationship-focused style, even in cases when the suspect had initially been reluctant to confess. An evidence-focused style was associated with partial confessions; that is, interviewees confessed only to aspects of the crime for which the interviewer presented incriminating evidence. The confrontational technique was least effective, often leading suspects to become evasive (Wachi et al., 2014). In a follow-up study, the researchers investigated prisoners’ experiences of investigative interviews (Wachi & Watanabe, 2016). They found that participants who had decided to deny allegations or were undecided about confessing prior to the interview were more likely to confess when questioned using the relationship-focused style, or an approach that combined aspects of the various interview styles (undifferentiated-high; Wachi & Watanabe, 2016). These results indicate that an approach characterized by active listening and rapport building, rather than one based strictly on the presentation of evidence or confrontation, has proven most successful in eliciting new information and full confessions from Japanese suspects.
What we know from the literature
Approaches to interviewing suspects generally fall into one of two categories: information gathering or accusatorial (Meissner et al., 2012). In the accusatorial approach, the suspect’s guilt is assumed, and the interviewer’s main purpose is to extract a confession. To achieve this end, the interviewer may use psychologically manipulative and confrontational techniques such as minimization and maximization. Minimization involves tactics such as downplaying the seriousness of the crime and its consequences, attempting to gain the suspect’s trust by behaving sympathetically, offering moral justification for the crime, and implying that the suspect will be treated with leniency should they choose to confess (Kassin, 2015; Madon et al., 2013; Meissner et al., 2014). By contrast, maximization involves intimidation of the interviewee, often by playing up (‘maximizing’) the seriousness of accusations and their potential consequences (Kelly et al., 2013). A growing body of research evidence shows that accusatorial tactics increase the likelihood of eliciting both true confessions from guilty suspects and false confessions from innocent ones (Kassin et al., 2010; Meissner et al., 2015).
The main goal of the information gathering approach is to extract accurate and reliable information through the use of open-ended questions and rapport building. Deception or psychological manipulation of the interviewee is thus inconsistent with this approach. There is overwhelming evidence that information-gathering approaches reduce the likelihood of false confessions, increase the elicitation of accurate information, and can even facilitate cooperation by interviewees who are initially resistant (Evans et al., 2013; Meissner et al., 2014). Given the recent adoption of an information gathering approach in Japan, we will focus on research relating to this approach (but see Vrij et al., 2017 and Meissner et al., 2014 for more information on the accusatorial approach and its risks to obtaining justice).
Many Western countries that use the information gathering approach to investigative interviews have developed techniques based on the PEACE model. Following this model, interviewers first plan for the interview. In the “Engage and Explain” phase, they inform the interviewee of the purpose of the interview and attempt to establish rapport with them. The interviewer then moves on to the most substantial phase, in which they try to elicit a complete and accurate account from the interviewee. This may be done using an approach known as Conversation Management (CM; Shepherd, 2007), or the Cognitive Interview (CI; Fisher & Geiselman, 1992; Geiselman et al., 1984). Following this, the interview is brought to a close and interviewers are meant to evaluate the interview to determine if further enquiries are appropriate (as well as assessing their own performance in that interview).
PEACE, as noted, discourages the use of manipulative or deceptive tactics to obtain a confession and encourages interviewers not to assume that the suspect is guilty. Such an assumption may lead interviewers to seek out and emphasize information that confirms the suspect’s guilt while ignoring potentially exculpatory evidence, a phenomenon known as confirmation bias. Soukara et al. (2009) analyzed recordings of 80 suspect interviews conducted by PEACE trained police officers in the UK. They found that these officers generally used an ethical approach to interviewing, and almost never used manipulative or coercive tactics such as minimization or maximization (see also D. W. Walsh & Bull, 2012b, 2015). However, similar studies have also identified areas for improvement. Clarke et al. (2011) analyzed recordings of 174 interviews with suspects conducted by six different police forces across England and Wales. They found that police performed poorly in the ‘engage and explain’ and ‘closure’ stages of the interview and did not develop adequate rapport with suspects (similar to the findings of D. W. Walsh & Bull, 2010; D. W. Walsh & Milne, 2008). The authors concluded that PEACE training seemed to improve interviewers’ “soft skills” such as self-confidence and communication skills, and their focus on explaining procedural issues to the suspect (Griffiths & Milne, 2006). However, the police interviewers needed to further develop complex skills for processing information quickly, making linkages, and challenging inconsistencies in the suspect’s account.
In a similar study, Walsh & Bull (2010) found that interviewers who demonstrated good skills in all stages of the PEACE model elicited more complete accounts and were more likely to elicit full confessions from suspects than less skilled interviewers. However, many interviewers lacked assertiveness and engaged in little preparation and planning, leading the researchers to suggest that re-training is necessary for such interviewers (D. W. Walsh & Bull, 2010). In sum, while the implementation of national PEACE training for police in the UK has had positive effects on investigative interviewing practices, more extensive training may be needed for thorough and lasting improvements at an individual level, coupled with a strategy (such as good supervision) that ensures skills learned in the classroom are transferred to, and sustained in, the field.
Developing interviewers’ rapport building skills through additional training may be especially important. Rapport building has come to be perceived as essential to a successful investigative interview (Vallano & Schreiber Compo, 2015; D. W. Walsh & Bull, 2012a). Rapport has been defined as mutual attentiveness, positivity, and coordination within a dyadic interaction (Tickle-Degnen & Rosenthal, 1990). While this definition may aptly describe rapport in the context of a cooperative interview (such as with a victim or witness), Vallano and Schreiber Compo (2015) note that rapport in suspect interviews is more likely to be characterized by conformity and accord. Rapport building is a component of the Engage and Explain stage of the PEACE model, and features in other interview protocols (see section on interviewing victims and witnesses in this paper). Many rapport-building techniques have been identified, such as using the interviewee’s name, engaging in self-disclosure, displaying active listening, and showing kindness, respect, and concern for the interviewee (see Fisher & Geiselman, 1992; Kelly et al., 2013; and D. W. Walsh & Bull, 2012a for more). The efficacy of these techniques is likely to differ based on the characteristics of the interviewee and the situation (Vallano & Schreiber Compo, 2015).
Research that specifically examines how rapport affects interview outcomes is limited. However, one study found that rapport increased witnesses’ recall accuracy in response to open questions (Collins et al., 2002). The results of another study, conducted in Japan, showed that interviewers who built rapport with interviewees elicited more information than interviewers in a no-rapport control condition (Yamamoto et al., 2016). Further, Wachi et al. (2018) used the modified cheating paradigm in Japan and found that mock suspects were more likely to confess when their interviewers included a rapport building phase (such as having a chat with the interviewer). Similarly, Walsh and Bull (2012a) found that interviewers who were skilled at maintaining rapport during the ‘Account’ phase of PEACE elicited more details and were more likely to elicit full confessions from suspects. However, they also found that interviewers often missed opportunities to build rapport in the initial stages of an interview, and opportunities to maintain rapport at later stages. Moreover, interviewers who established a satisfactory level of rapport in the ‘Engage and Explain’ phase did not necessarily maintain this level in the ‘Account’ phase. These results demonstrate that rapport is dynamic and requires maintenance throughout an interview (Abbe & Brandon, 2013; D. W. Walsh & Bull, 2012a). Furthermore, Izotovas et al. (2021) found that interviews conducted by the police in England and Wales commenced satisfactorily, with officers attempting to develop rapport, but (in the face of suspects’ non-cooperation) moved increasingly towards a more confrontational approach, replete with a questioning strategy that yielded less information. Such attempts to overcome these suspects’ resistance were always found to be ineffective.
Observational studies of interviewing practices, such as those we have just discussed, are possible because recording interviews has become commonplace in several Western countries. In places where recording interviews has been met with resistance (such as in some U.S. states), those against adopting the practice argue that it might influence the freedom with which interviewees divulge information. However, research has shown that this is not the case. One study found that suspects who were informed that they were being recorded spoke as often, and were just as likely to make admissions and confessions, as suspects who were not informed (Kassin et al., 2019). Similarly, Wachi et al. (2018) found that the presence of a camera had no effect on mock suspects’ confessions. In another study, informing interviewers that interviews were being recorded led them to use fewer minimization and maximization tactics. Furthermore, interviewees (who were oblivious to the recording) perceived that recording-aware interviewers made less of an effort to elicit a confession (Kassin et al., 2014). Thus, recording increases the accountability of an interviewer, with no apparent disadvantages for information elicitation. As previously mentioned, the Japanese National Police now fully record interviews for cases to be tried by Saiban-In from the moment a suspect enters the room, in line with recommendations from the Western literature. However, the number of cases for which recording is mandatory remains low, and the scope of this requirement should be extended to cover all interviews.
Future directions
Prior to recent developments, confessions were considered the most important pieces of evidence in Japanese criminal trials (Wachi & Watanabe, 2016). This may have led Japanese police and prosecutors to place an emphasis on seeking confirmation of allegations, which is a hallmark of the accusatorial approach (Meissner et al., 2012). The psychologically coercive tactics associated with this approach have been shown to increase the chances of eliciting a false confession (Meissner et al., 2010, 2014). It is therefore very positive that the Japanese National Police training program encourages an information-gathering approach. Further research is needed to examine whether the emphasis on securing a confession has decreased since the introduction of the NPA’s training programme. Wachi and colleagues (Wachi et al., 2014; Wachi & Watanabe, 2016) have made strides in this area, but their work is based on self-report data. As these researchers rightly note, observational research on police interviews in Japan is lacking, and is much needed. The relatively new practice of recording suspect interviews, though limited in scope, might offer an opportunity for the coding and empirical observation of interviewer practice.
Whether this change has had any positive impact on interviewing practice remains unknown (given the stated absence of observational studies of the police in Japan). However, we do know that training in information gathering strategies, together with both the electronic recording of interviews and laws prohibiting unethical practices (such as lying to suspects) have been found important in other jurisdictions around the world in moving interviews from a confession orientation towards an information gathering one. In so doing the number of miscarriages of justice due to poor police interviewing has decreased dramatically (see Poyser et al., 2018).
However, this change has tended to be achieved when combined with two other significant factors. First, such interviews are conducted within a wider criminal justice process that does not emphasise (or even rely on) confessions as a means of resolving cases. Second, such a change is more likely to succeed when the most senior personnel in police forces do not tolerate confession-focused approaches and instead endorse and champion information gathering techniques. That is, the predominant culture does not view police interrogations of suspects as confession-seeking opportunities.
Another issue that merits examination is the length of time suspects are interviewed and spend in detention in Japan. Western evidence suggests that prolonged detention and repeated questioning can cause suspects to feel isolated and stressed, and they may falsely confess as a means of hastening its conclusion (Kassin et al., 2010). By law, suspects can be held under arrest for 72 hours and further detained for up to 20 days, during which as many interviews may occur as the investigators see fit. Suspects have a right to see a defence lawyer while in detention (Art. 39, CCP 1948), but they have no right to a phone call. Data from 2021 provide a glimpse of detention lengths in Japan following the 72 hours of arrest (Public Prosecutors Office, 2022). In total, 83,841 suspects were detained in 2021. Of these, 1.44% were detained for 5 days or less and 30.94% for 10 days or less; 4.61% were detained for 15 days or less, while the majority (62.90%) were detained for 20 days or less, that is, up to the maximum period after an extension of 10 days had been granted. The data also suggest that there is a 99.67% likelihood that an extension request is granted. A survey conducted by the Japanese National Police in 2011 examined interview statistics (NPA, 2011). For common crimes, suspects were interviewed 10.1 times for a total of 15 hours and 15 minutes on average. The interviews occurred over 5.7 days on average. The average interview time per day was 2 hours 41 minutes, split over an average of 1.8 interviews with an individual average length of 1 hour 31 minutes. For serious crimes, suspects were interviewed 41 times for a total of 65 hours and 31 minutes on average. The interviews occurred over 17.6 days on average. The average interview time per day was 3 hours 43 minutes, split over an average of 2.3 interviews with an individual average length of 1 hour 36 minutes (NPA, 2011).
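The per-day and per-interview figures reported above are mutually consistent with the reported totals, which can be checked with a few lines of Python. This is a minimal sketch for illustration only: the helper names are our own, the inputs are the averages quoted from NPA (2011), and dividing one average by another is a consistency check rather than a reconstruction of how the survey computed its figures.

```python
def to_minutes(hours: int, minutes: int) -> int:
    """Convert an hours-and-minutes duration to minutes."""
    return hours * 60 + minutes


def fmt(total_minutes: float) -> str:
    """Format a duration in minutes as 'H h MM min', rounded to the minute."""
    m = round(total_minutes)
    return f"{m // 60} h {m % 60:02d} min"


def summarise(n_interviews: float, total_minutes: int, n_days: float) -> dict:
    """Derive per-interview and per-day averages from the reported totals."""
    return {
        "per_interview": fmt(total_minutes / n_interviews),
        "per_day": fmt(total_minutes / n_days),
        "interviews_per_day": round(n_interviews / n_days, 1),
    }


# Averages quoted in the text (NPA, 2011).
common = summarise(10.1, to_minutes(15, 15), 5.7)
serious = summarise(41, to_minutes(65, 31), 17.6)

print(common)   # per_interview 1 h 31 min, per_day 2 h 41 min, 1.8 per day
print(serious)  # per_interview 1 h 36 min, per_day 3 h 43 min, 2.3 per day
```

The derived values match the averages reported in the survey for both common and serious crimes.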
It is noteworthy that the number of interview days does not imply consecutive interview days, as the days could be spread in any manner over the maximally allowed 23 days. By contrast, in the UK, individuals suspected of a serious crime may be held in police custody for a maximum of 4 days (96 hours) before being either formally charged or released (14 days for those arrested under the Terrorism Act; PACE, s. 41-44). Researchers in the UK, Canada, and the U.S. have reported averages of 23, 33, and 100 minutes per interview respectively (C. Clarke et al., 2011; Kassin et al., 2007; Snook et al., 2012). Western research has shown that interviews ranging from 6 to 24 hours in length are associated with false confessions (Drizin & Leo, 2004). Given concerns about extended interviews and detention time raised in Western research contexts, future research should evaluate the impact of these variables in Japanese contexts.
Finally, rapport-building techniques that are supported by findings from the Western literature may not necessarily be the optimal techniques in the Japanese cultural context. Research is needed to establish which approaches to rapport building are most effective with Japanese interviewees. Future research should also examine the role of Japanese public prosecutors in the interview process. A public prosecutor’s approach may differ from that of a police officer, and their collaborative efforts during questioning may have as yet unexplored effects on a suspect’s willingness to disclose information.
Japanese practice - interviewing victims & witnesses
As is the case with suspects, victims and witnesses are first interviewed by the police. Prosecutors may then also conduct an interview to assess the credibility of the victim’s or witness’s statement. Interviews with victims and witnesses are only recorded in certain cases. For example, interviews with minors are always recorded. Other cases in which interviews are recorded include those in which it is deemed the recording may later serve as important evidence. For example, interviews with victims of alleged domestic violence or sexual offences are often recorded in case the victim later recants their statement (perhaps as the result of intimidation by the suspect).
Most of the recent developments in witness and victim interviewing in Japan have been focused on the interviewing of children. Cases of child abuse are usually first identified by social workers and psychologists at Child Guidance Centers (CGCs). Children must then be interviewed by a police officer and later, a prosecutor, who can decide whether the case should go to court. In Japan, statements given outside of court are generally not admissible as evidence (hearsay rule; Tanaka, 1952). Only the prosecutors are exempt, in that they may present a written statement they have overseen as admissible evidence in court. It is therefore necessary that the police and prosecutors interview children alleging abuse. Recently, efforts have been made to limit the number of interviews a child must give by conducting joint multidisciplinary interviews involving the police, prosecutors, and psychologists or social workers if necessary (Naka, 2016). These joint interviews are an important step towards making the interview process less protracted and traumatic for child victims.
In 2011, a training program for interviewing children was developed by Japanese academics for social workers and other professionals (Naka, 2016). The program is based on the National Institute of Child Health and Human Development (NICHD) protocol (Lamb et al., 2008). The NICHD protocol was designed specifically for use with child interviewees and is based on decades of developmental and cognitive psychological research on child memory. The protocol encourages the use of open-ended questions, setting ground rules (e.g. encouraging the child to tell the truth), rapport building, training episodic memory, and asking the child to indicate when they do not know something. The NICHD protocol has been rigorously tested, has been used in several countries (including Japan), and has proven an effective tool for eliciting accurate and detailed accounts from children (La Rooy et al., 2015; Lamb et al., 2007). In addition to the NICHD-based training, the NPO provides training for interviewing children called Child First Japan. This training takes place over 5 days (40 hours) and is usually given to groups of 20 professionals. These professionals may include police officers, prosecutors, lawyers, medical personnel, child welfare officers and child psychologists.
What we know from the literature
Investigative interviews are a crucial component of criminal investigations (Milne & Bull, 1999). A successful investigative interview is one that elicits the disclosure of enough information to successfully prosecute offenders, or protect the innocent (Dando & Oxburgh, 2016). Several interview protocols exist to advance this aim. Among these, the Cognitive Interview (CI) has received considerable empirical support (Fisher & Geiselman, 1992; Memon et al., 2010). The Cognitive Interview was developed by Geiselman and colleagues in 1984, and revised by Fisher and Geiselman in 1992 (Enhanced Cognitive Interview, or ECI) to include additional instructions for effective communication between interviewer and witness (e.g. rapport building). At the core of the CI are four mnemonic instructions: Mental Reinstatement of Context (MRC), Report Everything (RE), Change Perspectives (CP), and Reverse Order (RO). The ECI begins with a rapport-building phase, during which the interviewer engages the interviewee in neutral topics to establish comfortable relations and facilitate communication. The interviewer then moves into the interview phase, applying mnemonic techniques from the ECI toolbox as deemed necessary. The interviewee-centered nature of the ECI and its emphasis on rapport building, among other considerate behaviors, adds a distinctly humane element to the approach. The instructions encourage interviewers to ask primarily open-ended questions, not interrupt the witness, instruct the witness not to guess, use non-verbal affirmations to support the witness, and keep note taking to a minimum. Not only do these procedures facilitate recall, they may also have a therapeutic effect on the interviewees’ well-being (Fisher & Geiselman, 2010). The most recent meta-analysis of studies comparing the ECI to standard approaches showed that the former produced a large and significant increase in correct recall with only a slight increase in incorrect recall (Memon et al., 2010). The benefits of the ECI decreased but remained significant as retention interval increased.
Though research demonstrates that the ECI is an effective interview protocol, it is not without its limitations. Practitioners are often reluctant to use ECI techniques due to a lack of appropriate training or time constraints when working in the field. When conducted appropriately, an ECI can be lengthy. When there are multiple witnesses to a crime, limited police resources make lengthy interviews impractical. One study found no evidence of the ECI having been used in 83% of British interviews examined (C. Clarke & Milne, 2001). Dando, Wilcock, and Milne (2008) reported that police officers felt that most of the interviews they conducted were for less serious crimes and did not warrant the complex and time-consuming ECI procedure. Fortunately, psychologists have been working to adapt the ECI toolkit, and recent work has attempted to address some of the issues that have arisen in the field. Davis, McMahon, and Greenwood (2005) developed a modified version of the ECI. This Modified Cognitive Interview, or ‘MCI,’ includes only the ECI mnemonics that have received the most empirical support in terms of efficacy, namely Mental Reinstatement of Context and Report Everything. Milne and Bull (2002) found that a combination of the MRC and RE instructions resulted in increased recall for adults and children when compared to individual use of the Change Perspectives and Reverse Order mnemonics. Additionally, CP and RO have been rated by police as the least useful and most rarely used of the ECI techniques (Kebbell et al., 1999; Wheatcroft et al., 2013). Davis et al. (2005) found that the MCI elicited approximately 87% of the information gleaned from an ECI, with a time saving of 23%.
Hope, Mullis, and Gabbert (2013) expanded on the MRC and RO mnemonics of the CI to develop a timeline reporting procedure. In the procedure, participants are given a cardboard timeline along with person description and action cards. They are then instructed to report their recollection of a witnessed event in the correct temporal order, indicating clearly which actions are related to which perpetrator. Hope et al. (2013) found that participants using the timeline technique made fewer sequencing errors and provided significantly more correct details about the actions of individual perpetrators in a multi-perpetrator crime than participants in control conditions. The timeline technique also facilitates the reporting of verbatim details overheard from conversations (Hope et al., 2019). This technique seems to be a promising approach for eyewitness interviewing in multi-perpetrator crimes and investigations where the recall of conversations is particularly critical.
To address the issue of time constraints when applying the ECI in practice, Dando et al. (2009) attempted to simplify the MRC instruction by developing a Sketch Plan Mental Reinstatement of Context (Sketch-MRC). In the Sketch-MRC technique, interviewees are asked to draw everything they can recall from an event in as much detail as possible, while describing each detail drawn to the interviewer. Dando et al. (2009) found that participants interviewed with a Sketch-MRC instruction provided as much accurate information as those interviewed with the MRC instruction, without an increase in incorrect recall. Furthermore, participants interviewed with the Sketch-MRC were less prone to confabulation. Participants interviewed using the Sketch-MRC and MRC instructions both outperformed those interviewed with no MRC instructions. The Sketch-MRC is also beneficial when used in interviews with typically developing children and children on the autism spectrum, as well as the elderly (Dando et al., 2020; Mattison et al., 2018). Replacing the MRC instruction with the Sketch-MRC makes the ECI slightly shorter, which may be beneficial in interviews conducted under time pressure (Dando et al., 2009).
Another development that addresses time and other resource constraints associated with having to interview multiple witnesses/victims is the Self-Administered Interview (SAI). The SAI is a self-administered paper booklet form of the ECI that witnesses/victims can complete at the scene of the crime (Gabbert et al., 2009). The SAI facilitates immediate reporting of a crime without demanding much of police resources. It is therefore thought to be particularly useful where there are many witnesses, as it captures their first account before co-witness influence and/or time delays can compromise their initial memory of the incident (Gabbert et al., 2012). Laboratory studies have found that participants interviewed with the SAI provide more accurate details, remember more information after significant time delays, and are more resistant to misinformation and misleading questions than participants who are not interviewed with the SAI (see Hope et al., 2011 for a review).
Matsuo and Miura (2017) tested a Japanese translation of the SAI with witnesses to a mock crime. They found that witnesses who completed a full version of the SAI (which includes a sketch component) reported more correct details than those who were interviewed with the ECI or free recall instructions. Furthermore, witnesses who completed the SAI immediately also reported more correct details than those who completed the ECI after a delay. Matsuo and Miura (2017) therefore concluded that the SAI can be very useful in time critical situations. They also suggested that the SAI may be beneficial in situations where a victim or witness does not speak the interviewer’s language, and translators are not immediately available to aid in an ECI. This scenario is not unlikely, as it has been noted that there is a shortage of trained interpreters in Japan (Habuchi, 2016).
The SAI may also prove beneficial for facilitating the increasing number of interpreter-assisted interviews. Research has found that interviewees who speak with the help of an interpreter tend to disclose fewer details (Ewens et al., 2016). In addition, interpreters can disrupt an interviewer’s verbal strategies, such as by replacing neutral words with suggestive or accusatory words (or vice versa) and omitting intentional pauses (Heydon & Lai, 2013; Lai & Mulayim, 2014). The effects of such alterations are not yet fully understood. Lai and Mulayim (2014) have recommended that interpreters should receive training and become specialized if they are to interpret investigative interviews. Validated translations of the SAI may therefore prove useful for eliciting either a primary or secondary comparative account in investigations involving interpreters.
Adaptations of the ECI have successfully been used to increase the reporting of accurate details by children and the elderly (Hayes & Delamothe, 1997; Holliday, 2003; Prescott et al., 2011; Wright & Holliday, 2007). The ECI has also proven highly effective when used to interview adults and children with intellectual disabilities (J. Clarke et al., 2013; Gentle et al., 2013). However, it appears to offer no distinct advantages for interviewing individuals with autism (K. L. Maras & Bowler, 2010, 2012). Furthermore, recent work has shown that individuals with mental health conditions report more accurate details when they are interviewed using closed questions, as opposed to the open questions emphasized by the ECI (Farrugia & Gabbert, 2020). More research is needed in both the UK and Japan to determine whether alternative approaches may be beneficial for use with vulnerable victims, witnesses, and suspects, as well as those on the autism spectrum.
The identification of individuals who may need additional support during an investigative interview can often be difficult. Frontline police officers often operate under time and other situational constraints that may further impede their ability to pick up on cues that an interviewee is struggling to understand or interact with them. In the UK, Ali and Galloway (2016) developed the RAPID assessment, a quick and simple measure for identifying individuals with intellectual disability. Similarly, the NRIPS, in collaboration with the National Center of Neurology and Psychiatry (NCNP), has developed N2-FAST, a brief screening tool that can be used to identify interviewees with intellectual disabilities (Watanabe et al., 2016). However, the N2-FAST training has yet to be delivered to police officers in Japan. Furthermore, the N2-FAST and RAPID are not nationally mandated in the UK or Japan.
It is important to consider that traumatized victims are also vulnerable during interviews. Castelfranc-Allen (2015) developed a promising approach to interviewing such victims. This two-stage approach, known as Visual Communication Desensitization (VCD), involves first asking interviewees to rate the level of distress they experience across a narrative graph of the incident. The interviewer then plans the interview in a way that allows them to ease into the more distressing aspects of the event. Throughout the interview, the interviewer maintains sensitivity to the interviewee’s distress level and asks only open questions to elicit information. The second stage of the VCD procedure involves a therapeutic component, based on cognitive-behavioral therapy (CBT). CBT techniques, such as systematic desensitization, are used to ease the interviewee’s distress when recalling traumatic aspects of an event. VCD is intended to facilitate accurate and detailed recall of a traumatic event by easing an interviewee’s distress. An initial test of the VCD approach has shown positive results (Castelfranc-Allen & Hope, 2018).
In addition to the new interview formats and variations on existing ECI mnemonics we have discussed so far, several new retrieval support techniques have been developed to add to the ECI toolbox. One such technique involves asking interviewees to close their eyes. Eye closure has been found to increase memory for visual and auditory details reported in either free or cued recall, with no concomitant increase in incorrect details (Perfect et al., 2008).
Vredeveldt et al. (2011) found that eye closure facilitates enhanced recall by decreasing cognitive load, thereby freeing resources that would have been otherwise engaged in monitoring the environment. Eye closure is a simple technique that does not necessitate any special training on the part of the interviewer and confers mnemonic benefits. Another recently developed mnemonic is category clustering recall (CCR, Paulo et al., 2016). With CCR, interviewees are asked to organize their recall into crime-relevant categories (e.g. person, object, action, sound, conversation and location details). Initial research suggests that CCR increases interviewees’ recall of accurate information both as an independent strategy (cf. free recall) and in combination with the ECI (Paulo et al., 2017; Thorley, 2018).
Finally, Wheeler and Gabbert (2017) proposed a retrieval support mnemonic known as the self-generated cues (SGC) technique. SGC involves asking participants to list the six most salient details of an event and to then focus on each of these in turn. These salient details, or self-generated cues, are meant to prime the recall of less salient but associated information through the process of spreading activation (Anderson, 1983). Kontogianni et al. (2018) tested the efficacy of the SGC technique in combination with the Timeline technique. They found that participants who used SGCs reported more correct details in a full attention condition than participants who did not use SGCs, without an increase in incorrect details. While these new mnemonics seem promising, further research is needed to support their utility, particularly with special populations such as the vulnerable.
Future directions
The Enhanced Cognitive Interview (ECI) has withstood rigorous empirical tests and continues to be adapted and expanded upon through exciting and promising new developments. However, the merits of the ECI may not translate to its application in the field due to the time required to both train and deploy the technique. It is therefore imperative that researchers attend to the training and practical needs of investigative interviewers in order to adapt this tool accordingly. Furthermore, newly developed mnemonics and approaches to identifying and interviewing vulnerable witnesses should be validated with Japanese interviewees. Similarly, the long-term efficacy of the National Institute of Child Health and Development (NICHD)-based training delivered to an increasing variety of practitioners in Japan should also be assessed, and periodically re-assessed.
Many of the avenues that need to be explored when it comes to interviewing witnesses and victims also apply to interviews conducted with suspects. These include cultural considerations related to developing rapport with Japanese suspects, victims, and witnesses. Moreover, there are unique aspects of Japanese society that should influence future investigative interviewing research. For example, the United Nations World Population Prospects (2019) indicates that 28% of people in Japan are aged 65 and over, a percentage that is projected to increase to 38% by 2050. It is therefore necessary that Japanese psychology and law researchers consider older participant samples in their work on investigative interviewing (see Yokota et al., 2019 for one such study). A recent study also predicted that lower birth rates in the coming decades will result in Japan having a population decline of over 50% between 2017 and 2100 (Vollset et al., 2020). Therefore Japan, like many countries facing similar declines, will likely need to attract a high number of migrants to sustain the economy. As such, there should be an increase in research examining the dynamics of interpreter-assisted interviews with foreigners in Japan. Such work is necessary to ensure equal access to justice for all in a rapidly changing society.
Finally, it remains unknown whether the police or public prosecutors are sufficiently skilled or knowledgeable to detect developmental disorders, such as autism, in their interviewees. Even where such vulnerabilities are identified, research is lacking on what constitutes good practice when interviewing such people (but see K. Maras et al., 2020), both in understanding interviewees’ behaviours and in how to interview them effectively (see D. Walsh et al., in press).
Concealed Crime Knowledge detection
Japanese practice
A feature unique to the police investigation in Japan is the systematic use of concealed knowledge detection tools. The underlying idea is simple. The suspect is tested for factual information related to the crime that only the perpetrator can know, e.g. what weapon was used in an attack. If a memory trace is detected, the suspect must explain in the subsequent interview how s/he has become aware of this knowledge. In the absence of a plausible excuse, the best explanation is that the suspect is the perpetrator.
Today, the Japanese police use the Concealed Information Test (CIT; originally known as the Guilty Knowledge Test; Lykken, 1974), conducting over 5000 examinations a year (Hira & Furumitsu, 2002; Osugi, 2011). In the CIT, the examinee is presented with a question about the crime and possible answers in a multiple-choice format. For example, the question “What was the murder weapon?” could be paired with answers such as “gun” or “knife”. A question typically has more than five plausible answers, of which only one is correct. The answers are presented one by one while the examinee’s physiological responses are measured. In Japanese practice this typically means measures of the autonomic nervous system, such as sweating (skin conductance response) or respiration. Examinees with and without knowledge are told apart as follows. An examinee aware of the correct answer displays a uniquely elevated physiological response when recognizing it during the presentation of the alternatives. In contrast, examinees unaware of the correct answer respond equally to all answer alternatives. Hence, the elevated differential response to the correct answer is seen as an indicator of concealed knowledge¹.
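The logic of the differential response can be illustrated with a minimal sketch. It assumes one numeric response per answer alternative (the skin conductance values below are invented for illustration) and standardizes the response to the correct answer against all alternatives within the same examinee. The function name and the data are hypothetical; this is one simple way to quantify the idea, not the procedure used by Japanese examiners.

```python
from statistics import mean, pstdev


def differential_score(responses, correct):
    """Standardized response to the correct answer within one examinee.

    responses: dict mapping each answer alternative to a physiological
               response (hypothetical units, e.g. skin conductance amplitude).
    correct:   the crime-relevant alternative.
    Returns a z-score; values near 0 mean the correct answer did not stand out.
    """
    values = list(responses.values())
    spread = pstdev(values)
    if spread == 0:
        # All alternatives elicited the same response: no differential reaction.
        return 0.0
    return (responses[correct] - mean(values)) / spread


# Illustrative (invented) data for the question "What was the murder weapon?"
knowledgeable = {"gun": 0.20, "knife": 0.90, "rope": 0.25, "hammer": 0.20, "poison": 0.15}
unaware = {"gun": 0.20, "knife": 0.25, "rope": 0.20, "hammer": 0.30, "poison": 0.25}

score_knowledgeable = differential_score(knowledgeable, "knife")
score_unaware = differential_score(unaware, "knife")
```

In this invented example the knowledgeable examinee's response to "knife" lies well above their responses to the other alternatives, whereas the unaware examinee's score stays close to zero.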
The typical CIT examination is embedded within the police criminal investigation and often occurs before an individual is formally questioned as a suspect. It is requested by the lead police investigator and conducted by a forensic science specialist, in Japan referred to as a polygrapher (not to be confused with polygraphers in western countries, who usually use a version of the Control Question Test, a fundamentally different procedure, and rarely the CIT). The polygrapher then examines the crime scene and the gathered evidence. Based on this information, the polygrapher designs questions about the crime to which the perpetrator is likely to know the answer. It is important for the expert to keep memory decay in mind (see earlier discussion regarding the Self-Administered Interview): it has long been established in the research that delays of one week or more can already lead to severe reductions in memory for an event (Berman et al., 2009; Brown, 1958; Lewandowsky & Oberauer, 2009). Each question features the correct answer, extrapolated from the evidence, and typically four equally plausible incorrect answer alternatives. It is the polygrapher who determines whether the set of answers is equally plausible for the examinee (i.e. the suspect).
First, the CIT procedure is explained and the “card test” is performed. The examinee picks one of five playing cards and memorizes it, followed by a mock CIT question about which card they drew, with the possible playing cards as answer alternatives. The purpose of the card test is to familiarize the examinee with the test procedure and to use the collected data as a reference when interpreting the actual test results. The examiner can be certain that the examinee is aware of the correct answer (as opposed to the questions related to the crime), so they can determine the responsivity of the measurements for this examinee. For example, so-called “non-responders” show no discernible skin conductance response (for unrelated reasons), which is important to know before evaluating the real test data.
Then the real examination begins. Typically, each question and all associated answer alternatives are presented to the examinee first. At this point, the examinee can indicate that they do know the correct answer, in which case the question is discarded, as the only purpose of the CIT is to detect whether the knowledge is present, not why. It is up to the examinee to provide the investigator with a plausible explanation of how s/he acquired this knowledge during the subsequent interview. Each question is repeated three to five times with a varying order of answer alternatives before moving on to the next question. In the end, the polygrapher evaluates the examinee’s responses per question by interpreting the physiological data collected over all repetitions. The polygrapher typically makes a binary judgement [“The examinee did (not) know the correct answer.”] per question. Another possibility is that the examination result is judged inconclusive. Currently, no formal decision rules exist, so decisions are based on the expert’s holistic judgement.
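The repetition of a question with a varying order of answer alternatives can be sketched as follows. The sketch assumes that "varying order" means consecutive repetitions should not use an identical sequence; this constraint, the function name, and the example alternatives are illustrative assumptions rather than documented field practice.

```python
import random


def presentation_schedule(alternatives, repetitions=3, seed=None):
    """Return one shuffled order of the alternatives per repetition.

    Assumption (not from field practice): two consecutive repetitions
    never use exactly the same order. Requires at least two distinct
    alternatives, otherwise no different order exists.
    """
    if len(set(alternatives)) < 2:
        raise ValueError("need at least two distinct alternatives")
    rng = random.Random(seed)  # seeded generator for reproducible schedules
    schedule = []
    for _ in range(repetitions):
        order = list(alternatives)
        rng.shuffle(order)
        # Re-shuffle if the order repeats the previous repetition exactly.
        while schedule and order == schedule[-1]:
            rng.shuffle(order)
        schedule.append(order)
    return schedule


weapons = ["gun", "knife", "rope", "hammer", "poison"]
schedule = presentation_schedule(weapons, repetitions=4, seed=1)
for repetition, order in enumerate(schedule, start=1):
    print(repetition, order)
```

Each repetition contains every alternative exactly once, so the responses can later be pooled per alternative across repetitions.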
The CIT can also be used as an information gathering tool. This variant is referred to as the ‘searching CIT’ (sCIT) and is frequently used to aid criminal investigations in Japan (Matsuda et al., 2012, 2019). The sCIT follows the same procedure as the CIT but operates on the opposite premise. It assumes the examinee is the perpetrator and asks questions about the crime whose answers are unknown to the investigators. Therefore, the polygrapher must create a set of answers that are likely guesses, hoping that the correct answer is present in the set. Just as in a normal CIT, a knowledgeable examinee would be expected to respond differently to the correct answer, hence revealing it to the investigators. However, the sCIT’s effectiveness as an intelligence gathering tool rests on the assumption that the examinee is the perpetrator and that the correct answer is among the reasonable guesses made by the polygrapher.
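Because the correct answer is unknown in the sCIT, the inference runs in the opposite direction: the alternative that elicits the clearest response becomes the candidate answer. The following minimal sketch averages invented responses over repetitions and returns a candidate only if it stands out from the runner-up. The margin threshold, the function name, and the data are hypothetical; as noted above, no formal decision rules exist in practice.

```python
from statistics import mean


def searching_cit_candidate(trials, min_margin=0.2):
    """Pick the alternative with the largest mean response across repetitions.

    trials:     list of dicts (one per repetition) mapping each alternative
                to a physiological response in hypothetical units.
    min_margin: hypothetical minimum lead over the runner-up; below it the
                result is treated as inconclusive and None is returned.
    """
    averaged = {alt: mean(trial[alt] for trial in trials) for alt in trials[0]}
    ranked = sorted(averaged, key=averaged.get, reverse=True)
    best, runner_up = ranked[0], ranked[1]
    if averaged[best] - averaged[runner_up] < min_margin:
        return None
    return best


# Invented data for the question "Where was the item hidden?"
clear_trials = [
    {"park": 0.20, "station": 0.80, "school": 0.30},
    {"park": 0.30, "station": 0.70, "school": 0.20},
    {"park": 0.25, "station": 0.90, "school": 0.25},
]
flat_trials = [
    {"park": 0.30, "station": 0.30, "school": 0.25},
    {"park": 0.20, "station": 0.30, "school": 0.30},
]

candidate = searching_cit_candidate(clear_trials)
no_candidate = searching_cit_candidate(flat_trials)
```

A returned candidate is only a lead for the investigation: if the correct answer is not among the alternatives, or the examinee is not the perpetrator, the largest response is uninformative.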
Polygraphers in Japan learn how to conduct the CIT through a basic training course provided at the Training Center of Forensic Science, affiliated with the NRIPS (Osugi, 2011). All prospective forensic examiners, many of whom majored in psychology, take the basic training course, which typically lasts about three months. In the course, they first acquire fundamental knowledge of psychophysiology, statistics, and other related fields. They then learn the CIT principles, how to create CIT questions, and how to measure and analyze psychophysiological indices. Without the certificate awarded for completing this course, they cannot conduct the CIT in the field.
What we know from the literature
Despite being predominantly applied by the Japanese Police, the CIT has been studied in the scientific community all over the world. Studies in this field typically feature a mock crime that induces factual knowledge in half of the participants and subsequently a CIT examination is conducted to detect this knowledge. Research typically differs from Japanese practice in some details (Ogawa et al., 2015). In practice, the forensic expert repeats each question three to five times and makes a judgement per question. In contrast, empirical studies typically present a question only once but evaluate the CIT’s effectiveness over all questions. A reason why Japanese practitioners decide per question is that suspects may have forgotten some, but not all details. Hence, responses may vary over different questions and conclusions are drawn per individual question.
The diagnostic accuracy of the CIT has been established over numerous studies (Ben-Shakhar & Elaad, 2003; Meijer et al., 2014). In their meta-analysis, Meijer and colleagues (2014) summarized the discriminant ability of several psychophysiological indices and found very large effect sizes for autonomic nervous system measures (e.g. heart rate or respiration) and the P300 brain potential, making the CIT an effective technique for detecting concealed knowledge. Furthermore, Zaitsu (2016) strengthened the external validity of CIT research by demonstrating that responses on the card test do not differ between laboratory and field settings. Neither the setting (a real CIT examination at the police station as opposed to a laboratory appointment) nor the sample (real crime suspects from the general population as opposed to university undergraduate students) led to a difference on the card test. Furthermore, unlike in typical field studies, the ground truth for the card test is known, so the accuracy of the card test in both conditions can be determined without a doubt. Hence, Zaitsu (2016) reduces the concern that accuracy estimates based on laboratory CIT research do not properly reflect the complexity of a real CIT examination in practice, though further corroboration is still desired.
A strong theoretical foundation is expected of a tool, such as the CIT, that is to be applied in the field. Originally, the explanation of the CIT (Lykken, 1974) was based on the Orienting Response (OR; Sokolov, 1963). In short, the OR postulates that unusual and/or significant stimuli elicit larger physiological reactions. In the CIT, the correct answer stands out from the others and therefore elicits a larger physiological reaction only in those who know it. Another explanation for the CIT effect is the Arousal Inhibition Theory (AIT; see Verschuere et al., 2004), which essentially suggests that the observed elevated physiological responses are a by-product of countermeasures, that is, actions to mask recognition of the correct answer. Recent efforts to disambiguate memory recognition (OR) and deceptive responding (AIT) suggest that neither process operates exclusively. Rather, some indices respond to the memory recognition aspect of the CIT paradigm while others are sensitive to deceptive responding (Klein Selle et al., 2016, 2017, 2019; Verschuere et al., 2007).
Practical concerns of both test application and construction are another core research area. One consideration in test construction is the effects of memory encoding and memory decay over time. For example, as already noted a common finding is that memory for an event fades over time, especially for peripheral information. Peripheral details are present at (but not always crucial to) an event and are frequently forgotten after delays of one week or more (e.g. Nahari & Ben-Shakhar, 2011). That means this type of information should be avoided when constructing a CIT question because the perpetrator is unlikely to remember such peripheral details and hence is not distinguishable from any other (including innocent) examinee (Nahari & Ben-Shakhar, 2011; Peth et al., 2012; Seymour & Fraynt, 2009).
Other effects on encoding involve emotional arousal. Osugi and Ohira (2017) and Osugi and Ohira (2018) have both demonstrated that details encoded in a highly aroused state resulted in larger physiological responses during a CIT examination, suggesting that pieces of information experienced in states of high arousal may yield better diagnostic accuracy. Another concern is how the answers within one set affect each other. Previous studies have investigated the effect of stimulus generalization between categories and exemplars (e.g. Category: Car; Exemplar: Honda) and similarities between answer alternatives (Ben-Shakhar et al., 1995, 1996; Ben-Shakhar & Gati, 1987). The evidence suggests that generalization occurs, meaning items more similar to the correct answer elicit relatively larger responses. However, this may depend on the original encoding level (Geven et al., 2019).
Similarly, Ambach, et al. (2010) demonstrated that stimulus modality, verbal or pictorial stimulus presentation, did not affect the CIT’s diagnostic accuracy. Experiments such as these are crucial in finding the optimal parameters for test construction which are essential for valid application in practice.
Changes to the traditional procedure or alternative indices are another core research area. While these techniques vary profoundly in the paradigm and/or indices used, the core principle, that concealed knowledge is inferred through differential responses to crime facts, remains the same. As each measure and paradigm has its own wealth of empirical background, we limit our description here to an overview of the respective techniques. First, the Guilty Action Test (GAT; Gamer et al., 2010) represents a slight modification to the standard CIT procedure, shifting the focus of the question onto the examinee. For example, the question “What was stolen?” could become “What did you steal?”.
Evidence on the superiority of the GAT over the CIT is still inconclusive (Gamer, 2010). Research on the sCIT has focused on uncovering details of potential future attacks. For example, Meijer et al. (2010) were able to identify the target, location, and date of a fictional terror attack that participants were asked to memorize. Similarly, Meijer et al. (2013) demonstrated the same effect when multiple members of a fictional terror group were tested simultaneously. Next, aside from the typical measures of the autonomic nervous system, diagnostic variations of the CIT have been developed that rely on brain activity measured by functional Magnetic Resonance Imaging (fMRI; Gamer et al., 2000; Nose et al., 2008) and electroencephalography (EEG; Farwell & Donchin, 1991; Miyake et al., 1993; Rosenfeld, 2019). Research on the latter is more common and focuses on the well-researched P300 event-related potential (Osugi & Ohira, 2017, 2018) using a three-stimulus oddball paradigm, slow waves (Matsuda & Nittono, 2018), or the so-called Complex Trial Protocol (Lukács et al., 2019; Rosenfeld et al., 2017). In addition, several alternatives to physiological measures exist to detect concealed knowledge. One example is the Response Time CIT (e.g. Lukács & Ansorge, 2019; Verschuere et al., 2015), which is based on a stimulus-response incompatibility. With diagnostic accuracy similar to that of autonomic nervous system measures (Verschuere et al., 2010), it has been proposed as a viable alternative (Seymour et al., 2000) that takes less time to administer (and is thus less costly). Another novel method is the detection of concealed knowledge using eye-tracking (Lancry-Dayan et al., 2018; Peth et al., 2013; Verschuere et al., 2007). For example, Lancry-Dayan et al. (2018) were able to detect a familiar face among a selection of faces using the number and length of eye fixations. Lancry-Dayan et al. (2021) provided some further replication, but more validation is needed.
Finally, concealed knowledge can also be detected with the Forced Choice Test (FCT). The FCT can be applied to cases of feigned crime amnesia (Denney, 1996; Orthey et al., 2017) and rests on the same premise as the CIT. That is, the examinee is presented with questions about the crime in a multiple-choice format, and the test likewise concludes whether the examinee is knowledgeable about the crime. In the FCT, each question about the crime features only two possible answer alternatives, of which only one is correct, and the examinee is forced to indicate the correct one. Those without any knowledge of the crime can only guess and therefore produce total scores around chance level, while those with concealed knowledge tend to avoid correct information on purpose. This avoidance strategy is known as underperformance and is used as the criterion for concealed knowledge (Bianchini et al., 2001). A limitation of the FCT is that it requires a relatively large amount of evidence to construct a valid test. However, recent evidence suggests that valid FCTs can be created with the same amount of evidence as is typically used in a CIT examination (see Orthey et al., 2022), although further replication is needed first.
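The underperformance criterion can be made concrete with a one-sided binomial calculation: given two alternatives per question, how probable is a score this low, or lower, if the examinee were purely guessing? The sketch below is a minimal illustration; the function name and the example scores are hypothetical, and the exact criterion used in the cited studies may differ.

```python
from math import comb


def underperformance_p_value(correct, total, chance=0.5):
    """Probability of scoring `correct` or fewer out of `total` by guessing.

    chance: probability of a correct guess per question (0.5 with two
            answer alternatives). A small value indicates a score below
            what guessing alone would plausibly produce.
    """
    return sum(
        comb(total, k) * chance**k * (1 - chance) ** (total - k)
        for k in range(correct + 1)
    )


# Hypothetical examinees answering 20 two-alternative questions.
p_avoider = underperformance_p_value(correct=3, total=20)   # avoids correct answers
p_guesser = underperformance_p_value(correct=10, total=20)  # scores at chance level

print(f"3/20 correct:  p = {p_avoider:.4f}")
print(f"10/20 correct: p = {p_guesser:.4f}")
```

A score of 3 out of 20 is very unlikely under guessing, which is the pattern expected from an examinee who recognizes the correct answers and deliberately avoids them; a score of 10 out of 20 is fully consistent with guessing.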
Future directions
Despite the wealth of empirical studies, further research is needed to address challenges with CIT use. For one, an important issue for the CIT is the use of countermeasures. Countermeasures are behaviours that examinees actively deploy to distort the measurement and hide their knowledge. Accurate and detailed information on tests such as the CIT and other techniques developed in forensic psychology is readily accessible on the internet. Countermeasures are also easy to learn and effective, leading to a severe reduction in diagnostic accuracy (e.g. Lukács et al., 2016). Hence, it is crucial to understand what countermeasures are used and to improve the test design to remove this vulnerability (e.g. Millen & Hancock, 2019; Orthey et al., 2018).
Another aspect that deserves more attention is the interaction of a CIT examination with other processes in the criminal investigation. For example, little is known about how a CIT examination affects a subsequent interview by police investigators, although the results of a CIT examination can be submitted to a judge as supporting material when the police request an arrest warrant (Mitsui, 2004). On the one hand, the searching CIT could be an effective tool for investigators to extract previously unknown information. This information can be used to guide the investigators in their search for evidence and may facilitate the use of interview strategies that rely, for example, on the strategic disclosure of evidence (Granhag et al., 2016; Oleszkiewicz et al., 2017). On the other hand, it is unknown whether exposure to correct information in a multiple-choice format has negative effects for suspects without knowledge of the crime. Even though the correct answer is ‘hidden’ among incorrect alternatives, it is still presented. Consequently, a suspect may deduce which answer was correct from the line of questioning in the subsequent police interview. Furthermore, future research should address how the results of a CIT examination can be used in court. While such results are generally admissible in Japanese criminal courts, the courts have tended to take a suspicious and/or strict attitude towards their credibility (Mitsui, 2004).
The CIT is a standard procedure in Japanese criminal investigations. It serves two purposes: to confirm what knowledge a suspect has about the crime and, if the suspect is the true perpetrator, to extract information about the crime to help the investigation. It has a strong scientific foundation, has been validated in the laboratory and the field, and is a prime example of an evidence-based technique applied in police practice.
Japanese ‘lineup’ procedures (面割 - menwari)
Japanese practice
In Japan, the confirmatory identification of a possible suspect matching a witness’ person description is referred to as a ‘shashin-men-wari’ (face-comparison with photographs) procedure, analogous to the western ‘lineup’. Witnesses are presented with images of faces of individuals available to the police who possibly fit the person description and are asked to identify the perpetrator (if present) from the lineup. It is used when the suspect is not personally known to the witness and there is no objective CCTV evidence to place a suspect at a scene. Around the world, different countries regulate their creation and use of lineups in varying ways, being led either by procedural rules and guidance (for an overview see Fitzgerald et al., 2021) or by retrospective legal tests of how a face identification task was administered, as is the case in Japan. A men-wari process involves a police officer presenting a series of possible alternative, but similar, faces to a witness and asking them to identify the suspect. Anecdotally, it is known that 10-15 facial photographs are used for these comparisons. The legal tests of the quality of a men-wari process may come from seven criteria set out in a Tokyo High Court ruling of 1985. This ruling established case law precedent for checks on witnesses in criminal investigations in general. This includes a focus on aspects of a men-wari procedure where opportunities for error or bias may be increased. It is worth noting that many of these legal checks are similar to the guidance used to govern identification procedures elsewhere in the world (Fitzgerald et al., 2021). All of the checks evaluate the sincerity and integrity of the witness. This includes the overall fidelity of the procedure and whether the witness has the best intentions to help the investigation. As a primary concern, the men-wari process may not be useful if a deceptive or unhelpful person is providing responses.
First, there must have been good witnessing conditions, so that the witness could feasibly have seen the perpetrator. These environmental conditions can have important impacts on a witness’ ability to encode a good quality memory trace and will constrain the accuracy of the outcomes of the procedure, even if there are no protocol concerns.
Second, the length of time between witnessing an event and conducting men-wari must be considered. As it is well known that the accuracy of memory degrades over time, it is important that the procedure is conducted in a timely manner after the event. Once a good person description or suspect individuals have been identified, the men-wari procedure should not be delayed so as not to risk more forgetting.
Third, there must be no witness inducement or improper guidance from the men-wari administrator. There may be occasions when administrators, whether intentionally or not, may lead witnesses towards particular members of the men-wari array. This legal test of the procedure considers the impacts this administrative effect might have on the integrity of the outcome.
Fourth, there must be an abundance of pictures to choose from. Whilst not specific to the number of alternative non-suspect pictures (or ‘foils’), this check considers that the matching must have enough breadth so as not to unintentionally draw attention to the suspect.
Fifth, the men-wari administrator must make it clear that the perpetrator may or may not be in the lineup. This instruction has been shown to be important in research (see below) and is a common feature in lineup guidelines around the world (Fitzgerald et al., 2021). This ensures that the witness does not feel like they are expected to respond to one of the presented faces. There are many reasons why the perpetrator may not be part of the assembled images, and the witness should feel empowered to be able to say that no one face matches their memory.
Sixth, if a suspect is identified by men-wari, the witness must then make a second confirmation of the identification on seeing the suspect. Photographs might not be the best match to the witness’ memory, and this second confirmation allows the memory to be checked against in vivo recognition. Further, witnesses may have chosen the ‘best match’ from the men-wari options, but this may not match seeing the chosen face in person.
Finally, it is desirable that the men-wari identification is conducted by multiple independent witnesses. Where this is possible, it shows that the conclusions are not drawn from one witness’ unique perspectives and their personal risks for biases or errors.
What we know from the literature
For over 30 years, psychologists have conducted a large range of research into the quality of eyewitness memory. Driven by early findings of the poor quality of memory for high stakes events, the literature focused on ways to improve the quality of investigative procedures. Much of this work is summarized by Wixted and Wells (2017), who propose five criteria of the ‘pristine lineup’ (photograph arrays of faces which match a person description), based on the academic literature on lineup administration. Firstly, lineups should only include one suspect. This avoids overly complicated lineup tasks where a witness is trying to match multiple memory traces to one array of faces. Surveys from across Europe indicate that multiple perpetrator person identification procedures pose challenges for practitioners when they investigate crimes (Hobson et al., 2012; Tupper et al., 2019). Relatedly, identification accuracy has been found to decrease when witnesses try to identify multiple perpetrators (Bindemann et al., 2012). Keeping a lineup focused on the single task of trying to identify one suspect decreases the burden on the witness.
Secondly, the suspect should not be unusually identifiable. This can be a particular challenge to practitioners, who have the difficult task of creating lineups of foils similar enough to the suspect, based on the provided person description, but not so similar that it is difficult to identify the target from a finite pool of faces in databases. Finding the right amount of (dis)similarity between the suspect and foils is a fine line, as Fitzgerald, Oriet and Price (2015) demonstrated: lineups where non-suspect faces are highly similar to the target have an increased risk of falsely identifying an innocent member of the lineup.
A third feature of the pristine lineup is that the witness should be informed that the perpetrator may or may not be in the lineup. It is possible that the investigators have made a mistake and the perpetrator is not in the lineup. However, the finite options in the lineup, combined with the stakes of trying to do the ‘right’ thing, may lead witnesses to choose from the lineup regardless. Research has considered ways to reduce this problem. Reminding people that the suspect may not be present has been one suggestion (Malpass & Devine, 1981) and has become guidance around the world (Fitzgerald et al., 2021). Research has also found that a ‘wildcard’ option, whereby a concrete ‘not present’ option is presented among the foils, is more effective in avoiding incorrect identifications in ‘perpetrator not present’ lineups (Zajac & Karageorge, 2009). This has been found to be particularly appropriate for populations who are known to be more likely to choose a foil incorrectly in perpetrator-absent lineups (such as children, Pozzulo & Lindsay, 1998).
A fourth element of the pristine lineup is that lineups should be administered using a double-blind procedure to avoid the investigator providing any cues to the current suspect of the investigation (see Kovera & Evelo, 2017). In a double-blind procedure, the administrator of the lineup is unaware of case details or the target suspect in the lineup. This stops the administrator from implicitly or explicitly biasing the responses of the witness.
Finally, the pristine lineup also includes the requirement that witnesses should be asked to indicate their confidence in the identification (or non-identification) that they make. Wixted and Wells (2017) conducted a review of the research concerning the confidence-accuracy relationship for lineups and claim that confidence in a lineup judgement is a general indicator of accuracy, particularly where confidence is very high. These authors also state that confidence is particularly useful when all other elements of the pristine lineup are present, giving the most space for witness variability in confidence to emerge.
Some studies suggest confidence can be considered useful even in a non-pristine lineup process (Mickes et al., 2017; Wixted et al., 2018). However, the literature has also highlighted that it is important to consider how confidence is measured and tested (Juslin et al., 1996). New research suggests that it may be worth consulting individuals about their own awareness of their person memory skills, as findings show that those with lower metacognitive belief in their eyewitness abilities have the weakest association between confidence and accuracy (Saraiva et al., 2020). Overall, confidence is important additional information to be added to the evaluation of the quality of witnesses’ contributions.
These features of a pristine lineup help ensure that investigative errors are minimized and that the quality of the lineup responses is suitable for legal proceedings. Whilst it has been noted that such best practice lineups do not always occur in everyday settings (Wade et al., 2018), in the context of the Japanese legal system, many of the core elements of the pristine lineup inform the legal checks of the men-wari procedure. The research-based evidence for pristine lineups suggests that the men-wari legal checks of (i) a ‘suspect may or may not be present’ instruction; (ii) avoidance of improper guidance or inducement; and (iii) an abundance of adequate foils are all good checks to place on men-wari quality.
Outside the pristine lineup concept, the other men-wari legal checks reflect a variety of evidence-based risks to the lineup outcome. An effective way to consider the issues around lineups is to use the pioneering terminology framework of Wells (1978). This divides the psychology of lineup-like procedures into System Variables, i.e. features of identification in the control of the investigation (e.g. the elements of a pristine lineup), and Estimator Variables, i.e. influences on identification beyond the control of investigators, such as crime features and witness features. Men-wari checks on the integrity and competency of the witness and the witnessing conditions of the event reflect a consideration of these Estimator Variables.
Some of the estimator variables that have been studied are focused on traits of witnesses. Firstly, it is known that individuals generally vary in their ability to recall previously seen faces. Contemporary research has shown that there is a continuum of face recognition ability spanning from atypically superior face recollection (individuals referred to as ‘super-recognisers’, Russell et al., 2009) to those with notably poor memory for faces (‘prosopagnosia’, Behrmann & Avidan, 2005) and these can have important implications for the quality of lineup evidence. Even beyond the extreme ends of the face recognition distribution, the general population can frequently make errors in recognizing previously seen individuals (Megreya & Burton, 2008). Even in studies where participants are presented simultaneously with two different photographs of the same person and asked if they match, errors are common (Burton et al., 2010). This is the case even in professional photograph matchers, such as passport officers (White et al., 2014). In general, considering any risks to evidential quality due to issues in face recognition is an important legal check.
The legal checks described above which may be used for evaluating good witnessing conditions are also well supported by the literature. Firstly, as one might expect, the physical distance between a witness and suspect has an impact on the quality of later identification. When suspects are observed from further away, there are more incorrect identifications (Lindsay et al., 2008; Nyman et al., 2019). Another intuitive finding is that increased duration of exposure to the target person increases the likelihood of correctly choosing the suspect. In a summary of UK police cases, it was found that the witness was more likely to identify a perpetrator if they saw that person for more than a minute (Valentine et al., 2003). However, beyond the physical conditions of witnessing the event (lighting, distance, etc.), other Estimator Variables, such as the state of mind of a witness at the witnessing event, can have an impact on the lineup outcome. In general, memory encoding is poor during events of high stress (Deffenbacher et al., 2004), and when stress effects are assessed experimentally, it is found that physiological stress does decrease overall face memory accuracy (Davis et al., 2019). Further, Davis et al. (2019) found that witnesses’ confidence is also affected by the stressful experience of being a witness.
Other men-wari legal checks also include the question of the length of time between the event and the administration of the procedure. This is reliant on both estimator factors, such as when witnesses come forward, and system factors, such as the swiftness of procedure preparation. As Deffenbacher et al. (2008) show, the natural process of forgetting over time (‘memory decay’, see above) also hinders lineup identification performance, and memory performance declines further as time passes. A prompt lineup procedure after a person description is thus key to good recall, and as such the men-wari legal check of a procedure’s promptness is vital.
One area where the literature is less clear concerns the appropriate number of foils in the lineup or men-wari procedure. The legal check in the Japanese system refers to ‘an abundance’ of foils, but not to a specific number (in other countries with protocol documents, lineup size may be specified by regulation, see Fitzgerald et al., 2021). The research base does not yet contain a well-developed literature on the ideal number of foils. Researchers have largely approached the question of lineup size from a probability perspective. For example, in a 12-person lineup a witness with a poor memory trace has a 1/12 chance of selecting the perpetrator by guessing, whereas a six-person lineup increases the odds that even a witness with a poor memory of the perpetrator selects them correctly. Thus, the 12-person lineup is argued to be a fairer test of memory quality. Empirical work, such as a series of studies conducted by Kalmet (2016), found that the number of correct identifications was higher for the legal system-typical six-person lineups than for 24-person lineups. The upper limits of lineup size have been assessed by Levi (2012), who found that the likelihood of a mistaken choice was lower in 12-person lineups than in 24-person (or even 120-person) lineups. Large lineups pose significant practical challenges and there is not yet a broad empirical evidence base encouraging the use of larger lineups.
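The probability argument above can be illustrated with a minimal sketch. It assumes a witness with no usable memory who always picks exactly one face, uniformly at random, from a lineup containing a single suspect; real witnesses may also reject the lineup, so this is only the simplest guessing baseline. The function name `chance_selection_probability` is hypothetical and is not taken from the cited studies.

```python
from fractions import Fraction


def chance_selection_probability(lineup_size: int) -> Fraction:
    """Probability that a purely guessing witness picks one particular
    lineup member (e.g. the suspect), assuming a uniform random choice
    and that the witness always selects someone."""
    if lineup_size < 1:
        raise ValueError("lineup_size must be at least 1")
    return Fraction(1, lineup_size)


# Compare the lineup sizes discussed in the text.
for size in (6, 12, 24):
    p = chance_selection_probability(size)
    print(f"{size}-person lineup: chance selection = {p} (about {float(p):.3f})")
```

Under these assumptions, halving the lineup from 12 to six faces doubles the probability that a guess lands on the suspect, which is the sense in which the larger lineup is argued to be a fairer test of memory.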
Overall, the academic literature identifies many critical effects on photograph identifications which are reflected in the legal checks that can be used for men-wari procedures. There are uncontrollable features of witnesses and crime events, as well as systemic lineup administration effects, that can impact lineup identification procedures. While there is a wide and well-studied literature on lineups, many specific areas of practice need more research, not least because there are key differences between typical laboratory research and the events surrounding real crimes. Flowe et al. (2018) compared US felony cases to research studies and found that real crimes involve longer exposure to culprits, longer waits until a lineup is administered, more weapons, and more violence, and that live ‘showups’ (where lineup members are stood before witnesses, rather than presented as photos) were more common than photo array lineups. Further, there is limited research on the second confirmation element of the men-wari procedure and on how seeing the suspect in person (‘men-doshi’) rather than in a photograph might affect accuracy, confidence, and any signs of changing minds.
Future directions
As can be seen in the literature review above, many of the men-wari legal checks are similar to checks of literature-informed best practice for photograph identifications. Notably, this includes efforts to address System Variables, such as the instruction that the suspect ‘may or may not be present’ and the focus on the absence of lineup administrator guidance (in line with the literature on double-blinding of lineups), and Estimator Variables, such as evaluating the environmental and contextual conditions in which the witnessed event occurred and the witness’ integrity. Many elements of the men-wari legal checks assess consistency with what has been called the ‘pristine lineup’ (Wixted & Wells, 2017), which research suggests best allows witnesses to reflect their memory trace in their decision.
Future research on the men-wari system may look similar to other general aims of the lineup literature. In particular, given the legal check of an ‘abundance’ of foils, more work is needed to evaluate the impact of men-wari/lineup size on the likelihood of making a correct decision. Understanding how the number of foils may support or impair witnesses’ recall of a face is important for the research base.
Worldwide, there is a move towards making identification procedures more dynamic, with the use of video rather than classic photographs. For example, England and Wales are increasingly using video lineups, which are short video clips of people moving to present front and side-on views from the shoulders upwards. These procedures increase the amount of information to be observed in the identification procedure and thus might create more opportunity for idiosyncrasies in the target to emerge. That said, in a recent review, Fitzgerald et al. (2018) note that “the empirical literature provides no compelling evidence in favour of either photo or video lineups” (p. 316) and Rubínová et al. (2021) find no superior effect for live or video lineups over traditional photo lineups. With new emergent technologies enabling witnesses to see more animated, dynamic, and perhaps ‘realistic’ presentations of lineup arrays, more research is needed to demonstrate whether these new protocols are empirically more effective. This is equally the case for men-wari procedures.
Finally, it is worth noting that the literature on face memory procedures like men-wari and lineups is overwhelmingly based on Western legal system approaches. More research evaluating men-wari procedures is needed from the Japanese perspective. Whilst many of the principles appear transferable, a line of research that specifically explores Japanese processes will enhance the knowledge base and enable the legal checks to be specifically supported.
Concluding remarks
We took a closer look at the distinct features of common criminal investigation procedures such as interviewing strategies and lineup identification processes, as well as procedures unique to the Japanese criminal justice system such as the detection of concealed crime knowledge. We note that Japanese practices often closely resemble techniques that are considered best practice in the Western literature. Of course, not all best practices are currently in use, but the same is true for Western police services. Finally, we demonstrate that each field features several avenues of future research that are ideal for establishing cross-cultural collaborations.
Contributions
The introduction and conclusion are the collective work of all authors.
Conceptualization: LS, RO, JR
Section – Investigative interviewing: JR, DW, AK, RO
Section – Concealed Information Test: RO, IM
Section – Line up Identification: LS
Competing Interests
There is no conflict of interest.
Acknowledgements
This work was funded by the Research Councils UK grant ID: ES/S014055/1
Robin Orthey is an International Research Fellow of Japan Society for the Promotion of Science (Postdoctoral Fellowships for Research in Japan). ID: P21012
We thank Dr. Taeko Wachi for her helpful comments on an early draft of this manuscript.
Footnotes
It is crucial not to confuse the CIT with the typical “polygraph” applied in the West. This other test is known as the Control Question Test (CQT); it uses the same machine but has a different methodology and different implications, and is rejected by the majority of the scientific community. Here, we only address the CIT.