zhikun-deng profile photo

As a user experience designer, I am dedicated to building a bridge between advanced technology and user experience to meet user needs.

Project overview

In a partially automated driving system, the driver does not need to steer the vehicle actively but must continuously monitor road conditions and be ready to respond to unexpected events (SAE, 2018).

This project explores the impact of active and calm conversational styles on keeping users alert during partially automated driving. Its results inform design guidelines for conversational agents that protect driver safety and enhance the experience of interacting with them.

This study combines quantitative and qualitative analyses to explore the issue. It confirms the soundness and reliability of previous findings and proposes new insights, with the aim of providing more comprehensive, effective, and persuasive guidelines for designing future conversational agents in automated driving systems.

Prototype development

Define active & calm styles

The study defined two conversational styles (Active and Calm) according to the Five-Factor Model (Goldberg, 1990; McCrae & Costa Jr, 1997) and designed a prototype for each.

High extraversion means that the conversational agent will behave more socially, enthusiastically, and proactively. It can initiate frequent conversations, use a positive tone, and interact with the user through humour and encouraging language. This design maintains the user’s attention and alertness through active interaction.

High openness means the conversational agent shows creativity and flexibility in its dialogue content and is willing to try new topics and communication methods to enrich the user’s chat experience.

High agreeableness allows the conversational agent to show friendliness and empathy in communication, enhancing the emotional connection of the interaction.

High neuroticism is reflected in the agent exhibiting more emotional volatility, including richer emotional responses such as excitement and curiosity.

Low neuroticism means that the agent behaves in a calmer and more collected manner during interactions, providing stable support in complex or stressful driving situations and helping the user remain rational and focused.

High conscientiousness is demonstrated by the chatbot’s high level of attention to the dialogue structure and task completion. It can provide precise information, strictly following established dialogue rules to ensure effective and accurate communication.

Low conscientiousness means that the bot is likely to be more focused on the fluidity and entertaining nature of the dialogue interactions rather than strictly following a set dialogue structure or task, which helps to create a more relaxed communication atmosphere.

Speech text design

First, the study generated conversational content for the two styles based on the three main social characteristics of conversational agents proposed in the existing literature (Chaves & Gerosa, 2020): conversational intelligence, social intelligence, and personification. Then ChatTTS, an open-source text-to-speech platform, was used to convert the text into conversational speech. The study used ChatTTS to control factors such as timbre, mood, speaking pauses, and laughter in order to develop the different conversational styles. Finally, through a focus group, the study selected two samples to serve as the sound sources for the two conversational styles.

Testing video design

To measure participants’ reactivity, four clips from the Hazard Perception Test, a test used to assess a driver’s ability to recognise and respond to potential hazards on the road, were selected as the key events in the test video and used to assess participants’ reaction times. Combining these key events with driving videos from YouTube, the study produced a 25-minute video. Pilot tests were then run to verify the validity of the video; the Stanford Sleepiness Scale results showed that the test video could make participants feel fatigued.

Study design

This study used a between-subjects design (BSD) to explore the effects of different conversational styles on driver fatigue. Each participant interacted with only one prototype. This design avoids the learning and cumulative fatigue effects that repeated testing could trigger, thus increasing the reliability and validity of the results.

Source: https://www.scribbr.com/methodology/between-subjects-design/
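
As an illustration of the between-subjects design described above, the following minimal sketch assigns each participant to exactly one prototype. The study does not state how participants were allocated, so the random, balanced allocation, the function name `assign_between_subjects`, and the participant IDs are all assumptions made for this example.

```python
import random


def assign_between_subjects(participant_ids, conditions=("active", "calm"), seed=None):
    """Assign each participant to exactly one condition (hypothetical helper).

    Assumption: allocation is random and balanced; the study itself does not
    describe its allocation procedure.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    # Alternate through the shuffled list so group sizes differ by at most one.
    return {pid: conditions[i % len(conditions)] for i, pid in enumerate(shuffled)}


# Hypothetical participant IDs, for illustration only.
groups = assign_between_subjects([f"P{n:02d}" for n in range(1, 11)], seed=42)
print(sum(1 for c in groups.values() if c == "active"))  # 5
print(sum(1 for c in groups.values() if c == "calm"))    # 5
```

Because every participant appears once in the mapping, nobody experiences both styles, which is what rules out carry-over effects between prototypes.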

Testing process

Data analysis

Results of the final evaluation

Evaluating participant alertness

Hazard Perception Test (HPT) videos from the UK driving test were used as the experimental material. The key events in the test video are derived from the HPT, and participants’ reaction times were converted into reaction scores on a scale of 0 to 5 according to the HPT scoring criteria, with 5 being the fastest reaction and 0 a failure to react in time. Each participant’s level of alertness was assessed as the average of their reaction scores across the four key events.
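
The scoring described above can be sketched as follows. This is a minimal illustration rather than the study's actual procedure: it assumes each key event has a scoring window split into five equal bands, with the earliest band worth 5 points, and the window times and reaction times shown are hypothetical.

```python
from statistics import mean


def hpt_score(reaction_s, window_start_s, window_end_s):
    """Convert a reaction time into a 0-5 score for one key event.

    Assumption: the scoring window is divided into five equal bands and the
    earliest band scores 5. No reaction (None) or a reaction outside the
    window scores 0.
    """
    if reaction_s is None or not (window_start_s <= reaction_s < window_end_s):
        return 0
    band_width = (window_end_s - window_start_s) / 5
    band = min(int((reaction_s - window_start_s) // band_width), 4)
    return 5 - band


def alertness_score(events):
    """Average the reaction scores across a participant's key events."""
    return mean([hpt_score(*event) for event in events])


# Hypothetical data: (reaction time, window start, window end) in seconds.
events = [
    (10.4, 10.0, 15.0),  # early reaction -> 5
    (12.5, 10.0, 15.0),  # mid-window     -> 3
    (None, 10.0, 15.0),  # no reaction    -> 0
    (14.9, 10.0, 15.0),  # late reaction  -> 1
]
print(alertness_score(events))  # 2.25
```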

The study collected participants’ mental state scores before and after the test using the Stanford Sleepiness Scale (SSS) and calculated the difference between the two scores to assess the change in fatigue level. A positive difference indicates that the participant felt more fatigued after the test; a negative difference indicates that the participant felt more refreshed and focused. This data helps the researcher understand the effect of the two prototypes on participants’ mental state during the test.
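
The fatigue change described above is a simple post-minus-pre difference. The sketch below assumes ratings on the 7-point SSS scale (1 = most alert, 7 = sleepiest); the function names and the example ratings are hypothetical.

```python
def sss_change(pre, post):
    """Return post-test minus pre-test SSS rating.

    Assumption: ratings use the 7-point scale, where higher means sleepier,
    so a positive result means the participant became more fatigued.
    """
    for rating in (pre, post):
        if not 1 <= rating <= 7:
            raise ValueError("SSS ratings are expected to be between 1 and 7")
    return post - pre


def describe_change(change):
    """Label the direction of the fatigue change."""
    if change > 0:
        return "more fatigued"
    if change < 0:
        return "more refreshed"
    return "no change"


# Hypothetical ratings for one participant.
print(describe_change(sss_change(pre=2, post=4)))  # more fatigued
print(describe_change(sss_change(pre=3, post=2)))  # more refreshed
```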

For maintaining driver alertness, Prototype A had a higher reactivity score than Prototype B, and its SSS score was slightly lower than that of Prototype B. However, neither difference was statistically significant, possibly because of individual variability. Descriptively, the results suggest that the Active style was more helpful in maintaining driver alertness.

Reactivity score

Stanford Sleepiness Scale

Evaluating participant experience

For the PANAS, the researcher calculated the sum of the scores of the 10 positive and the 10 negative emotion items in the questionnaire. The higher the positive emotion score, the more positive the emotional experience the prototype gives the participant; similarly, the higher the negative emotion score, the more negative the emotional experience. The PANAS thus shows the positive and negative impact of the two prototypes on participants’ emotions and is used to assess the user experience of interacting with each prototype.
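
The PANAS scoring described above can be sketched as below. The item numbering follows the standard 20-item PANAS ordering with 1-5 ratings; whether the study presented the items in that order is an assumption, and the example responses are hypothetical.

```python
# Assumption: standard 20-item PANAS order, each item rated from 1 to 5.
POSITIVE_ITEMS = (1, 3, 5, 9, 10, 12, 14, 16, 17, 19)
NEGATIVE_ITEMS = (2, 4, 6, 7, 8, 11, 13, 15, 18, 20)


def panas_scores(responses):
    """Sum the 10 positive and 10 negative item ratings.

    `responses` maps item number (1-20) to a rating (1-5).
    Each returned score therefore ranges from 10 to 50.
    """
    if set(responses) != set(range(1, 21)):
        raise ValueError("expected ratings for items 1 to 20")
    if any(not 1 <= rating <= 5 for rating in responses.values()):
        raise ValueError("ratings are expected to be between 1 and 5")
    positive = sum(responses[item] for item in POSITIVE_ITEMS)
    negative = sum(responses[item] for item in NEGATIVE_ITEMS)
    return positive, negative


# Hypothetical participant: rates every positive item 4 and every negative item 1.
responses = {item: 4 if item in POSITIVE_ITEMS else 1 for item in range(1, 21)}
print(panas_scores(responses))  # (40, 10)
```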

On the PANAS-Positive score, the mean for Prototype A was 36.73, significantly higher than the 31.91 for Prototype B. The difference was statistically significant with a large effect size (r=1.05>0.8), suggesting that, compared to the calm style, the active style performed better at improving drivers’ positive emotions, thus enhancing their driving experience.

As for the PANAS-Negative scores, the mean for Prototype A was slightly higher than that for Prototype B (14.91 vs. 14.27), but this difference was not statistically significant. In addition, box plot analyses showed outliers in both groups, with more in the Prototype B group, indicating considerable individual differences in participants’ negative emotional responses. Although the PANAS-Negative difference was not statistically significant, the PANAS-Positive difference was, so it is reasonable to conclude that the active agent gives users a better interaction experience.

PANAS – Positive

PANAS – Negative

Qualitative results

The study first coded the transcripts according to the two research questions, then categorised the relevant codes according to the three social characteristics of conversational agent design and participants’ feelings, and finally grouped them into three themes: Interactive, User Experience, and Driver Behaviours.

Qualitative results showed that there was individual variability in the impact of conversational style on the driving experience. Compared to the calm conversational style, most participants perceived the active style as fun and energetic and were more willing to interact with it. However, a few participants, due to their personalities and driving habits, perceived the active agent as noisy and unsettling and preferred the quiet, calm style.

Regarding the conversational agents’ anthropomorphic performance, participants wanted the agent to be highly anthropomorphic whilst retaining robotic features that helped them recognise they were communicating with a robot. They felt this would enhance their interaction experience whilst avoiding the emotional stress associated with an overly anthropomorphic conversational agent, and thus improve their concentration on road situations.

Implications of the results