Internet-based communication technologies can reduce both carbon emissions and financial costs of academic conferences for individual participants—especially those from non-rich countries. We consider currently available technological solutions and logistic formats. In July 2018, we organized the leading international conference on music cognition (ICMPC15/ESCOM10) as a multiple-location, semi-virtual event with hubs on four continents (Europe, North America, South America, Australia) and 600 active participants. Every talk was live-streamed to YouTube (unlisted with URLs accessible only to registered participants) and seen by two audiences (local, remote). Remote presentations were either real-time or delayed. Discussions were two-way audiovisual (Zoom). Student assistants managed the technology. The 24-hour program ran for five days, with normal working hours at each hub. Most (61%) participants approved of the semi-virtual format. Greenhouse-gas emissions per participant were reduced by 60–70% relative to an equivalent single-location conference. No talk was delayed or canceled for technical reasons. A semi-virtual, multiple-location approach improves the globality, cultural diversity, and accessibility of academic conferences. That in turn improves the relevance and long-term quality of academic content. In future, emissions and international time-difference problems can be further reduced by increasing the number of hubs. Every academic conference, regardless of size, discipline, or country, can benefit from live-streaming some or all presentations. Conference participants and organizers can contribute to global mitigation efforts while at the same time promoting their academic disciplines by taking advantage of modern internet communication technologies. Video recordings of talks contribute to documentation and dissemination.
Introduction
An academic conference produces greenhouse-gas (GHG) emissions in various ways. Emissions from flying are typically large (measured in tonnes of carbon or CO2) relative to emissions from other sources (measured in kilograms). The main environmental challenge, therefore, is to reduce emissions from flying.
An economy seat on a typical intercontinental return flight corresponds to roughly one tonne of carbon equivalent or 3.7 tonnes of CO2 (co2.myclimate.org). Expressed per person-kilometer, these emissions may be some four times higher for flying than for bus travel, and twenty times higher than for electric train (eea.europa.eu/transport). By comparison, air conditioning at a conference in a hot, humid location might use some tens of kW for some tens of hours, or roughly 1000 kWh. If the electricity is from fossil fuels, generating one kWh produces about one kg of CO2 by burning 400 g carbon. So a few hundred kg of carbon will be needed for air conditioning, which is less than the flying emissions of one participant. The CO2 emissions from producing the beef served at the conference are even smaller; for every 10 kg beef served, roughly 100 kg carbon is burned. Emissions from a kg of plastic packaging are roughly 2 kg carbon.
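The comparison between air conditioning and flying can be checked with a few lines of arithmetic. The following sketch is illustrative only: it uses the rounded figures from the paragraph above, reads the 400 g figure as carbon burned per kilowatt-hour of fossil-fuel electricity, and takes 30 kW for 30 hours as one hypothetical instance of "some tens of kW for some tens of hours".

```python
# Rough comparison of air-conditioning and flight emissions, in kg of carbon.
# All inputs are rounded figures from the text; the 30 kW / 30 h split is a
# hypothetical instance of "some tens of kW for some tens of hours".

CARBON_PER_KWH_KG = 0.4     # fossil-fuel electricity: kg carbon burned per kWh
FLIGHT_CARBON_KG = 1000.0   # one intercontinental return flight, economy seat


def air_conditioning_carbon_kg(power_kw: float, hours: float) -> float:
    """Carbon burned to run air conditioning at a given power for a given time."""
    energy_kwh = power_kw * hours
    return energy_kwh * CARBON_PER_KWH_KG


ac_carbon = air_conditioning_carbon_kg(power_kw=30, hours=30)
print(f"Air conditioning: {ac_carbon:.0f} kg carbon")
print(f"Share of one participant's flight: {ac_carbon / FLIGHT_CARBON_KG:.0%}")
```

Under these assumptions, air conditioning for the whole conference burns roughly a third of the carbon attributable to a single intercontinental participant.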
Consider now the emissions from internet-based audiovisual (AV) communication. YouTube (currently the most popular platform) is emitting approximately 10 million tonnes of CO2 equivalent per year, while being watched for 1 billion hours per day (Preist et al., 2019). If on one day 27,000 tonnes CO2 equivalent are emitted while videos are watched for 1 billion hours, one hour of viewing produces 27 g CO2, or about 7 g carbon. At a semi-virtual multi-hub conference (described below), 500 participants might watch YouTube for 2 hours each. If we add virtual presentations within the conference program and viewing by individuals after the conference, the total might be 3000 hours or 20 kg carbon, which is negligible compared to emissions from flying.
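The per-hour estimate can be reproduced as follows. This is a back-of-envelope sketch using the rounded figures quoted above; the only added ingredient is the mass fraction of carbon in CO2 (12/44), which is used to convert between the two units.

```python
# Per-hour emissions of video streaming, using the rounded figures in the text.
CO2_PER_DAY_TONNES = 27_000        # roughly 10 million tonnes per year / 365
HOURS_WATCHED_PER_DAY = 1e9        # 1 billion viewing hours per day
CARBON_FRACTION_OF_CO2 = 12 / 44   # mass of carbon contained in a mass of CO2

# 1 tonne = 1e6 g, so this is grams of CO2 per viewing hour.
co2_g_per_hour = CO2_PER_DAY_TONNES * 1e6 / HOURS_WATCHED_PER_DAY
carbon_g_per_hour = co2_g_per_hour * CARBON_FRACTION_OF_CO2

conference_hours = 3000            # total viewing hours assumed in the text
conference_carbon_kg = conference_hours * carbon_g_per_hour / 1000

print(f"{co2_g_per_hour:.0f} g CO2 per viewing hour")
print(f"{carbon_g_per_hour:.1f} g carbon per viewing hour")
print(f"{conference_carbon_kg:.0f} kg carbon for {conference_hours} viewing hours")
```

Without intermediate rounding the conference total comes out slightly above 20 kg carbon, which does not change the conclusion: streaming emissions are several orders of magnitude below those of a single long-haul flight.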
CO2 emissions from aviation currently account for about 3% of global CO2 emissions. Due to other gases (e.g., NOx and CH4) and particles emitted, and their complex interactions with the atmosphere, the contribution of aviation to anthropogenic global warming is very roughly twice that of the CO2 alone. Global emissions from aviation are increasing by some 4% per year (5% growth in person-km less a 1% gain in efficiency) with no end in sight. Alternatives such as electric motors and biofuels cannot scale up without causing other serious environmental problems. Flying may represent half the footprint of those relatively few people who currently fly (ICAO, 2016, 2019; Freeman et al., 2018; Graver et al., 2019; Owen et al., 2010).
In recent years, technological developments in internet-based AV communication have made it possible to significantly reduce GHG emissions produced by travel to and from academic conferences by incorporating virtual interactions. This strategy can have positive spinoffs for academic communication, collaboration, inclusion, cultural diversity, and dissemination. The available technology for high-quality AV transmission is reliable, inexpensive, and easy to use, provided organizers are well informed in advance and have qualified technical support (e.g. local students in audio engineering or related fields).
This practice paper outlines promising alternatives to the conventional conference format that involve live streaming, and provides guidance for successful implementation. Live streaming can be used in different ways to improve the accessibility and cultural diversity of an academic conference while at the same time reducing emissions. Since YouTube made live streams freely available in 2013, recorded videos have become an important means of academic communication and documentation; impressive examples within YouTube include “Cambridge University Press – Academic” and “Oxford Academic (Oxford University Press)”. Conferences can be split across several global locations, allowing face-to-face social interaction within hubs and virtual social interaction among hubs (“semi-virtual”).
Live streaming is not the only way to reduce emissions. Another option is to encourage colleagues to use surface transport wherever possible. For those who fly, emissions can be saved by avoiding more than one take-off (combining flying with train or bus) and informing travel agents of this constraint (Astudillo & AzariJafari, 2018).
A further option is to avoid flying altogether. A “Nearly Carbon Neutral (NCN) Conference Model” has been developed and implemented by Ken Hiltner, professor of environmental humanities at the University of California at Santa Barbara. A consideration of zero-location formats is beyond our present scope.
In this contribution, we aim to help colleagues in different disciplines and countries to significantly reduce the carbon footprint of their conferences by explaining how they might take advantage of appropriate internet-based communication technologies. We discuss how an approach of this kind can help academic conferences (and perhaps conferences in general) reduce their dependence on air travel while at the same time improving inclusiveness.
ICMPC15/ESCOM10
The recommendations in this paper are based on our experiences while organizing an innovative multi-location semi-virtual conference. The 15th International Conference on Music Perception and Cognition (ICMPC15), combined with the 10th triennial conference of the European Society for the Cognitive Sciences of Music (ESCOM10), happened simultaneously in four countries (Argentina, Canada, Austria, and Australia) and lasted from 23 to 28 July 2018. The research area of the conference was music cognition, but the semi-virtual idea could be realized in any academic discipline in sciences, engineering, humanities, or arts, whether that discipline be specialist, interdisciplinary, pure research, or practically oriented.
We aimed to halve emissions per participant. We achieved an even greater reduction, while at the same time making other improvements. By flexibly incorporating new communication technologies, we were able to increase both the number of participants and their cultural diversity. For many participants, the total cost of participation was significantly reduced. The cost reduction was greatest for colleagues from non-rich countries, for whom all three costs (registration, accommodation, and travel) were significantly reduced by comparison to flying to a conference in a rich country. At our Argentinian hub, South American participants paid less than half the registration fee charged in Canada, Australia, or Austria. They also paid less for accommodation, given the lower cost of living in Argentina, and traveled a shorter distance than to a single-location conference in Graz, Austria. It would have been relatively easy within our new conference format to magnify this effect by establishing additional hubs in low-GDP countries such as South Africa or India.
All talks at all hubs were live-streamed. At each hub, local and virtual programs ran in parallel in adjacent rooms; participants could easily change rooms after each talk, and in that way experience a mixture of live and remote presentations. The global program ran around the clock, whereas local programs were confined to usual working hours.
Conference format options
The potential of live streams for academic conferences is only starting to be realized. Every conference, large and small, in every discipline and in every country, can benefit. Live streams enable any talk to be shared with a larger audience. The information becomes more openly accessible. The geographic outreach and cultural diversity of presenters and audiences are increased (cf. Neustaedter et al., 2018). A talk can be given almost anywhere, opening up new possibilities for global academic exchange. Ultimately, increasing the diversity of academic documentation formats helps both academic and general audiences to understand the content.
The incorporation of live streams changes the conference experience. The added variety of content, presentation format, and interaction style makes the conference more interesting. Many academic colleagues will have experienced poor live streams at conferences, but that was often due to solvable technical problems. The AV quality of YouTube live streams is consistently and reliably high. Technical problems can be avoided by careful advance rehearsal.
While it is undeniably more fun and often more productive to communicate with people in person, face to face, it is also true that a regular live conference can be improved by adding electronic communication with remote participants. Colleagues can be included who could not have flown to a central conference location; reasons for not flying can be financial, family-related (caring commitments), physical (disability), and political (passport, visa). Talks can be viewed at any time, which is both an advantage (allowing any participant to watch any talk, even if the live talks happen simultaneously) and a disadvantage (breaking up the communal experience of watching and discussing a talk together).
In future, total emissions from flying to semi-virtual conferences could approach zero if conference hubs were located such that few or no colleagues needed to fly. Such conferences would also be more accessible, especially for colleagues in non-rich countries, who would no longer be faced with impossibly high costs for a long flight, registration, and accommodation in a rich country. Participants would be treated equally if hub sizes were about equal, all talks were given live at hubs, and participants were neither implicitly rewarded for flying nor penalized for not flying. In the following, we outline various technological/logistic format options.
Adding remote talks to a regular conference as one-way streams
Including some remote presentations is the simplest option for many conference organizers, requiring no additional equipment (hardware or software). Any conventional conference can be adapted to include remote presentations, which reduces the number of local participants, while increasing the total number. The event loses some of its elitist jet-set character and becomes more open: any researcher in the world can participate. Environmentally aware participants are no longer under pressure to fly in exchange for academic career benefits. The program becomes more interesting and diverse.
Remote presenters can set up one- or two-way connections using facilities and support at their local institutions. Setting up a one-way stream (e.g., YouTube) is more difficult for the remote presenter than for the conference organizers. Organizers need no equipment beyond that for regular teaching: a regular computer with cable internet connection and internet browser, and a data projector. Below, we will consider an alternative two-way option (e.g., Zoom) that conversely means more effort for the organizers but less for the presenters.
At the time of writing, YouTube live streams are the most promising option for one-way streaming. The platform is reliable and the cost to the user is zero. AV quality is consistently high; transmission quality is almost independent of internet stability. However, the high AV quality comes at the cost of a buffering delay of some 20–60 seconds. Other conference participants can watch these remote talks either in real time or later, which can partially compensate for the social disadvantage of remote presentation. Personal communication with remote presenters can happen at a different time in a quiet room (“global foyer”) or during special sessions (which we called “virtual socializing” at the Conference on Interdisciplinary Musicology, Graz, Austria, 26–28 September 2019).
YouTube offers three access options:
1. Public videos, accessible to everyone and findable in Google. This is the most familiar option.
2. Unlisted videos that can be viewed only if you have the link. That is what we used.
3. Private videos that can be viewed only on invitation and after logging in with a Google account. The number of people that can be invited is limited to 50 (in October 2018).
Many academics who present papers at conferences do not want their talk to be published on the internet. They may be worried about making a mistake that cannot be corrected later, or publishing sensitive data. For that reason, options 2 and 3 (unlisted and private videos) may currently be preferable for academic conferences. Option 3 is realistic only for smaller conferences, or if there is no individual access and the remote stream is shown only in the auditorium, so we focus here on option 2. Participants can be asked to sign an agreement to keep the links confidential (at our conference we indicated that this was voluntary, but all participants signed). Remote presenters can set up their stream (including starting time and URL) in advance. Organizers can then distribute the URLs to all participants in an email just before the conference.
Anyone with a mobile phone can send a YouTube live stream from a remote location. To ensure a good presentation, conference organizers can insist that speakers use a dedicated room at a university or similar institution with technical support (a person with whom your technician can communicate). Organizers can also insist on the presence of a small local expert audience (e.g. some PhD students) at the remote location. This creates a more comfortable situation for the speaker due to the natural visual and auditory feedback from the local audience during the presentation. If speakers cannot comply with these conditions, organizers can negotiate with them.
Questions and discussion following talks are important, and there are various options. One is to switch from one-way to two-way communication, as explained below. Another is to remain within one-way communication. For talks that are transmitted as YouTube live streams, audience members can type a question into a laptop or mobile phone. The remote speaker can then answer the question acoustically on the live stream. An advantage for both presenter and audience is that all questions and comments are documented. Examples of public YouTube live streams that are currently running can be seen by visiting youtube.com/live. Next to the moving image is a chat stream. In an academic conference, audience members with a Google account can use the chat stream to comment or ask questions at any time. Another option is for the speaker to provide a mobile phone number. Audience members can send their questions by SMS, WhatsApp, Signal, or other messaging service and the speaker can answer acoustically. The chair can even make a regular phone call to the speaker at the end of the talk and pass the phone around the auditorium as audience members ask their questions. The speaker answers on the live stream. This last option does not work well with a YouTube live stream due to the time delay. A further advantage of this approach is that conference organizers do not need to set up wireless microphones. This method can also be used as a backup in case microphones do not work or the sound quality on Skype or Zoom is poor.
Adding remote talks to a conference changes the conference budget. The total number of talks may go up at the same time as the number of local participants falls. This reduces both registration income and local costs, but not in the same proportion. Consider the following strategy:
Increase the registration fee slightly to reflect the larger scope and diversity of the conference, made possible by adding remote presentations.
Limit the number of remote presentations to a certain proportion of the total program (say, 1/3).
If more than 1/3 of participants want to present remotely, accept applications on the basis of geographic distance and/or reviewers’ grades.
Charge remote presenters from rich countries a small fee; presenters from non-rich countries whose abstracts have been accepted by anonymous peer review may present for free (although they may be charged a smaller fee to access other live streams).
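The capping rule in this strategy can be sketched in code. The example below is a hypothetical illustration rather than the procedure we used: the `Application` fields, the grading scale (higher is better), and the choice to rank by distance first and reviewers' grade second are all assumptions, and the applicants are invented.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class Application:
    name: str
    wants_remote: bool
    distance_km: float   # distance from presenter to the conference venue
    grade: float         # reviewers' mean grade; higher is better (assumed)


def approve_remote(applications, max_share=Fraction(1, 3)):
    """Return the remote applicants to approve, capped at max_share of the program."""
    cap = int(len(applications) * max_share)   # truncates toward zero
    remote = [a for a in applications if a.wants_remote]
    # Prefer presenters who would otherwise travel farthest; break ties by grade.
    remote.sort(key=lambda a: (a.distance_km, a.grade), reverse=True)
    return remote[:cap]


# Hypothetical applicants, for illustration only.
applications = [
    Application("A", True, 9000, 4.0),
    Application("B", True, 500, 4.5),
    Application("C", False, 100, 3.0),
    Application("D", True, 16000, 3.5),
    Application("E", False, 300, 4.0),
    Application("F", True, 9000, 4.8),
]
approved = approve_remote(applications)
print([a.name for a in approved])
```

With six accepted abstracts and a cap of one third, two of the four remote requests are approved. Organizers who prefer to weight reviewers' grades more heavily can change the sort key accordingly.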
While the option of remote presentation should normally be explained in the call for papers, it is also possible to introduce it for the first time when papers are accepted. Organizers can then inform participants that they can present remotely, give them a technical guideline, and ask for their decision by a deadline. On the assumption that all remote talks will be sent from dedicated rooms at other universities, the main target readers of the technical guidelines will be the technical support staff at the presenter’s university. The guideline might explain how to install Open Broadcaster Software (OBS) and use that to stream to YouTube. OBS juxtaposes the talking head of the speaker with the PowerPoint image (see appendix).
To avoid legal problems, presenters should be asked to avoid including soundfiles or images from the internet. YouTube can automatically find potential copyright infringements and block the stream. To our knowledge and in our experience, that is the only time YouTube is likely to fail.
Live streaming all talks
More ambitious conference organizers may decide to live-stream every talk. For this, additional equipment is necessary. Conference rooms need regular teaching equipment plus wireless microphones, a sound mixer, and webcams for speaker and audience. All talks can then be live-streamed to YouTube as unlisted videos, so anyone in the world who knows the URL of a given stream can watch it either in real time or later.
URLs can be provided to participants in different ways. One option is a password-protected system (e.g. Moodle), with a separate page for each talk. These pages can contain abstracts and any other materials that presenters wish to upload such as proceedings, sound examples or other videos. They can also include discussion forums where participants could ask questions and speakers could answer them.
For a smaller conference, a password-protected system is not necessary. Instead, organizers can set up the YouTube streams in advance (to start automatically at set times) and send the URLs to all participants by email. As further protection against unauthorized use, it is possible to download all internet videos on a given date following the conference, delete them from the internet, and keep them in a private archive.
Adding two-way remote presentations
Remote presentations can be improved by making them (or the discussions that follow them) two-way. A two-way presentation is one in which people at two locations can talk back and forth with little or no perceptible time delay.
For this to work, both presenter and remote audience must be able to rely on a fast and stable internet connection. To check whether the connection is sufficient, the average netspeed per country can serve as a first indicator. According to the Wikipedia “List of countries by internet speed” (accessed on 4 March 2019, citing Akamai Technologies, 2017), Austria’s average speed is 14.1 Mbit/s, whereas Australia’s, for example, is 11.1 Mbit/s. Note that internet speeds vary considerably within countries and that available speeds are constantly increasing. Academic institutions in particular may be well above average. During ICMPC in 2018, the netspeed at the University of Graz was typically 60 (download) to 90 (upload) Mbit/s. In mid-2019, one Gbit/s was available in most rooms. Online speed tests (e.g. speedtest.net) can be used to assess this parameter. Another important parameter for the viewing experience is the packet-loss rate (Hestnes et al., 2003). For these reasons, we recommend conducting test sessions in the exact rooms used for streaming.
Having tested various two-way AV communication options, such as WebEx, Google Hangouts, Jitsi, BlueJeans, and Skype, we chose Zoom. We found Zoom relatively high in AV quality, easy to use, and flexible for our specific case, in which one person gives a talk using PowerPoint or another presentation program, and both the talking head and the image of the remote audience can be seen next to the slide. It is also possible to share the presenter’s computer sound directly. To increase flexibility, we purchased an inexpensive upgrade to Zoom Pro to enable talks from speakers at more than one location simultaneously. In a business context, Zoom was recommended by Fasciani et al. (2018).
For conference organizers, the process begins by sending an email with a Zoom link to the presenter to organize a short rehearsal. To conduct a comfortable question session with the local audience, use a webcam or inbuilt camera (so the remote presenter can see your audience), some wireless microphones, and a small audio interface or sound mixer with built-in interface.
During the talk, the presentation will be shown in Zoom via screen-share. The camera picture will appear in the corner automatically. On the receiving end, the voice can be electronically amplified. After the talk, questions can be asked by the audience using a wireless microphone, which is connected to the computer via the audio interface. Consider the following points:
The presenter should use a headset (headphones with microphone near the mouth). A headset prevents acoustic feedback from the loop “Microphone (sending end) – PA (receiving end) – wireless microphone (receiving end) – loudspeaker (sending end)”.
Do not mute your audience microphone during the talk. The presenter should be able to see and hear the audience, which makes the situation feel more natural.
The audience should see themselves (how they appear to the presenter) in a separate window.
The wireless microphone and the remote acoustic signal should be mixed and connected to the same amplifier. During question sessions, audience members should hear their voice through the loudspeaker. This feels natural and tells them that the presenter can hear them.
If the presenter wants to play sound from her/his computer during the talk, test it in advance. The presenter should send you the soundfiles separately in advance so that if sound quality is poor you can play them locally.
Test the setup a few days before the talk. Establish contact with the presenter 15 minutes before the talk starts.
The multi-location semi-virtual conference
A multi-location semi-virtual conference incorporates all the above features. In addition to remote talks by individual presenters, the conference itself is split across several global locations, connected by one- and two-way streaming. ICMPC15/ESCOM10 took place simultaneously on four continents (Australia, South America, North America, and Europe) and the hub locations were Sydney, La Plata (Argentina), Montréal (Canada), and Graz (Austria). Many more hubs would have been possible using the same technology. On the basis of our practical experience organizing this conference, we recommend the following for future semi-virtual conferences.
Review procedure
To ensure a common academic standard, we carried out a thorough peer review procedure for submitted abstracts at one of the hubs (Graz) using ConfTool (for which we paid a small fee). Other hub organizers did not have to bother with abstract review, which made it easier for us to recruit hub organizers. Although Graz carried out the review procedure, Graz did not have special status on the conference program; to treat all participants equally, all hubs were nominally equal.
Number of hubs
Consider a multi-hub conference, in which each hub is nominally equal, independently offers its own local program from accepted abstracts, and independently decides what talks or sessions from other hubs to include in its virtual program. In that case, there could be ten or more hubs at different global locations. The greater the number of hubs, the greater is the accessibility and cultural diversity of the conference, the smaller are the GHG emissions per participant, and the easier it is for individuals to organize hubs (because they are smaller). A large number of smaller hubs is possible and practical if each hub works relatively independently according to a central guideline and makes its own independent programming decisions. The organizer of each hub should have a relevant PhD (for an academic conference) and take advantage of the equipment available in regular local teaching rooms.
Rooms
Each hub needs two or more presentation rooms: one or more sending rooms (also called live rooms or streaming-out rooms) for regular talks and one or more receiving rooms (virtual rooms, streaming-in rooms) for virtual talks. These terms refer primarily to one-way communication (YouTube); during two-way discussions (Zoom), sending rooms are also receiving, and receiving rooms are also sending. Since many institutions are now charging large amounts of rent for lecture theatres, organizers can save money by using regular teaching rooms instead, which becomes more feasible if the conference is split across a larger number of smaller hubs.
Program and timing
No matter where a hub is located in the world, it can communicate in real time with almost any other global location if the local daily program is divided into two half-days of four hours, separated by a relatively long lunchbreak or siesta. In the morning at each hub, conference participants will be communicating internationally toward the East; in the afternoon or evening, toward the West. For information about changing time differences due to daylight saving (summer time) see timeanddate.com.
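The effect of splitting the day can be illustrated numerically. The sketch below is a simplified model, not the timetable we used: it assumes two four-hour sessions at 8:00 to 12:00 and 16:00 to 20:00 local time (hypothetical times), compares them with a conventional 9:00 to 17:00 day, and uses the UTC offsets that applied to our four hub cities in July 2018.

```python
from itertools import combinations

# UTC offsets (hours) of the four hub cities in July 2018.
HUB_UTC_OFFSETS = {"Graz": 2, "Montreal": -4, "La Plata": -3, "Sydney": 10}

# Hypothetical local timetable: two four-hour half-days around a long break.
SPLIT_DAY = [(8, 12), (16, 20)]
# Conventional contiguous working day, for comparison.
CONTIGUOUS_DAY = [(9, 17)]


def utc_hours(local_sessions, utc_offset):
    """Set of UTC clock hours (0-23) covered by the local sessions at a hub."""
    return {(hour - utc_offset) % 24
            for start, end in local_sessions
            for hour in range(start, end)}


def shared_hours(hub_a, hub_b, local_sessions):
    """Number of hours per day during which both hubs are in session."""
    a = utc_hours(local_sessions, HUB_UTC_OFFSETS[hub_a])
    b = utc_hours(local_sessions, HUB_UTC_OFFSETS[hub_b])
    return len(a & b)


for a, b in combinations(HUB_UTC_OFFSETS, 2):
    print(f"{a:9s} {b:9s} split day: {shared_hours(a, b, SPLIT_DAY)} h, "
          f"contiguous day: {shared_hours(a, b, CONTIGUOUS_DAY)} h")
```

In this model every pair of hubs shares at least one hour per day under the split timetable, whereas with a contiguous day Sydney has no working hours in common with any of the other three hubs. Graz shares its morning with Sydney (to the East) and its afternoon with the American hubs (to the West).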
Figure 1 is a sketch of one day of ICMPC15/ESCOM10. The top row (UTC or GMT) is time relative to UK time in winter (our conference was in the Northern summer, for which clocks had been put forward one hour). The red block in the middle of the figure represents a period of four hours during which three of the four hubs worked together. In local time, this block started in Montreal at 9 am, in La Plata at 10 am, and in Graz at 3 pm (15:00).
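The local starting times of the shared block can be derived from a single UTC time. In the sketch below, the 13:00 UTC start is inferred from the local times quoted above, the date is an arbitrary day within the conference, and fixed UTC offsets for July 2018 are used in place of a time-zone database to keep the example self-contained.

```python
from datetime import datetime, timedelta, timezone

# UTC offsets (hours) of the four hub cities in July 2018.
HUB_UTC_OFFSETS = {"Montreal": -4, "La Plata": -3, "Graz": 2, "Sydney": 10}


def local_start_times(start_utc: datetime) -> dict:
    """Map each hub to the local clock time (HH:MM) of a UTC start time."""
    return {
        hub: start_utc.astimezone(timezone(timedelta(hours=offset))).strftime("%H:%M")
        for hub, offset in HUB_UTC_OFFSETS.items()
    }


# Start of the shared four-hour block (inferred), on an arbitrary conference day.
shared_block_start = datetime(2018, 7, 24, 13, 0, tzinfo=timezone.utc)
times = local_start_times(shared_block_start)
for hub, local_time in times.items():
    print(f"{hub:9s} {local_time}")
```

The same start time falls at 23:00 in Sydney, which is why only three of the four hubs could take part in this block.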
Each hub of our conference had a regular local program that included both live and virtual talks and was printed on paper. As at a conventional conference, we delivered this document to a printer a week before the conference. All participants also had access to an electronic 24-hour global program with times in UTC (GMT). The global program gave an overview of all regular talks at all hubs. Each talk could be seen remotely at one or more other hubs (central organization was necessary to ensure that). During parallel sessions, participants could choose between local and virtual talks (“semi-virtual”). We recommended that participants switch back and forth to balance local and global content and experience a varied and dynamic program. Each hub offered one keynote, during which nothing else was scheduled at any hub.
For the future, we recommend that all events start on the hour or the half hour, around the clock. Every two hours, there should be a globally coordinated half-hour break (which can overlap with a poster session or pre-conference activity, see below). Breaks might start at 00:00, 02:00, 04:00 UTC, and so on. The 90-minute blocks between these breaks would then begin at 00:30, 02:30, 04:30 UTC, and so on, and follow standard patterns such as:
three regular talks,
a keynote (1 hour) followed by invited responses (30 min),
a poster session (60 minutes) preceded by speed presentations (30 minutes),
workshops and demonstrations in parallel, or
an opening session followed by discussion groups.
The daily program at any location would then be divided into two 3.5-hour blocks (90 minutes work + 30 minutes break + 90 minutes work), separated by a long lunch break (siesta). Depending on location, the local-time plan might be one of the following:
8:30 am to 12 noon and 4:30 pm to 8 pm
9:30 am to 1 pm and 5:30 pm to 9 pm
A shorter lunch break is possible, depending on where the hubs are located. Organizers can explore what is gained or lost at each hub when the break at a given hub is made longer or shorter.
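The two local-time plans listed above follow from the proposed UTC grid and the hub's UTC offset. The following sketch shows the derivation under simplifying assumptions: whole-hour UTC offsets, breaks starting at every even UTC hour, and function names that are purely illustrative.

```python
BLOCK_MIN = 90   # length of one work block in minutes
BREAK_MIN = 30   # length of the globally coordinated break in minutes


def local_block_starts(utc_offset_hours: int) -> list:
    """Local clock times at which the 90-minute blocks of the global grid begin.

    Whole-hour UTC offsets are assumed for simplicity.
    """
    starts = []
    for break_hour_utc in range(0, 24, 2):               # breaks at 00:00, 02:00, ... UTC
        start_utc_min = break_hour_utc * 60 + BREAK_MIN  # a block follows each break
        starts.append((start_utc_min + utc_offset_hours * 60) % (24 * 60))
    return [f"{m // 60}:{m % 60:02d}" for m in sorted(starts)]


def half_day(start: str) -> tuple:
    """Start and end of a half-day: block + break + block (3.5 hours)."""
    hours, minutes = map(int, start.split(":"))
    end_min = hours * 60 + minutes + BLOCK_MIN + BREAK_MIN + BLOCK_MIN
    return start, f"{end_min // 60}:{end_min % 60:02d}"


# A hub at UTC+2 (e.g., Graz in summer) and a hub at UTC-3 (e.g., La Plata).
print(local_block_starts(2))
print(local_block_starts(-3))
print(half_day("8:30"), half_day("16:30"))
print(half_day("9:30"), half_day("17:30"))
```

Hubs with an even UTC offset find block starts at 8:30 and 16:30 local time, and hubs with an odd offset at 9:30 and 17:30, which yields the two plans listed above.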
At ICMPC15/ESCOM10, we divided talks into long (30-minute slot) and short (20 minutes) based on reviewers’ grades. In retrospect that made the task of creating a thematically coherent program, with sessions focusing on given issues or areas, too difficult. We now recommend choosing a single basic time unit in advance. A 20-minute basic unit would mean breaks of only 20 minutes every two hours, which would be more stressful for participants. Shorter breaks also increase the chance of delays or technical problems (technicians have to check a list of points before the start of each session; see appendix). Therefore, 30-minute units are preferable.
The 30-minute “break” just before the start and after the end of each half-day can be used in many ways: concert, warm-up activity, demonstration, installation, discussion. Participants who attend such activities will then be seated in time for the start of the following 90-minute work period. That matters, because delays must be avoided at this kind of conference (see timekeeping).
Technology
In the following, we explain in more detail the technological solutions that we adopted at ICMPC15/ESCOM10. They consist of the same components suggested for streaming and adding remote presentations above. Colleagues in different disciplines and countries can imitate us or adapt our approach for their purposes. We do not wish to specify an exact solution, because different conference traditions have different priorities. Moreover, the technology is constantly improving, so parts of this guideline will quickly become obsolete. The technology that we used was not exactly the same at each hub, depending on differences in available hardware. We will mention some of the differences in this document, but for simplicity we will focus on the similarities.
A 30-minute program slot at an academic conference is often divided into three parts:
20 minutes for the talk itself
7 minutes for questions and discussion
3 minutes for changing rooms
For each of these, we used different software solutions, combining one-way and two-way streaming.
All talks were streamed as unlisted YouTube videos.
Discussions among the presenter, the remote audience, and the live audience were conducted in Zoom and streamed to YouTube when the discussion was live.
When a talk was viewed at another hub after a time delay, the audience contributed comments in writing, either in YouTube or in Moodle.
A small number of talks were given as two-way remote from individual presenters who could not attend one of the hubs in person.
Room-changing was facilitated by a time-keeping webpage (TACT).
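The division of a slot into phases can be sketched in a few lines of Python. This is only an illustration of the 20/7/3 split described above; the function names and the example start time are hypothetical, and the sketch is not part of the software used at the conference.

```python
from datetime import datetime, timedelta

# Phase lengths in minutes, taken from the 20/7/3 split described above.
PHASES = [("talk", 20), ("discussion", 7), ("room change", 3)]


def slot_phases(slot_start):
    """Return (phase name, start, end) tuples for one 30-minute slot."""
    phases = []
    t = slot_start
    for name, minutes in PHASES:
        end = t + timedelta(minutes=minutes)
        phases.append((name, t, end))
        t = end
    return phases


def session_plan(session_start, n_talks):
    """Chain n_talks consecutive slots, e.g. 3 slots for a 90-minute work period."""
    plan = []
    for i in range(n_talks):
        plan.extend(slot_phases(session_start + timedelta(minutes=30 * i)))
    return plan


if __name__ == "__main__":
    start = datetime(2018, 7, 23, 9, 0)  # hypothetical session start
    for name, begin, end in session_plan(start, 3):
        print(f"{begin:%H:%M}-{end:%H:%M}  {name}")
```

A printed plan of this kind can be handed to chairs and technicians so that everyone works from the same phase boundaries.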
Hardware
The easiest hardware solution for streaming out is a regular laptop with internet connection (preferably wired or over a hidden network) and a built-in camera and microphone. The laptop can be placed on the presenter’s podium (lectern, pulpit), as shown in Figure 2. Install OBS software on the laptop and use it to mix the talking head of the presenter with the Powerpoint screen. After that, send the mixed signal to YouTube or Zoom. Another way to encode the video signal before sending to YouTube is to use a hardware encoder. At our conference, that option was available at two of the hubs (Graz, Sydney).
The entire procedure can happen on a single computer, but in practice it works better on two, so the presenter and technician have separate screens, as shown in Figure 2. The presenter’s computer runs Powerpoint and the technician’s computer takes care of the streaming. By “presenter’s computer” we mean a laptop that organizers make available to presenters; presenters should not use their own laptops. The output of the presenter’s computer is connected to a hardware device (HDMI video grabber, cost: roughly $100), which appears as a webcam on the technician’s computer, as shown in Figure 2. A functional diagram of these connections is shown in Figure 3.
The two-way communication software Zoom can run either on the tech computer or on a third computer (cf. Audience Computer in Figure 2). Switching between talk and discussion involves changing the input to the stream and to the local projector. Two-way communication can be improved by including a video picture of the audience (giving a feeling of presence) as well as the picture of the presenter’s head and the Powerpoint slides. This solution can be realized either with the speaker’s computer also joining the Zoom meeting or with a camera operated by an assistant.
An external camera and microphone can be used for the presenter’s face and voice. The camera should be at the height of the presenter’s head. The microphone can be on a stand positioned closer and lower than the camera. Ideally, one may use a lavalier or headworn microphone. The camera and microphone should be placed such that the presenter is not distracted by them, so the situation feels as natural as possible. It is important to test this by having a colleague give a regular research presentation with a regular audience and talking to them about it later. Note that lighting is important to ensure that the presenter’s face is clearly visible. Test the lighting both during the day and in the evening.
A single external microphone can be used for audience members asking questions. Things will move faster in the question session if there are 2–3 wireless microphones: one for the chair and 1–2 for the audience (the speaker already having one). For that, an audio mixer is needed, and cheap ones are available. It is crucial that questions and answers can be heard at remote hubs. We asked all participants to hold the microphone close to the mouth, explaining that amplification levels need to be kept low to avoid acoustic feedback when two-way communication is involved.
Backups
Technical problems can delay the start of a talk. At ICMPC15/ESCOM10, we avoided delays by backing up every channel of communication in real time: Zoom acted as a backup for YouTube and vice-versa. In other words, we ran YouTube and Zoom simultaneously in all talks. The technician at the front of the room switched from Zoom to YouTube at the start of each talk and back from YouTube to Zoom at the start of the question period, without turning off either stream. We started the YouTube stream during the break before a session and maintained it throughout the session. Afterwards, each YouTube file contained 2–4 talks, which could later be separated by editing.
We also backed up videos for later viewing. If the encoder (hardware or software) permits a local recording during the stream, this can be used to make a backup. If two-way communication fails, Jitsi can be used as a backup. The advantage of Jitsi is that a meeting can be started immediately without logging in and with a customizable URL. If this URL is communicated, the other party can join immediately, again without logging in. But the AV quality of Jitsi is not always satisfactory.
In addition, a regular cellphone can be connected to a long cable. If the audio quality of two-way communication software is insufficient, participants can use this phone to ask questions. At our conference, this never became necessary, but it was reassuring to know that it was available. A telephone connection is very reliable, but the audio quality is worse than online communication services.
Requirements for presenters
Presenters may be asked to do the following:
Prepare Powerpoint files in 4:3 format (not widescreen 16:9), leaving room for the talking head without it covering part of the slides.
Bring all materials on two USB flash drives (one as a backup). On each, all relevant files should be in the same folder. Due to the complexity of the streaming setup, we did not allow presentations from private laptops, although it is theoretically possible. Those who want to show their screen or desktop while interacting with laptop software are asked to make a video in advance.
Be careful to avoid showing any image or video or playing any sound track that may be copyrighted. YouTube may detect copyright infringements automatically and close the live stream.
Timekeeping
If the conference program includes parallel talks at different locations, it is important to ensure that they start exactly on time. We asked session chairs to start each talk within ten seconds of the advertised time. We informed presenters in advance of the importance of exact timing and clarified that in an internationally coordinated conference it is not possible to continue speaking after the programmed stopping time.
To achieve this, we created a timing tool. The TACT website was developed by Hannes Karlbauer in collaboration with the first author. TACT stands for Tonal Academic Conference Timekeeper. The abbreviation reminds us that it is necessary to tactfully remind presenters when their time is up.
TACT is an internet page that shows the time in UTC as well as the local time at each conference hub. Toward the end of each conference timeslot, it plays music to ensure that the discussion following the talk stops on time.
Three minutes before the end of the timeslot, the words “Time’s up” start to flash silently.
Two minutes before, music starts, gradually getting louder for about ten seconds. We created a library of non-copyright music for this purpose. Creating such a library is an interesting task by itself. Colleagues with access to composition students can for example ask them to send short sound clips in return for being credited in the program. On request we may be able to provide all or part of our library.
One minute before the start of the next talk, the music stops. Presenters and audiences quickly got used to this implicit signal: silence means stop talking and sit down, as the next talk is about to start.
Throughout the conference, TACT ran on a separate computer in each room with an external loudspeaker. For this, we had no additional equipment costs because old laptops and loudspeakers (cheap PC monitors) were available from our IT department or privately.
TACT did not inform the presenter about the number of minutes to go before the end of the talk and the start of the discussion. Instead, student assistants held up signs with “5 minutes to go”, “3…”, “1…”, and “Time’s up!”
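The cue logic described above is simple enough to sketch in Python. The following is not the TACT source code but a minimal illustration under stated assumptions: the function names are hypothetical, the cue labels are our own shorthand, and the hub offsets are fixed UTC offsets valid for July 2018 (they ignore daylight-saving changes at other times of year).

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC offsets (hours) for the four hubs in July 2018.
# Assumption: valid only for that period; no daylight-saving handling.
HUB_OFFSETS_HOURS = {"Graz": 2, "Montreal": -4, "La Plata": -3, "Sydney": 10}


def hub_times(utc_now):
    """Return the local time at each hub for a timezone-aware UTC datetime."""
    return {
        hub: utc_now.astimezone(timezone(timedelta(hours=hours)))
        for hub, hours in HUB_OFFSETS_HOURS.items()
    }


def tact_cue(seconds_to_slot_end):
    """Map the time remaining in a slot to one of four hypothetical cue labels."""
    if seconds_to_slot_end > 180:
        return "running"   # talk or discussion in progress, no signal
    if seconds_to_slot_end > 120:
        return "flash"     # "Time's up" flashes silently
    if seconds_to_slot_end > 60:
        return "music"     # music plays, fading in over about ten seconds
    return "silence"       # music has stopped; the next talk is about to start


if __name__ == "__main__":
    now = datetime(2018, 7, 23, 12, 0, tzinfo=timezone.utc)  # example instant
    for hub, local in hub_times(now).items():
        print(f"{hub:10s} {local:%H:%M}")
    for remaining in (200, 150, 90, 30):
        print(remaining, tact_cue(remaining))
```

Counting backward from the end of the slot, rather than forward from its start, keeps the cues correct for any slot length.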
Technical assistants
We had one technical assistant and one non-technical assistant (two each for plenary keynotes) in each sending room, in addition to the chair and the presenter. The technical assistants were studying audio engineering or similar and were coordinated by a head technician (the second author). Remuneration was by contract, course credits, or both.
The technical assistants trained for about two days before the conference began. Training included getting to know the setup and procedures, rehearsing communications with conference presenters, and showing non-technical assistants the basics. They were given access rights (passwords) and technical guidelines.
During the talks, the technical assistant sat at the front of the room next to the presenter and chair. One or more non-technical assistants were in the audience and passed around the microphone(s) during the discussion. See the appendix for a checklist for setting up and monitoring a talk/discussion.
Telephone communication for technicians
Technicians at different locations need to be able to communicate easily, independently of the conference streaming system. They need to say things like: “Everything fine from your end?”, “Please turn up your microphone!”, “Are you ready for a question from your hub?” or “We can’t hear you!” Speaking quietly on the phone can be quicker than writing when critical situations occur.
One option is to use an instant messenger on the technicians’ private phones. While this might seem obvious and easy, one has to make sure beforehand that everyone knows which number to write to for which room at which time. Also, everyone needs to agree on one application. This can be difficult since some find WhatsApp problematic for security reasons (or cannot use it due to older operating systems) and in some countries open-source apps like Signal are not supported. If there is one phone reserved for every room, technicians know which number to contact. At our conference, a lot of technical communication was done in the chat window of the two-way software, but this is not ideal, since during the discussion it may be visible to participants. We gave a lot of thought to setting up these various channels of communication in advance. In retrospect, that was one of the main reasons we managed to avoid technical delays.
Global foyer
Participants at all the four hubs of ICMPC15/ESCOM10 could electronically meet and virtually socialize with colleagues from other hubs, either spontaneously or at planned meetings. Breaks were timed to make this possible at different locations. Each hub had a quiet room called “global foyer” near the coffee area. It gave remote presenters a feeling of participation and local participants the opportunity to communicate easily and informally with them.
Each global foyer had a number of computers, each with a (built-in) webcam, a headphone amplifier, and a USB microphone. To avoid background noise, there were acoustically absorbent walls between the computers. Often, up to three people could sit at each computer and talk to up to three people at the remote location. People spoke into one central microphone but wore separate headsets.
A small 4-channel headphone amp and a USB microphone cost less than $60. The microphone had a cardioid pickup pattern. Cheap headphones were provided and participants could also use their own. In terms of software, one can use Skype or similar services. We especially recommend solutions such as Jitsi, which run in the browser and make use of the WebRTC API. Here, no user accounts are required. Every computer was constantly connected to a computer at another hub, so anyone could walk up to one of them, sit down, and start talking to someone, as in a typical conference coffee break.
At a single-location conference with remote presentations, the global foyer is primarily for remote presenters, whereas at a multi-location conference, all presenters may take advantage of it. We recommend setting up a 15- or 30-minute private communication timeslot for each presenter in advance. Discussion timeslots can be scheduled in a separate program, enabling speakers to communicate privately with interested audience members, either alone or in groups. A student assistant might get the task of organizing advisory meetings between senior and junior remote and local participants. The global foyer also gives local participants the chance to communicate with anyone anywhere at any time, e.g. during breaks.
Return on investment
Given the resources and time needed to set up livestreaming, and the constant risk of technical problems that could delay the conference program, it is interesting to ask whether the overall costs of such a project are offset by the overall benefits. To answer this question, we first need to evaluate the benefits, which in a first-order estimate can be done in US dollars. That includes benefits to future generations (due to CO2 emissions reductions), new conference participants (who would not otherwise have been able to participate), regular participants (who get access to a broader range of colleagues with whom to interact and research projects from which to learn), and participants with disabilities or caring commitments (who, like all participants, gain access to the entire conference program).
Estimates of the social cost of carbon range widely, for example $10–$200 per tonne CO2 (Interagency Working Group on Social Cost of Greenhouse Gases, 2016) or $177–$805 per tonne CO2 (Ricke et al., 2018). If for the purpose of argument we take the value $100 per tonne CO2, a conference in which 100 participants avoid an intercontinental flight saves about 100 tonnes of carbon or 370 tonnes of CO2, valued at $37,000, and corresponding to perhaps 1/3 of our total conference budget.
If 100 people can participate in a conference who would not otherwise have been able to do so, and the value of this change is assumed to depend on what they would have paid to participate at a conventional conference (estimated at $1000 each), the benefit of this aspect may be valued at $100,000, or roughly our entire conference budget.
If 100 regular conference participants gain access to 50% more potential colleagues in other countries and 50% more high-quality, relevant research projects, their willingness to pay a higher price for registration might be valued at $100 each, making a relatively modest benefit of $10,000.
Participants with disabilities or caring commitments may be unable to attend a regular academic conference. Streaming makes the entire program available to all. This benefit is felt most strongly by participants who cannot otherwise participate, even if they still miss some or all of the face-to-face interaction. If ten participants experience this benefit, and the average participant invests $1000 in the conference (travel, registration, and accommodation), the total benefit is of the order of $10,000.
However calculated, these benefits far exceed the costs of purchasing, setting up, testing, and running the necessary equipment. For ICMPC15/ESCOM10, we invested about €5800 in wages for a head technician (6 months, 10 hours per week) and a few hundred Euros each for four additional technical assistants at the main hub in Graz (Master’s students who also received course credits; €1 ≈ $1.1). We also spent some €200 on a Zoom upgrade, €1100 on wireless microphone rental, and €400 on other electronic equipment. Adding these together, the total technical costs were about €9,000 (8% of our total budget of €110,000). Other hubs spent less: their head technicians worked for only a few weeks and they needed only a few hundred Euros for equipment hire. The head technician was necessary in Graz to test different technical options during the months preceding the conference; for future conferences based on our model, this cost will be reduced. The other hubs did not need to carry out such tests, but instead followed our guidelines; our head technician conducted tests with each of them separately, communicating mainly by Zoom, Skype, and WhatsApp. Expenses that were the same as for a conventional conference included wages for a conference co-organizer (the third author, who was responsible for the peer-review procedure and many other organizational tasks) and €1200 for conference organization software (ConfTool).
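For readers who wish to vary the assumptions, the first-order estimates above can be reproduced with a short Python sketch. All numbers are the illustrative values from the text, not measurements. The function names are hypothetical, and summing the four benefit estimates is a simplification, since the categories may overlap.

```python
# All figures are the illustrative values used in the text, not measurements.
SOCIAL_COST_PER_TONNE_CO2 = 100   # USD, value chosen for the purpose of argument
CO2_PER_TONNE_CARBON = 44 / 12    # mass ratio of CO2 to the carbon it contains
EUR_TO_USD = 1.1                  # exchange rate stated in the text


def benefits_usd():
    """First-order benefit estimates in USD, one entry per category."""
    avoided_carbon_t = 100  # 100 avoided flights, about 1 tonne of carbon each
    co2_t = avoided_carbon_t * CO2_PER_TONNE_CARBON  # roughly 370 tonnes of CO2
    return {
        "emissions avoided": co2_t * SOCIAL_COST_PER_TONNE_CO2,
        "new participants": 100 * 1000,
        "broader access for regular participants": 100 * 100,
        "participants with disabilities or caring commitments": 10 * 1000,
    }


def technical_cost_usd(total_eur=9000):
    """Total technical costs at the main hub, converted to USD."""
    return total_eur * EUR_TO_USD


if __name__ == "__main__":
    benefits = benefits_usd()
    for name, value in benefits.items():
        print(f"{name}: ${value:,.0f}")
    total = sum(benefits.values())
    cost = technical_cost_usd()
    print(f"total benefits: ${total:,.0f}")
    print(f"technical costs: ${cost:,.0f}")
    print(f"benefit/cost ratio: {total / cost:.1f}")
```

With these inputs, the estimated benefits come to roughly $157,000 against technical costs of about $9,900. Replacing the social cost of carbon with a value from either end of the published ranges changes the total but not the conclusion that the benefits exceed the costs.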
Evaluation
Academics are experts in the art of evaluation (peer review, teaching evaluation). It is understandable to want to evaluate a new conference format thoroughly in advance of implementation. At ICMPC15/ESCOM10, we realized that a new conference format cannot be evaluated without experiencing it and getting used to it first. We noticed that participants were changing their minds about our approach during the event, and some continued to change their minds about it in the following weeks and months.
We also observed that opinions about such an issue can be very diverse. Our impression is that student assistants (hospitality and technology) had a more positive opinion, on average, than regular conference participants. In addition, younger participants had a more positive opinion than older, perhaps because they were more open to the technology (similar to social media) or because climate change will affect them more than it affects older people. For these reasons, it may not be possible to speak of an “average response”.
On the second-last day of the conference, we asked all participants at all hubs to access an internet-based evaluation form. Of about 600 participants, 199 took part. We presented summary results in the closing session.
The survey asked for each participant’s active or passive role, physical location, and overall rating of the conference experience. Of the 199 survey participants, 84% were active (59% speakers; 25% poster presenters). The breakdown by physical location was 55% in Graz, 29% Montreal, 7% La Plata, 6% Sydney, and 3% no hub (remote participants).
We then asked participants to rate the semi-virtual format on an 11-point scale from very bad to very good. Of those that avoided the middle point of the scale, 61% responded positively. Satisfaction was highest in La Plata, where most participants could not have afforded to travel to a conventional conference in Graz, followed by Sydney and Graz. Satisfaction was lowest in Montreal, where many participants would have preferred to fly to Graz. The number of participants in Graz was about twice that in Montreal, but the two hubs otherwise had equal status on the conference program. Differences in satisfaction across hubs might be avoided in future by making hubs more equal in size. For this purpose, we could have created an additional hub elsewhere in Europe, such as in the UK. The hubs in Austria, UK, and Canada would then have been more similar in size.
No statistics were recorded about the typical size of audiences for live versus virtual talks. We recommend collecting that data in future. Anecdotal evidence suggests that audiences were bigger for live talks, but only at larger hubs. Two factors may explain this. First, live talks work better for the audience than virtual talks (but with improvements in technology and increasing familiarity with virtual communication, this difference will gradually disappear). Second, at the larger hubs there were relatively many live talks of high academic quality to choose among. At smaller hubs, the difference was smaller due to the higher academic quality of remote talks relative to live talks.
Conclusion
At ICMPC15/ESCOM10, every talk was live-streamed and seen at one other hub, either in real time or after a delay due to international time differences. The discussion following each talk always involved two hubs. Real-time discussions were acoustic, and delayed discussions were written. In addition, all talks were available to all participants to watch and comment on using their laptops, tablets, and mobile phones. Because we carefully rehearsed procedures with colleagues at all hubs, no talk was canceled or delayed for technical reasons.
We consider the benefits of the semi-virtual approach to outweigh the disadvantages by a considerable margin. The main disadvantage is lack of face-to-face contact with colleagues from distant countries. This is more than counterbalanced by allowing new colleagues to participate who would not otherwise have been able to afford it or to travel to the conference (increasing accessibility, equity, and cultural diversity) and reducing climate-damaging GHG emissions.
A semi-virtual, multiple-location conference, with hubs on different continents around the globe, can for the first time reasonably be called “global”. The new format makes it possible to aspire to and approach a global balance among representative geographic areas. An equivalent one-location conference typically attracts many more participants from the continent where it is located than from other continents.
A semi-virtual conference format can be used to promote socioeconomic and cultural diversity among the participants by including one or more hubs in lower-GDP or culturally contrasting countries. This strategy ultimately impacts positively on the relevance and quality of the academic content. Colleagues from non-rich countries will initially have less experience with such events, and younger colleagues may not enjoy the same level of academic supervision from older colleagues. But if the conference is repeated periodically, academic levels in the non-rich countries will increase, positively impacting the academic quality of the entire event, which in turn will positively impact the discipline. More generally, this strategy will make a new positive contribution to global development.
At our conference, GHG emissions per participant were reduced by 60–70% relative to an equivalent single-location conference. We estimated this by asking participants at registration how they traveled to the conference. For practical reasons at the different hubs, these data were not always collected; where data were missing, we estimated the carbon footprint of each participant by making reasonable assumptions about typical travel patterns. In future, emissions and international time-difference problems could be further reduced by adding more hubs.
The semi-virtual format assumes a network hub structure (not a hierarchy) and places no limit on the number of hubs, just as the internet places no limit on the number of servers in the world. At a semi-virtual conference with many hubs (say, 10–20), each hub would propose its own live program to all the others, after which each hub would choose its virtual program from the offerings of the other hubs. All participants would still have the chance to see any talk virtually, either in real time or later, which is not possible at a conventional conference with parallel sessions.
Colleagues considering a low-GHG conference of this kind may be wary of changing an existing, successful tradition. We were similarly cautious. While preparing for ICMPC15/ESCOM10, we tried out several different technological and logistic solutions. In advance of the conference, we were unsure how our technological solutions would be received by participants. For either of these reasons, we might have given up and returned to a conventional format. But doing so would have delayed a long-overdue reform.
During and after the conference, we noticed that skeptical colleagues became less so. That would be consistent with the psychological finding that acceptance increases with familiarity (cf. Kang & Gretzel, 2012). Like preference for music, preference for an electronic conference format may depend primarily on a combination of complexity (optimum complexity being preferred, neither too simple nor too complex) and familiarity (North & Hargreaves, 1995). “Individuals who have greater familiarity with technology in general, those with higher educational levels, and those who have greater prior experiences are likely to have more positive beliefs about new technologies” (Agarwal & Prasad, 1999, p. 385).
One- and two-way AV communication is not the only way to reduce the CO2 emissions of conferences and improve accessibility for distant, disabled, financially disadvantaged, or otherwise less mobile participants. Another promising approach involves telepresence robots (Neustaedter et al., 2018). This idea was beyond the scope of our 2018 conference due to the additional cost and technological complexity in hardware and software. It is nonetheless a promising avenue to explore. Our multi-location format may in the future be combined with telepresence robots or other emerging technologies to more closely achieve our academic, personal, and environmental goals.
Our experience with ICMPC15/ESCOM10 allows us to make the following predictions for the coming decades. First, low-GHG conferences will become the norm rather than the exception. As the global climate crisis escalates, academics will increasingly reject environmentally damaging, elitist, single-location academic conferences. Instead, they will take the opportunity offered by modern internet communication technology to open up their research traditions to colleagues from non-rich countries and in that way contribute to international development efforts such as the Sustainable Development Goals of the United Nations. Second, live streams and videos will increasingly be regarded as normal forms of academic dissemination, alongside more traditional conference proceedings, peer-reviewed journal articles, book chapters, monographs, and popular media reports. Each kind of dissemination will be seen as having its own special uses and functions. Sometimes it is easier to watch a good video than to read an academic paper. Individual colleagues can only benefit from this additional possibility, which they can either use or ignore as they see fit.
Other conference organizers with similar ambitions can copy our approach or adapt it for their purposes. We will be glad to participate in discussions about adapted or alternative formats.
Appendix
Setting up a YouTube live stream
The following guideline was written in 2018. There will be many detailed changes in coming years, but they are likely to be self-explanatory. The basic principles should remain stable.
Note that YouTube alone will not create a stream that can show both the speaker and the speaker’s screen. The user needs to encode the data and send it to the YouTube server—either with hardware or software such as OBS (see below).
After logging in to your YouTube account, click on your logo icon in the upper right corner. Select “Creator Studio”.
Select “Live Streaming” on the right. Under “Event”, you have the option to schedule streams by clicking on “New Live Event” in the upper right corner. You can now enter the name and starting time of the event, and copy the stream data provided at the bottom of the page to your encoder or OBS.
After going back to “Live Streaming” → “Events”, you can enter the “Live control room”. Here you can monitor the stream in real time. If all options on the encoder or OBS are correct and it is running, the stream should be active.
Using Open Broadcaster Software (OBS)
A live stream must be created in OBS before starting the YouTube stream.
After opening OBS, add two new sources in the middle of the screen: Video Capture Device for the camera and Display Capture for the Powerpoint file.
Move the camera picture to the lower right corner and the display capture to the middle of the screen. We recommend using 4:3 format for the slides: if you position the screen capture correctly, there will be space left for the camera picture. If you are using an HDMI grabber, it will also show up as a source, and you will use it instead of the screen capture.
Now go to Settings in the lower right corner and select Stream. Paste the stream name from the YouTube setup to the Stream Key field.
After hitting Start stream, a green box should appear and the stream will be live.
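The reason for recommending 4:3 slides can be checked with a little arithmetic. The Python sketch below assumes a 1280×720 output frame, a full-height slide aligned to the left edge, and a 16:9 camera picture in the lower right corner. These values and the function name are our own assumptions for illustration, not OBS settings or the exact layout used at the conference.

```python
def layout(frame_w=1280, frame_h=720, slide_aspect=(4, 3), cam_aspect=(16, 9)):
    """Return (x, y, width, height) rectangles for the slide and the camera.

    The slide fills the full frame height at the left edge; the camera
    picture takes the remaining width and sits in the lower right corner.
    """
    slide_w = frame_h * slide_aspect[0] // slide_aspect[1]
    free_w = frame_w - slide_w                      # width left beside the slide
    cam_w = free_w
    cam_h = cam_w * cam_aspect[1] // cam_aspect[0]  # keep the camera aspect ratio
    return {
        "slide": (0, 0, slide_w, frame_h),
        "camera": (slide_w, frame_h - cam_h, cam_w, cam_h),
    }


if __name__ == "__main__":
    print("4:3 slides: ", layout())
    print("16:9 slides:", layout(slide_aspect=(16, 9)))
```

Under these assumptions, 4:3 slides leave a strip 320 pixels wide for the camera picture, whereas 16:9 slides fill the whole frame, so the presenter’s picture would have to cover part of the slide.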
Checklist for technical assistants
Here is what the technician in each live room did before the start of each session at our conference at the Graz hub, with a hardware encoder. Every conference will have a different setup and a different list.
check TACT is on
start Zoom meeting with audience computer and display on big screen
join Zoom on presenter computer
connect to other hub
check presenter computer didn’t join audio, but video is activated
check correct mic inputs and correct output are active
say hello on all 3 mics (audience, chair, presenter) to check the level in the room
wait for your remote colleague to confirm that the level is good on the far end
open Powerpoint on presenter computer
open the first presentation and activate virtual laser-pointer in Powerpoint
display presenter PC on big screen
check streaming link is correct
check video mode is correct on AV mixer: presentation large, camera small
start encoder on tech computer
start YouTube stream on tech computer
open stream in new tab
check the stream, also audio
When it’s time, tell the presenter to start and mute the mic in Zoom
After the talk – before the discussion:
switch to Zoom on audience computer
if the presenter wants to use a screenshare, help
check the second assistant has access to Moodle/YouTube on the tech computer
Acknowledgments
We thank the hub organizers and their assistants for making ICMPC15/ESCOM10 possible: Christine Beckett and Eldad Tsabary (Concordia University, Montréal, Canada), Isabel Cecilia Martínez (Universidad Nacional de La Plata, Argentina), and Emery Schubert (University of New South Wales, Sydney, Australia). The conference was supported financially or institutionally by SEMPRE (Society for Education, Music, and Psychology Research), Land Steiermark (Province of Styria, Austria), University of Graz, UNLP (Universidad Nacional de La Plata), ESCOM (European Society for the Cognitive Sciences of Music), SMPC (Society for Music Perception and Cognition), AMPS (Australian Music Psychology Society), and Österreichische Forschungsgemeinschaft (Austrian Research Association). For the assessment of the conference’s carbon footprint, we thank Jakob Mayer, Wegener Centre for Climate and Global Change, Graz.
Competing interests
The authors have no competing interests to declare.
Author contributions
Concept, conference organization, text: RP
Technological implementation and text revision: NMK
Conference co-organization and text revision: SS