
A Mobile Live Video Learning System for Large-Scale Learning—System Design and Evaluation

Carsten Ullrich
Ruimin Shen
Ren Tong
Xiaohong Tan

Pages: pp. 6-17

Abstract—In China, the number of university students has quadrupled in only six years. How can technology support the access to education of these and future students? In this paper, we describe the mobile live video learning system developed at the Shanghai Jiao Tong University. Motivated by the observation that in developing countries, mobile phones have a much higher penetration rate than laptop and desktop computers, we developed a mobile learning system that streams live lectures to the students' mobile devices. The lectures are held as usual in university, not requiring the costly preparation of specially authored mobile learning materials. The system takes care of compressing the video and audio data efficiently so that it can be live-streamed, while maintaining high visual quality of the slides. Due to the synchronous (live) nature of the system, students can interact with the teacher during the lecture, using a set of preprogrammed interactions that facilitate feedback from mobile devices with limited input facilities. Large-scale evaluations in two lectures with 1000 students each show that students find using the system beneficial. In sum, the mobile live video learning system offers a convenient and cost-effective way of making higher education accessible to large numbers of students.

Index Terms—Learning, mobile computing, mobile learning, mobile multimedia applications, mobile video.


We present the outcome of a project that employs mobile technology to provide live access to video lectures, and thereby to education, to the largest number of citizens possible. We describe the system architecture and the outcome of two large-scale evaluations in two lectures with about 1,000 students each at the Distant College of Shanghai Jiao Tong University (Online-SJTU), an online college with 26,000 students.

Referring to reasons why an institution focuses on mobile learning as identified in [ 14], our motivations for the reported research are to improve access and alignment with institutional aims—in our case triggered by society. In brief, our work investigates the usage of mobile phones as an additional distribution channel: currently, students at Online-SJTU can attend the classroom in person or watch it live streamed via Web and IPTV (and also watch it at a later time).

Often, research that investigates distribution is criticized as "traditional" or even "simplistic." For instance, [ 3] states that mobile learning is "often associated with a simplistic understanding of facilitating learning by delivering instructional content.... This simplistic view ignores the fact that modern education and pedagogy ... converge in their high valuation of active ... learning methods much beyond the absorption of codified knowledge." We are aware that broadcasting content is not optimal for learning—but we argue that sometimes it is necessary to work within existing structures. As [ 26] points out, today's state of education in developing countries leads to challenges that need to be tackled by research in mobile learning, even though they embody a didactic approach: "Mobile learning in these parts of the world is a reaction to different challenges and different limitations—usually those of infrastructure, poverty, distance, or sparsity." Similarly, [ 16] defends an approach that appears to be technology-driven rather than by specific educational needs: "The problem is that the educational needs are so vast...; one risks 'feature scope creep,' or scoping needs that would be met only much further into the future."

So, what precisely are these needs? In China, one goal is to enable access to education to the largest number of citizens possible. In recent years, the Chinese government has invested significantly in tertiary education. Li et al. [ 15] state that since 1999, "the number of undergraduate and graduate students in China has been grown [sic] at approximately 30 percent per year..., and the number of graduates at all levels of higher education in China has approximately quadrupled in the last 6 years." Thus, China's higher-educational institutions have to manage numbers of graduates that grew from 830,000 in 1998 to 3,068,000 in 2005. Africa faces similar challenges. For instance, Nigeria's universities can accommodate only 20 percent of those seeking admission [ 1].

In developing countries, mobile phones are a better suited target device than desktop or laptop computers. There, the penetration rate of mobile phones surpasses that of home computers significantly, as shown by recent figures of the China Internet Network Information Center [ 5]. The July 2008 survey reports 84.7 million computers connected to the Internet (including desktop and laptop computers) compared to 592 million mobile phone numbers (growing at a rate of 18 percent). An increasing number of users access the Internet using mobile phones. Of the 253 million Internet users in China, about a third (84.7 million) surf the Web with their mobile phones, 22.65 million more than in the first half of 2008. The proportion of desktop Internet users is actually dropping compared to the proportion of mobile netizens. This trend holds elsewhere, too. According to the International Telecommunication Union [ 9], in 2007 the fixed broadband penetration rate in Africa was 0.2 percent, compared to a mobile penetration rate of 27 percent.

As a consequence, the usage of mobile devices for learning has been explored in countries such as the Philippines, Mongolia, Egypt, Nigeria, and South Africa (for overviews see [ 17], [ 1], [ 26]). These projects are primarily based on SMS. SMS is a technology that is readily available in mobile networks and does not require high-speed connections, and thus, can be exploited immediately. In contrast, Shanghai and other first-tier cities in China are in a particular position, compared to the fragmented and technology-impoverished countries of Africa but also compared to the western and middle parts of China itself. These cities possess a highly developed infrastructure that includes fixed broadband (ADSL) and advanced mobile networks (GPRS and 3G). The infrastructure is at least as advanced as in Western cities, if not more so. On the other hand, these cities still face the above-described problem typical of developing countries: there is insufficient access to education for large numbers of citizens.

In 1998, the Chinese government initiated the creation of distant universities and colleges, attached to regular universities, 63 of them today in total. Research and projects performed in these universities will influence decisions in third-tier cities all over China. This context results in constraints induced by strategic factors, which include resources (such as time and money), prevalent practices of the involved institutions, and also expectations of staff [ 25]. Specifically, any solution that we want to apply immediately to improve the access to education has to be acceptable to the current institution and teachers without requiring a change of thinking about education.

The contribution of our work is the following. We present a Mobile Live Video Learning System (MLVLS) that builds on top of an existing video lecture environment and enables streaming the lectures to mobile phones. From a technical perspective, the work advances the state of the art of video lectures. It is the first system that streams video lectures to mobile devices (using the GPRS network), using a mixture of existing and self-developed codecs. The usability problem arising from the small screens is addressed by having different views on the video lecture; an approach that does not require any postprocessing. From a strategic point of view, the MLVLS fits in the established and prevalent mode of instruction, and thus, is applicable immediately. It does not require special and potentially costly authoring, besides the usual preparation of the lecture.

The paper is structured as follows: We start by describing related work. In the subsequent section, we describe the required technological prerequisites. Section 4 lays out the MLVLS itself, starting with the use cases that informed its design, followed by an overview of the architecture. Sections 5 and 6 describe the major subsystems of the MLVLS. Then, Section 7 reports evaluation results about students' acceptance of the MLVLS. In the conclusion (Section 8), we take a step back and look at the technical problems resulting from real-life usage and discuss future work.

Related Work

Mobile learning is an active and vast area of research (recent overviews are given in [ 11], [ 13]). In the following, we discuss work done on mobile learning in Asia and Africa, as well as research that has a similar technical focus (classroom-based/video lectures).

The MobilED project (South Africa) [ 6] investigates the usage of mobile audio-Wikipedia, based on SMS and text-to-speech technologies to enable voice-based access and contribution to the Wiki.

Research presented in [ 8] explores the support of postgraduate distance learning students based in Southern Africa. This project involved the development of special courseware stored in the mobile phone (not downloaded live).

The Digital Education Enhancement Project (DEEP) [ 26] took place in primary schools in Egypt and the Eastern Cape, South Africa. There, specifically developed learning materials were stored on handhelds. The paper also discusses a project of the University of Pretoria that explored the use of SMS for academic learning support purposes.

These projects have in common that the infrastructure in which they were running was significantly less advanced than in our setting and relied either on learning materials that were stored on the mobile phone prior to usage or on SMS.

Mobile video learning is technically challenging due to the limited computational power, storage, and bandwidth of mobile devices. Most related research has been done in the domain of learning in museums [ 4], [ 20], [ 31]. In these approaches, videos guiding through an exhibition and explaining details of artworks are stored on the device or broadcast through the local broadband connection, in contrast to the live stream enabled by our work. For those systems that were evaluated, the subjects reported increased satisfaction when presented with videos.

Thornton and Houser [ 24] present one of the few studies on mobile learning that involve video. They prepared 15-second videos illustrating English idioms presented to Japanese students. In their evaluation, 31 college sophomores spent 10 minutes looking through the learning materials and then evaluated various aspects of the system, including the videos, educational effectiveness, and overall reaction. The authors report general satisfaction of their subjects and only few complaints about the limited screen size of the devices.

A similar study was performed by [ 21]. They developed learning materials for downloading but also a course for mobile access with PDAs. In their trials (9 and 18 users), subjects reported difficulty with multimedia materials that included fine details.

These technical difficulties are also stressed by [ 12], [ 24]: "the Achilles heel of video and 3D animation is the time they require to prepare."

While in an ideal world, it would be possible to produce specially designed learning materials for mobile devices, in practice it is often simply impossible due to time and resource constraints, especially if the foremost goal is to enable access to education to large parts of the population as fast as possible.

Basic Technology

This section describes the technological foundations of the MLVLS system design. We start by discussing the development platform, followed by an overview of GPRS, which was used as the streaming medium for the video lectures. Finally, we present the employed codecs, that is, the software used for compressing the video and audio signal to reduce bandwidth.

3.1 Mobile Usability

Developing for a mobile setting brings forth its own challenges. The limited size of the mobile devices enables portability but has consequences for usability. The small screen size, compared to laptop/desktop computers, is often quoted as making reading and viewing information difficult, even though this is not automatically the case [ 12]. In results reported by [ 24], the limitations depended on the modality: text was read fine, but details got lost in images and video. This needs to be taken into account when transmitting visual information such as slides. In any case, designing learning materials specifically for mobile devices is a difficult and resource-intensive task [ 2]. In our setting, where we aim at a wide adoption, it was not an option to require the teachers to author additional learning material just for the mobile learners. In such a case, only very few teachers would use the MLVLS. We, therefore, opted for an approach that streams the standard video lectures to the mobile devices. To address readability issues, the system transmits three different views to the clients (shown in Fig. 1):

  • Large resolution view of the teaching screen (slide-view). This view typically displays the slides, and thus, needs to be clearly readable. We decided to broadcast this screen in $320\times 240$ px, a resolution higher than the typical screen resolution of a mobile device (usually $176\times 144$ px). Within this screen, the user can zoom out to get a complete overview and zoom in to see details.
  • Slide- and teacher-view. In this view, the user sees a close-up of the lecturer superimposed on the teaching screen ( $64\times 80$ px). Thus, the student sees both the learning material as well as the teacher explaining it.
  • Teacher-view. The teacher-view consists of a larger resolution view of the lecturer ( $176\times 144$ px).

3.2 Client Platform

The proliferation of platforms and mobile device models makes developing mobile learning software difficult. We chose the market leader, Symbian OS.

Graphic: Fig. 1. The three views of the live lecture: (a) slide-view, (b) slide and teacher-view, and (c) teacher-view.


In 2007, when the MLVLS was designed and implemented, the Symbian OS had a market share of almost 70 percent. In the first quarter of 2009, Symbian OS is still the market leader, although its share has declined to about 50 percent due to increasing competition from RIM (Blackberry, 20 percent) and the iPhone (10 percent) [ 23].

Symbian is a real-time, multitasking, 32-bit operating system, with low power consumption and memory footprint. It is a stable and mature system, with support for all existing wireless data communication protocols. For these reasons, we selected it as the operating system platform for our mobile learning software.

3.3 The GPRS Network

GPRS (General Packet Radio Service) is a mobile data service used for mobile Internet access. In contrast to the original GSM dial-up access, which is based on circuit-switched data transmission, GPRS is a packet-switching technology, which provides relatively fast transmission times and offers highly efficient, low-cost wireless data services.

3.4 Reused Video and Audio Codecs

Codecs compress and decompress digital data. The purpose of video/audio codecs is to reduce the size of the data, and thus the bandwidth load, while retaining sufficient information for the data to be understood by its consumers. Compression is especially important in a mobile context since mobile transmission channels still have limited bandwidth compared to desktop Internet connections.

Analogue video information (a film reel) consists of a number of still pictures. In the digital world, the still pictures are called frames. Video compression relies on the fact that often only parts of the image change between two frames and provides means to encode only the differences up to a certain point, when again a complete still image is used. Those frames that are represented only by these differences are called inner frames.
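The principle of alternating complete still images with difference-only frames can be illustrated with a minimal sketch. It is not H.264 and not the codec used in the MLVLS: frames are modeled as equally long flat lists of pixel values, and the names `encode_stream`, `decode_stream`, and the parameter `keyframe_interval` are assumptions made purely for illustration.

```python
def encode_stream(frames, keyframe_interval=3):
    """Encode frames as complete keyframes plus sparse per-pixel differences.

    Illustrative only: every `keyframe_interval`-th frame is stored in full,
    all other frames store just the pixels that changed.
    """
    encoded = []
    previous = None
    for index, frame in enumerate(frames):
        if previous is None or index % keyframe_interval == 0:
            # A complete still image.
            encoded.append(("key", list(frame)))
        else:
            # Only the positions whose value differs from the previous frame.
            changes = {
                i: new
                for i, (old, new) in enumerate(zip(previous, frame))
                if old != new
            }
            encoded.append(("diff", changes))
        previous = frame
    return encoded


def decode_stream(encoded):
    """Rebuild the full frames from keyframes and differences."""
    frames = []
    current = None
    for kind, payload in encoded:
        if kind == "key":
            current = list(payload)
        else:
            current = list(current)
            for i, value in payload.items():
                current[i] = value
        frames.append(current)
    return frames
```

In this toy model, a frame in which a single pixel changes is represented by one dictionary entry instead of the whole image, which is where the bandwidth saving comes from.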

H.264 is a widely used, highly efficient, but also highly complex digital video codec standard [ 29], [ 10]. It is more efficient than its predecessor H.263++ and particularly attractive due to its wide application range, which caters to different speeds, different resolutions, and different networks. The standard has the following features:

  • Low bit rate: compared to MPEG-2 and MPEG-4, the size of data compressed with H.264 is 1/8 of that of MPEG-2 and 1/3 of MPEG-4, with similar image quality. Thus, using H.264 reduces download time and data transfer costs.
  • High quality: standard video data processed with H.264 has high quality when played back.
  • Strong fault-tolerance: H.264 deals efficiently with errors due to unstable network environments, such as packet loss.
  • Strong network adaptation ability: H.264 integrates a network adaptation layer that simplifies the transmission of H.264 documents in different networks.

Even though audio information requires less space than video documents, it is still very large and requires compression, too. The standard MPEG-4 aacPlus [ 30] combines three kinds of technologies: Advanced Audio Coding (AAC), Spectral Band Replication of Coding Technologies (SBR), and Parametric Stereo (PS). AAC is a lossy compression scheme for audio information that achieves a better sound quality than the commonly used mp3. SBR is a bandwidth extension technique that encodes audio information with the same quality as the original but requires less than half of the original bit rate. PS greatly improves the efficiency of low-bit-rate stereo signals. SBR and PS maintain backward and forward compatibility, that is, they can be played by old players and will remain playable by players that have not yet been built. aacPlus is the newest member of the MPEG-4 audio technology family and represents today's state of the art of low-bit-rate open standards for audio codecs. We, therefore, decided to use this standard for the MLVLS.

3.5 Slide Compression

In addition to presenting a video of the lecturer, our mobile video lectures contain a video of the slides, including all interactions with the slides, such as drawing and highlighting. In our context, the slide-view contains the visual information of the video lecture that needs to be visible in fine detail. Since existing codecs were not designed for this specific task, we had to develop our own techniques to display a clearly readable video on the small screen of the mobile device (usually $176\times 208$ px). Our codec exploits the following differences between the slide-view and the usual content of videos:

  • It consists of slides, which contain primarily textual information and less complex objects than a movie.
  • Navigation in slides (resulting in changes in the displayed information) can go forwards and backwards.
  • Screen changes other than slide navigation are usually partial changes (e.g., mouse movements, dragging of windows).
  • Frames need to be sampled at a high rate in order to capture quickly occurring changes such as handwriting and slide switching.

We developed a codec that specifically deals with this kind of video information. The main idea of the algorithm is to combine lossless and lossy codecs: text, windows, and other simple information are encoded losslessly, while image and audio information is encoded with a lossy codec. Inner frames use motion compensation technology for encoding the differences between frames and skip redundant data. Additional bandwidth is saved by exploiting the fact that navigation through slides also goes backwards. A history database stores the information for each slide and reuses it if the teacher goes back to an earlier slide and then forward again. A special encoding captures mouse movements since the slide content does not change when the mouse is moved.
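Two of these ideas, the history database and the separate encoding of mouse movements, can be sketched as follows. This is a simplified illustration under stated assumptions, not the codec itself: it ignores the lossless/lossy split and motion compensation, identifies a slide by a hash of its raw pixel bytes, and uses hypothetical message names (`full`, `recall`, `cursor`, `skip`) and a hypothetical class `SlideStreamEncoder`.

```python
import hashlib


class SlideStreamEncoder:
    """Toy encoder that avoids resending slide content the client already has."""

    def __init__(self):
        self.history = {}        # content digest -> slide id already transmitted
        self.last_digest = None  # digest of the slide currently shown
        self.last_cursor = None  # last transmitted mouse position

    def encode(self, slide_pixels: bytes, cursor):
        digest = hashlib.sha256(slide_pixels).hexdigest()

        if digest == self.last_digest:
            # Slide content is unchanged: at most the mouse has moved.
            if cursor == self.last_cursor:
                return ("skip",)
            self.last_cursor = cursor
            return ("cursor", cursor)

        self.last_digest = digest
        self.last_cursor = cursor

        if digest in self.history:
            # The teacher navigated back (or forward again) to a known slide:
            # send only a reference to the stored slide.
            return ("recall", self.history[digest], cursor)

        # A slide seen for the first time: send it in full and remember it.
        slide_id = len(self.history)
        self.history[digest] = slide_id
        return ("full", slide_id, slide_pixels, cursor)
```

In this sketch, returning to an earlier slide costs a short `recall` message rather than a retransmission of the slide, and pure mouse movement costs a `cursor` message that carries only the new position.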

System Architecture

4.1 Context and Use Cases

The MLVLS was developed for the online college of Shanghai Jiao Tong University (Online-SJTU). The students at Online-SJTU are vocational learners who come from varying social and educational backgrounds. Some have just finished high school or a college while others have already been working for several years. The class size at Online-SJTU varies from 50 to 3,000 students, which is not uncommon for a densely populated city like Shanghai. Lectures are held in the evenings after work and on weekends. Students can attend the lecture in the classroom, and they can also watch the lecture live online using a Web browser and a broadband connection. Being synchronous, live lectures offer the advantage that students can directly interact with the teacher in case problems arise. While the lectures can be attended from everywhere in China, most of the students are from Shanghai. A survey among the students has shown that the possibility to come to the classroom in person was a decisive factor when selecting their online college. One reason for this might be the teacher-centered view on learning still prevalent in the Confucian culture of China [ 32].

The guiding principle of the research at Online-SJTU is to realize and support a "standard natural classroom" [ 22]. The standard natural classroom supports the teachers by technology, which adapts to them, not vice versa. This means that lecturers can follow their usual teaching procedure: they author slides, present them in the lecture, and can freely write on the slides via a touch screen. Then, the lectures are recorded, compressed, and distributed to the students over different connections (ADSL, IPTV, and mobile network). The slides, including handwritten comments and drawings, are also recorded and made available to the students. In short, the teachers follow their usual procedure, supported and enhanced by technology.

This approach allows a very cost-effective production of up-to-date learning materials. Teachers are not required to learn any technical skills in addition to those necessary for offline lectures. We admit that these online learning materials might not make optimal use of the potential that the Web offers as a medium. However, in the current situation, with the number of students having quadrupled in only six years, it is simply not feasible for teachers to spend time on authoring Web-based learning materials.

The goal of the MLVLS was to open an additional distribution channel. Making the live lectures accessible on mobile phones enables an even greater percentage of students to access the lectures if they are unable to attend the classroom in person or do not have a computer or laptop at hand.

We identified three categories of users of the MLVLS: teachers, students, and administrators.

  • Teachers. The MLVLS needs to support and transmit the various activities of the teachers. Several of our teachers, especially the mathematics teachers, expressed the wish to be able to construct formulas step by step using handwriting. The slides together with the handwritten information are the most relevant visual information during the lecture, more important than the view of the lecturer himself, and thus, need to be transmitted in high quality. Additionally, the teachers need to be able to receive feedback from students not present in the classroom. This includes active feedback such as asking questions, but ideally the teacher should also be able to see the current actions of the student, in a similar way as he sees those of the students present in the classroom.
  • Students accessing the MLVLS need to be able to see which lectures are currently available (the lecture timetable) and to select the one they want to attend. While watching the lecture, they should be able to focus on the slides and to zoom in on the lecturer. They need to be able to give quick feedback regarding the overall quality of the course, more precisely the teacher's speaking speed, handwriting readability, and transmission quality. If required, they should be able to send more detailed feedback, questions, and contributions.
  • The administrators are responsible for the maintenance of the servers, ensure that the lectures are properly recorded, and administer the school's timetable. They take care of all technical details, so that the teachers can work according to the standard natural classroom paradigm and focus on their teaching.

These user roles informed the subsequent design and implementation of the MLVLS.

4.2 Overview of the MLVLS Architecture

The main functionality of the MLVLS can be separated into two different subsystems. The mobile phone broadcasting system is responsible for broadcasting the live lectures to the mobile phones. The classroom management system takes care of the students' interactions with the lectures. We call this view of the MLVLS, which separates the two subsystems, the logical architecture of the MLVLS.

This logical architecture is instantiated in a physical structure, which can be divided into the central server, classrooms, and mobile devices. The logical and physical architecture is shown in Fig. 2.

Graphic: Fig. 2. Overview on the MLVLS.


In brief, the interactions between the different subsystems and their components are as follows: The mobile phone broadcasting system (illustrated in the figure by the light arrows) transmits the lectures in real time from the classroom to the students' mobile devices. Each classroom has an instructor station that consists of two screens. The teaching screen contains the information that is shown to the students (projected on a screen in the classroom and broadcast to the mobile devices) and usually contains the instructor's slides. The teaching screen is touch-sensitive. Using a stylus, the instructor can draw and write freely on whatever is displayed on the screen. The second screen, called the feedback screen, facilitates direct communication from the students to the teacher. It is only visible to the teacher and contains text messages sent by students, results of polls, and mirrored views of the screens of the mobile phones of those students watching the lecture's live stream. The events in the classroom are captured by a camera that records the teacher's upper body, focusing on his facial expressions, and by a microphone that records the audio information. Additionally, the teaching screen is recorded. The data is compressed by the classroom recorder and forwarded to the broadcasting server, which distributes it to the client viewers on the students' devices.

The system transmits the three different views to the clients (slide-view, slide and teacher-view, and teacher-view).

The second logical subsystem, the classroom management system, manages the students' interactions with the lectures (illustrated in Fig. 2 by the dark arrows). The curriculum schedule management system contains the list of scheduled and ongoing lectures. Students can access this system to download the current timetable to their phone and to start live-streaming a lecture. During a live-lecture, they can use the polling system and SMS interaction system to provide feedback via text messages and a number of predefined, quickly accessible polls. At the same time, the client screen monitoring system connects the teachers to the students watching the class. In the current implementation, teachers actually see a mirrored image of the screen of the student's phone. The rationale is to provide teachers with as much information as possible about those students not located in the classroom. If necessary, teachers can even take control of the student's phone.
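The predefined polls can be pictured as a small tally kept on the server side and summarized on the teacher's feedback screen. The following sketch is an assumption about how such a tally could look, not a description of the implemented polling system: the three poll topics follow the student requirements listed in Section 4.1, while the answer options, the class `PollTally`, and its methods are hypothetical.

```python
from collections import Counter

# Poll topics taken from the requirements; the answer options are invented.
POLLS = {
    "speaking_speed": ("too slow", "fine", "too fast"),
    "handwriting_readability": ("unreadable", "readable"),
    "transmission_quality": ("poor", "acceptable", "good"),
}


class PollTally:
    """Collects one current answer per student and poll."""

    def __init__(self, polls=POLLS):
        self.polls = polls
        # poll name -> {student id -> index of the chosen option}
        self.votes = {name: {} for name in polls}

    def vote(self, poll, student_id, option_index):
        options = self.polls[poll]
        if not 0 <= option_index < len(options):
            raise ValueError("unknown option")
        # A later vote by the same student replaces the earlier one.
        self.votes[poll][student_id] = option_index

    def summary(self, poll):
        counts = Counter(self.votes[poll].values())
        return {
            option: counts.get(index, 0)
            for index, option in enumerate(self.polls[poll])
        }
```

Because a vote is just a poll name and an option index, it can be cast with a single key press, which suits devices with limited input facilities.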

4.3 Physical Locations of the Servers

The physical locations of the servers need to be carefully selected to achieve sufficiently high transmission rates. The connection between the classroom servers and the central servers should take place through a cable broadband connection (since cable offers the highest transmission rate), while the connection to the mobile devices must happen via the mobile GPRS network. The location of the servers needs to balance between both networks. In our case, the classroom recorders are located within the China Education and Research Network (CERNET). The connection to the GPRS network is performed in the China Mobile network. The central servers thus can be placed within CERNET or the China Mobile network, but also within the third large provider in Shanghai, the Shanghai Telecom network. In the first case, the bandwidth between classroom and central servers is sufficiently high, but access to the GPRS network is slow. In the two latter cases, it is the inverse: GPRS access is good, but access to classroom recorders is not.

We, therefore, had to perform extensive network speed testing. In different areas of Shanghai, we connected computers to the Internet using the GPRS connections of mobile phones. We then tested the connection speed with the broadcasting server being located in the China Mobile network, CERNET, and the Shanghai Telecom network, respectively. According to the data, the best connections to the broadcasting server were established when it was located in the China Mobile network. However, the connection performance to the classroom servers located in CERNET was not sufficient and varied in quality. Due to budget limitations, we were unable to rent a private line connecting the servers (with a guaranteed good performance), and thus, decided to place the broadcasting server within CERNET.

Mobile Phone Broadcasting System

The mobile phone broadcasting system transmits the video and audio data recorded in the classroom to the students' mobile devices in real time. Classroom recorders encode the data, the broadcasting server transmits it to the mobile phone, and the client viewer enables the students to view the video.

More precisely, the interactions between these components are as follows: an administrator configures the classroom recorders by setting the IP address of the broadcasting server and the virtual classroom number of the classroom, and then registers the virtual classroom at the broadcasting server. From then on, the classroom recorder transmits the encoded classroom multimedia streams to the server. The students' client connects to the network using GPRS. Once the connection to the broadcasting server is established, the client sends a login message for the classroom. If the virtual classroom is active, the broadcasting server first sends the student's client the initial information required to set up the connection. After the client has initialized its player accordingly, it starts to accept the multimedia streams, decodes them, and displays them on the mobile phone.
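The sequence of registration, login, initialization, and streaming can be summarized in a minimal sketch. It models the messages as plain method calls rather than network traffic, and the class names, method names, state labels, and the behavior for an inactive classroom (the client is simply rejected) are assumptions for illustration; the paper does not specify them.

```python
class BroadcastingServer:
    """Keeps the registered virtual classrooms and their setup information."""

    def __init__(self):
        self.classrooms = {}  # virtual classroom number -> initial information

    def register_classroom(self, number, init_info):
        # Performed by the administrator when configuring a classroom recorder.
        self.classrooms[number] = dict(init_info)

    def login(self, number):
        # Returns the initial information if the classroom is active, else None.
        return self.classrooms.get(number)


class Client:
    """Student-side viewer with a simple connection state."""

    def __init__(self):
        self.state = "disconnected"
        self.player_config = None

    def join(self, server, number):
        self.state = "connected"           # network connection established
        init_info = server.login(number)   # login message for the classroom
        if init_info is None:
            self.state = "rejected"        # assumed handling of inactive rooms
            return False
        self.player_config = init_info     # initialize the player
        self.state = "streaming"           # ready to accept and decode streams
        return True
```

A usage example with invented values: after `server.register_classroom(7, {"video_codec": "H.264", "audio_codec": "aacPlus", "video_size": (176, 144)})`, a call to `Client().join(server, 7)` succeeds, whereas joining an unregistered classroom number does not.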

We now take a close look at these three components.

5.1 Broadcasting Server

The broadcasting server is the core of the MLVLS. The server receives the compressed video and audio data and streams the selected view to the student. Students connect to this server, and thus, it needs to be accessible by a public IP address.

For $n$ real classrooms, the broadcasting server needs to support $n\times 2$ virtual classrooms, since each classroom transmits two streams: the slide-view and small video ( $64\times 80$ px) including audio, and a larger video ( $176\times 144$ px). The system currently supports 20 classrooms, the number of classrooms at the Online-SJTU. In the current implementation, we expect it to scale to up to 40 classrooms.
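As a worked example of this relation, the following sketch enumerates the virtual classrooms for a given number of real classrooms. The function name and the dictionary layout are illustrative assumptions; only the two-streams-per-classroom rule and the resolutions are taken from the text.

```python
def virtual_classrooms(n_real):
    """List the two streams that each real classroom contributes."""
    streams = []
    for room in range(1, n_real + 1):
        # Stream 1: slide-view plus small teacher video, including audio.
        streams.append({
            "room": room,
            "content": "slide-view + small video + audio",
            "slide_size": (320, 240),
            "video_size": (64, 80),
        })
        # Stream 2: larger teacher video.
        streams.append({
            "room": room,
            "content": "large video",
            "video_size": (176, 144),
        })
    return streams


current = len(virtual_classrooms(20))   # 20 classrooms at Online-SJTU
expected = len(virtual_classrooms(40))  # anticipated upper limit
```

For the current 20 classrooms this gives 40 virtual classrooms, and 80 for the anticipated limit of 40 classrooms.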

The components of the broadcasting server are the administration interface, the virtual classroom manager, the socket connection manager, and the data processing and transmission component.

  • The administration interface enables the administrators to control the technical setup of the MLVLS. They can start and stop the broadcasting server, and see the number of students currently attending a lecture.
  • The role of the virtual classroom manager is to administer the live lectures. It keeps track of which students are virtual members of a class (i.e., watching the live stream), manages interactions between teacher and students, and stores administrational data, such as the number of online students and online time of each classroom.
  • The socket connection manager handles students who have not yet entered a virtual classroom. Once a connection between the classroom and the student is established, the socket connection manager hands over control to the virtual classroom manager. Before the connection to the decoder of the client viewer can be initiated, the socket connection manager has to send the information required to initialize the decoder. This includes the employed audio and video codecs (since the codec used in the classroom is configurable) and the video sizes.
  • The data processing and transmission component broadcasts to the students the audio and the selected video channels.
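The handover between the socket connection manager and the virtual classroom manager can be illustrated with a minimal sketch. It is not the MLVLS implementation: the class names, method names, and codec labels are hypothetical, and only the video size is taken from the text above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DecoderInit:
    """Information the client needs before it can start its decoder."""
    audio_codec: str   # placeholder label, not an actual MLVLS codec name
    video_codec: str   # placeholder label, not an actual MLVLS codec name
    video_size: tuple  # (width, height) in pixels


@dataclass
class VirtualClassroomManager:
    """Tracks which students are currently watching which virtual classroom."""
    members: dict = field(default_factory=dict)  # classroom number -> student ids

    def admit(self, classroom: int, student: str) -> None:
        self.members.setdefault(classroom, set()).add(student)

    def online_count(self, classroom: int) -> int:
        return len(self.members.get(classroom, set()))


@dataclass
class SocketConnectionManager:
    """Handles students who have not yet entered a virtual classroom."""
    classrooms: dict  # classroom number -> DecoderInit, active classrooms only
    vcm: VirtualClassroomManager
    pending: set = field(default_factory=set)

    def connect(self, student: str) -> None:
        self.pending.add(student)

    def login(self, student: str, classroom: int):
        """Return the decoder init info and hand the student over, or None."""
        init = self.classrooms.get(classroom)
        if student not in self.pending or init is None:
            return None  # unknown connection or inactive classroom
        self.pending.discard(student)
        self.vcm.admit(classroom, student)  # control passes to the classroom manager
        return init
```

In this sketch a student stays in the pending set until a login to an active classroom succeeds, which mirrors the division of responsibilities described in the list above.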

5.2 Classroom Recorders

The classroom recorders (one for each classroom) record the lectures, compress the data, and manage the teaching screens. The technical setup is performed by the administrators. They configure the codecs described in Section 3.4 (by setting the specific parameters), and connect the classroom recorder to the broadcasting server.

In detail, the classroom recorders consist of the following modules:

  • The administrators use the management module to connect a classroom recorder to the broadcasting server and to start and stop recordings.
  • Using the parameter module, an administrator configures the employed video and audio codecs and their parameters. Although the codecs described above are currently used, keeping the selection configurable allows accommodating different devices and networks (for instance, this year the next generation of mobile networks, called 3G, was introduced in Shanghai).
  • The slide-view encoding module captures the teaching materials shown in the lecture, usually the slides and the handwritten comments and highlighting done by the teacher.
  • The mouse encoding module captures the teacher's mouse movements.
  • The video capture module captures and compresses the video taken of the lecturer.
  • The audio capture module captures and compresses the audio of the teacher.
  • The sync module buffers the video and audio data.
  • The transmission module reads the data from the buffer and sends it to the broadcasting server.
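The interplay of the sync module and the transmission module can be sketched as a small buffer that is drained by a sender. This is only an illustration under assumptions: the text states that the data is buffered, while the bounded capacity, the drop-oldest policy, and the ordering by timestamp are choices made for this example, and all names are hypothetical.

```python
from collections import deque


class SyncBuffer:
    """Buffers encoded audio and video frames until they are transmitted."""

    def __init__(self, capacity=256):
        # Assumption: when the buffer is full, the oldest frame is dropped.
        self._frames = deque(maxlen=capacity)

    def put(self, kind, timestamp_ms, payload):
        self._frames.append((timestamp_ms, kind, payload))

    def drain(self):
        """Return all buffered frames ordered by timestamp and empty the buffer."""
        frames = sorted(self._frames, key=lambda frame: frame[0])
        self._frames.clear()
        return frames


def transmit(buffer, send):
    """Read frames from the buffer and pass each one to `send`; return the count."""
    count = 0
    for frame in buffer.drain():
        send(frame)
        count += 1
    return count
```

Here `send` stands in for whatever routine delivers a frame to the broadcasting server; in a test it can simply be the `append` method of a list.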

5.3 Client Viewers

The Mobile Learning Client Viewer allows a student to select and view a live lecture. Students need to install the program once on the phone. It runs on any Symbian S60 smartphone and was designed to operate in the same way as any other mobile phone application; it is therefore consistent with the Symbian usability guidelines [18].

Through the client program, students log in to the curriculum schedule management system (for a detailed discussion on this component, see Section 6.1). There, they inspect the curriculum schedule for that day, navigate through the list of classes that are in progress and choose which class to virtually attend.

After students have connected to a class, the client program periodically sends screen captures to the instructor. This mechanism enables a teacher to supervise the individual students' learning behavior, if required. Students can communicate with the teacher by sending short messages directly from the viewer using the cell phone's Short Message Service (SMS). The students' messages are displayed on the instructor's feedback screen to inform the instructor about their learning progress, questions, and any other feedback. Students can also take part in polls and activities started by the teacher. The polls assess various aspects of the teaching, namely pace, clarity, and audio quality. The polling system generates the poll results and renders them in real time on the instructor's feedback screen so that the instructor can adjust and improve the instruction.

Technically, the client viewer is divided into the live module and the interactive module. The live module handles the reception of the data, decodes and displays it. Included in the live module is the video player, which can play streamed video but also downloaded video lectures. Combining multiple streams within the comparably modest operating resources that a mobile phone offers is a challenge in itself. For this technical feat, our player received an award at the Nokia Open C challenge [19].

The interactive module connects the student to the curriculum schedule management system, provides the polling and SMS feedback and the transmission of the mobile's screenshot to the teacher.

Classroom Management System

The mobile phone broadcasting system pushes educational resources to the student. In contrast, the role of the classroom management system is to handle the administrative details of the virtual classrooms and to provide an upstream channel of communication from the student to the teacher. The technical constraints of mobile devices (limited upload bandwidth, time consuming text input, etc.) make this upstream communication difficult. We, therefore, designed easily usable communication tools that correspond to a number of frequently occurring participatory classroom events.

The classroom management system consists of four major subsystems: the curriculum schedule management system, the student screen capturing system, the polling system, and the SMS interaction system.

6.1 Curriculum Schedule Management System

After students log in to the MLVLS using the client viewer, they interact with the curriculum schedule management system to select the lecture to attend virtually. Due to the limited bandwidth and speed of the GPRS network, the transfer of the curriculum information needs to minimize the transferred information, e.g., by not sending duplicate or unnecessary information. In particular, for the design of the system, we investigated the following questions:

  • Which curriculum information will be stored on the phone?
  • How are timetable updates propagated to the phone?
  • How to avoid duplicated downloads in case the student logs out and in repeatedly?

Downloading the complete curriculum information on every login requires too much time and data volume. Therefore, students receive only the daily schedule, which is automatically generated from the overall course database every midnight. Additionally, the schedule contains the current date and a revision counter, which is automatically increased each time the schedule is changed. Once the student connects to the server, the server sends the current date and revision number of the schedule to the client. If the server date and the date in the client-side schedule differ, then the client will download the current schedule. The download also happens if the dates are equal but the revision number differs. We deliberately store the date information in the schedule and do not use the date information of the mobile device for two reasons: first, the client date information can be incorrect, and second, time zone differences between server and client might result in different dates.

As a result of this procedure, the student always has access to up-to-date curriculum schedule information without repeated download of an unchanged schedule. This procedure could be further refined by adding version information to each individual lecture description, which however, would lead to increased communication overhead (since the version information needs to be sent to the client).
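The date and revision check described in this section can be summarized in a short sketch. The data structure and function names are hypothetical; only the decision rule (download when the dates differ, or when the dates match but the revision numbers differ) is taken from the description above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Schedule:
    """Daily schedule as cached on the phone."""
    day: date        # date stored in the schedule itself, never the phone's clock
    revision: int    # increased on the server each time the schedule changes
    lectures: tuple  # lecture descriptions for that day


def needs_download(server_day: date, server_revision: int,
                   cached: Optional[Schedule]) -> bool:
    """Decide whether the client must fetch the schedule from the server."""
    if cached is None:
        return True  # nothing stored yet (assumption: first login)
    if cached.day != server_day:
        return True  # cached schedule belongs to a different day
    return cached.revision != server_revision  # same day, but schedule changed
```

Because the comparison uses only the date and revision sent by the server, a wrong clock or a different time zone on the phone does not affect the result.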

6.2 Client Screen Monitoring System

A challenge for live mobile video instruction is the limited information the instructor has about the virtually attending students. In laptop- or desktop-based virtual learning, one can envision that a Web cam captures the student and provides visual information to the teacher. This is hard to achieve with a mobile device, since even if the phone had a camera, it would require that the student flips the phone backwards (for most devices), thus, making it impossible to view the lecture. However, it is possible to capture at least some information about the student's activities by making captures of the student's mobile screen and sending them to the teacher. Due to obvious privacy concerns, the student is informed of this feature and can activate and deactivate it any time.

If the feature is activated, a program running in the background sends a screenshot to the teacher console every few seconds. The console displays a screenshot of each mobile student watching the lecture. The teacher has several ways to interact with these students. He can send a text message to an individual student and, if necessary, can also take control of the phone. A virtual console emulates the phone controls on the feedback screen, which the teacher can use by clicking on them.

Technically, these interactions become possible due to the fact that the Symbian operating system supports multithreading (in contrast to the iPhone).

Fig. 3 shows an example of the client screen monitoring system. In the figure, four student screens are shown to the teacher (on the left). The three students on the top watch the class, using different views. The fourth student is playing a game. The teacher has focused on this student's phone and can communicate and control it using the virtual console on the right. In this case, the teacher has written a text message that will be shown to the student.

Graphic: Fig. 3. A screenshot of the teacher-only interface of the client screen monitoring system.

6.3 Polling System

The polling system offers the mobile students a quick and convenient channel to give feedback about the most relevant and frequent problems occurring during a lecture. In particular, these are

  • the teaching speed, which can be too fast, good, or too slow;
  • the teacher's handwriting, which can be illegible, clear, or average;
  • the teacher's voice, which can be too loud, good, or too quiet.

Students can vote on these issues with a single keystroke. The information is sent to the teacher console, where three bar charts present the aggregated information. The charts are updated every 30 seconds. Fig. 4 shows the polling system on the student's mobile device and Fig. 5 the results as they are displayed to the lecturer.

Graphic: Fig. 4. Students' view on the polling system for feedback on voice. Option one says "too loud," option 2 "ok," and option 3 "too silent."

Graphic: Fig. 5. View of the polling results showing students' feedback on teaching speed (first three columns), handwriting (second three columns), and voice (last three columns). The heading says "Polling results for classroom 503," while the legends of the three columns say "speed too quick, ok, too slow," "handwriting illegible, clear, or average," and voice "too low, ok, too loud," respectively.
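The aggregation behind the three bar charts can be sketched as follows. The option labels follow the list in this section; the class and method names are hypothetical, and counting only the most recent vote of each student per question is an assumption made for this example.

```python
from collections import Counter

# Questions and their three options, in the order of keys 1, 2, 3.
OPTIONS = {
    "speed": ("too fast", "good", "too slow"),
    "handwriting": ("illegible", "clear", "average"),
    "voice": ("too loud", "good", "too quiet"),
}


class PollTally:
    """Collects single-keystroke votes and aggregates them per question."""

    def __init__(self):
        # question -> {student id: index of the chosen option}
        self._votes = {question: {} for question in OPTIONS}

    def vote(self, student, question, key):
        """Record a vote; `key` is the pressed digit 1, 2, or 3."""
        if question not in OPTIONS or key not in (1, 2, 3):
            raise ValueError("unknown question or key")
        self._votes[question][student] = key - 1  # a later vote replaces an earlier one

    def snapshot(self):
        """Return the counts per option, e.g. for redrawing the bar charts."""
        result = {}
        for question, labels in OPTIONS.items():
            counts = Counter(self._votes[question].values())
            result[question] = {label: counts.get(i, 0)
                                for i, label in enumerate(labels)}
        return result
```

A teacher console could call `snapshot()` on a timer, for instance every 30 seconds, and redraw the charts from the returned counts.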

6.4 SMS Interaction System

If, during the lecture, a question or problem arises that cannot be resolved through the polling system, mobile students require an additional means to communicate with the teacher. In the MLVLS, students can compose an SMS using the client viewer. The SMS is automatically addressed to the MLVLS service center and the message text starts with an identifier of the student's current classroom. The SMS interaction system stores all received SMS messages in a database and every 30 seconds pushes the newly received messages to the teacher's feedback screen. Obviously, due to the limited time available to the instructors, they are not expected to address each and every SMS in a class of 3,000 students. However, according to our experience, the communication in a regular class of about 200 students is manageable. There, the SMS messages enable the teacher to get an understanding of the problems of the mobile students. Important questions are addressed immediately; the remaining ones can be answered after class.
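The store-and-push cycle of the SMS interaction system can be sketched with an in-memory SQLite database. This is an illustration, not the deployed system: the table layout, the function names, and the message format (classroom identifier, a space, then the text) are assumptions, since the paper only states that the message text starts with the classroom identifier.

```python
import sqlite3


def open_store():
    """Create an in-memory message store (a real system would persist it)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE sms ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "classroom TEXT, sender TEXT, body TEXT)"
    )
    return db


def receive(db, sender, text):
    """Store an incoming SMS; assumed format: '<classroom id> <message text>'."""
    classroom, _, body = text.partition(" ")
    db.execute(
        "INSERT INTO sms (classroom, sender, body) VALUES (?, ?, ?)",
        (classroom, sender, body),
    )
    db.commit()


def fetch_new(db, classroom, last_seen_id):
    """Return messages for a classroom newer than `last_seen_id`, plus the new marker."""
    rows = db.execute(
        "SELECT id, sender, body FROM sms "
        "WHERE classroom = ? AND id > ? ORDER BY id",
        (classroom, last_seen_id),
    ).fetchall()
    new_last = rows[-1][0] if rows else last_seen_id
    return rows, new_last
```

Calling `fetch_new` every 30 seconds with the marker returned by the previous call yields exactly the newly received messages for the teacher's feedback screen.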


The MLVLS was evaluated in two large classes, a computer science and an English course both taught at Online-SJTU and attended by more than 1,000 students.

Evaluations in e-learning, in particular in mobile learning, are a difficult task [27], [11]. In particular, in our context, the students were paying customers who attend the lectures in order to receive a degree. We decided that it was not ethical in the context of this work to divide the students into an experimental and a control group, where the former has access to additional technology and the latter has not. This might have been possible for a short experimental setting but not for the semester-long studies that we performed. Additional difficulties arose because there was no clear-cut distinction between online (either using the standard access or the MLVLS) and offline students. Almost all Online-SJTU students attend some of the classes in person and some virtually. This made it difficult to compare learning outcomes between users and nonusers of the MLVLS.

We, therefore, decided to base our evaluations on pre/post surveys that investigated the general view and perception of the learners regarding the MLVLS. Even such a general usability study can yield relevant results since inadequate usability of a product may lead to its rejection [11]. The survey has not been used before.

We first describe the involved classes, and then the results of using the MLVLS in the computer science class, followed by those of the English class. The conclusions we drew from the evaluation are discussed in Section 7.4. Earlier publications [7], [28] about data collected in these two lectures reported on different types of mobile learning activities undertaken during the evaluations. The evaluations discussed in this paper focus on the usability of the MLVLS, and thus, present additional results not present in the other publications.

7.1 Subjects/Presurvey

The MLVLS was evaluated in summer 2007 in two classes, involving the same body of students (1,000). In that semester, these students had to take both a Computer Science and an Advanced English class. The Computer Science class was an introductory course in Computer Science, including Internet technology and office applications. The upper-level English course prepares students to take a national standardized English test. Since the students participated in both classes, the presurvey encompasses data of all the students.

The presurvey was completed by almost half of the students (447 out of 1,000). According to the collected data, students were new to mobile learning, somewhat interested in it, but skeptical. More specifically, only a small percentage (9.5 percent) had ever used mobile learning prior to the class. About a third (33.55 percent) said that they were willing to study using their mobile phones, and 17.5 percent were undecided (being unfamiliar with mobile learning). Strikingly, however, almost half (48.6 percent) stated that they were unwilling to use mobile learning. When asked for the reasons, the students' primary concern was about current technical constraints (25 percent). A further 15 percent worried about the online costs associated with mobile learning. Some students (7 percent) feared that the mobile devices would have a negative impact on teaching and learning quality.

The drop-out rate in the two classes was about the standard college-wide rate of 5 percent.

The postsurvey of the MLVLS system was made difficult by the fact that in addition to the MLVLS both teachers wanted to test specific activities in their lectures and imposed their own survey format. We, therefore, present the results in two separate sections.

7.2 Postsurvey Results for the Computer Science Class

The postsurvey was conducted at the end of the semester, after the mobile system had been introduced and used in class. A total of 242 students responded. In contrast to the presurvey, participation in the postsurvey was voluntary, which presumably explains the substantially lower response rate. The Cronbach's Alpha value of $\alpha = .948$ (number of cases $= 242$, number of survey items $= 15$) indicates that this survey yields reliable results (greater than .90). The students felt satisfied with this course, with 28.3 percent being somewhat satisfied, 59.8 percent being satisfied, and 9.8 percent being very satisfied.
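For readers unfamiliar with the reliability measure, the following sketch computes Cronbach's Alpha with the standard formula $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i \sigma_i^2}{\sigma_T^2}\right)$, where $k$ is the number of items, $\sigma_i^2$ the variance of item $i$, and $\sigma_T^2$ the variance of the respondents' total scores. The function name and the toy data in the usage note are illustrative only and are not the survey data reported here.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's Alpha for `items`, a list of per-item score lists.

    Each inner list holds one item's scores, in the same respondent order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    item_variances = sum(variance(scores) for scores in items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    return (k / (k - 1)) * (1 - item_variances / variance(totals))
```

For example, three items on which every respondent gives identical answers, such as `[[1, 2, 3, 4]] * 3`, yield an alpha of 1, whereas items that do not covary at all yield an alpha of 0.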

Among the respondents, almost three quarters (72.54 percent) appreciated the usage of mobile learning. Of these, 70 percent had been unfavorable toward mobile learning in the presurvey. When asked for more precise reasons for their views, the students reported convenience and a reduction of conflicts of both time and space. Interestingly, negative comments still include work and study schedule conflicts. Students also criticized the fact that the learning materials were not specifically designed for mobile devices. Additional negative comments include general unwillingness to use mobile learning devices.

The students provided some feedback on how to improve the MLVLS. They suggested increasing opportunities for interaction (8 percent) and designing content specifically for the mobile devices (10 percent). About 8 percent stated that they would like to use additional tools for interacting with fellow students (especially Instant Messengers).

7.3 Postsurvey Results for the Advanced English Class

One hundred seventy-eight students responded to the course's postsurvey, which was performed at the end of the term, after the MLVLS had been introduced and used in class. The following is a descriptive statistical analysis of students' responses to those postsurvey questions relevant in the context of this paper. Although the study involves a relatively small sample, the Cronbach's Alpha value ($n = 178$, number of items $= 17$, $\alpha = 0.9076$) indicates that the reliability of these responses exceeds the $.90$ reliability standard.

Students were asked to rate their agreement to a set of statements on a scale of 1 (not agree at all) to 4 (completely agree). In summary, the mean ratings of students' satisfaction for most of the aspects of the mobile learning as performed in the class are high. The overall satisfaction was shown by a high mean of 3.61 for the question "I am satisfied with this class." According to the students, the mobile learning improved their mastery of English ("ML helped me a great deal in studying English," mean 3.57), including vocabulary ("ML helped me grasp the vocabulary," 3.19), and grammar ("ML helped me grasp the grammar," 3.43). Speaking proficiency ("ML helped me grasp the speaking/oral proficiency") received a slightly lower mean of 3.07, probably due to the fact that there was no direct audio communication channel from the virtually attending students to the teacher.

The way the learning material was presented in the MLVLS was highly appreciated by the students ("The modality of MLVLS (words, audio, and video) fits my learning style," 3.39).

The questions "I would like to recommend ML to other students" and "I would like to participate in future ML activities" received a high agreement of 3.5 and 3.56, respectively. This reveals that, generally speaking, students are satisfied with the class.

Students were less satisfied by the SMS Interaction System (2.56), presumably due to the limited input capability of the mobile devices.

Students' feedback also included suggestions for improving how mobile learning was employed during the lecture. In general, students liked the class activities, and thus, suggested increasing their frequency. They also wished for after-class interactions through email or the class forum and additional types of interactions. With respect to the learning material, students suggested content closer to real life, addressing their needs when communicating in English. Due to the large individual differences in their educational background, some of the students complained that parts of the content were too difficult—a problem encountered in almost each class at Online-SJTU. Again, a few students suggested social features, such as being able to communicate with their classmates directly using the phone.

7.4 Discussion of the Findings

A conclusive interpretation of the results of the postsurveys is made difficult by the drop-off in participation in the survey, which is probably due to it being voluntary. However, demographic information collected in the presurvey shows that the postsurvey participants form a representative sample of the college's student population. The collected data show that students appreciate the additional venues for participating in the class sessions, despite being skeptical in the beginning. With their mobile device, they are able to virtually attend the lecture even if no computer is at hand. However, negative comments still include work and study schedule conflicts. Even though the MLVLS allowed students to access the lectures from virtually everywhere, they still were required to have spare time. This shows that asynchronous access to the lectures is a necessity.

Obviously, students would appreciate presentations and learning activities more specifically designed for the mobile device. This illustrates two general dilemmas that mobile learning faces: the trade-off between technical constraints and visual quality of the learning material versus authoring costs. Mobile devices have a small screen resolution and are connected to the Internet through limited bandwidth. Videos, thus, have to be reduced in resolution and compressed. While the employed codecs were developed to preserve slide quality as far as possible, reduction and compression inevitably lead to a loss of detail. The resulting comprehension difficulties can be alleviated by designing additional learning materials specifically for the mobile devices. However, this is not always possible due to time and resource constraints. We, therefore, compiled a list of general design guidelines for slides, which, e.g., prescribe the minimal font size and limit the amount of text presented in each slide. These guidelines lead to improved readability on the mobile device (and better slides, in general, too). In addition, the limitation of the device (screen size) is addressed by the different streams (slide-view and lecturer close-up) delivered by the MLVLS.

As a result, the current system functions as an add-on to the standard lecture and does not require the instructors to perform additional work.

Taking into consideration the novelty of the system, the work schedule of the students and the need for a modern mobile phone, we think that the number of students who used the MLVLS is reasonable and encouraging.


The system design and development work reported in this paper was motivated by the goal of enabling access to education for a larger number of members of society and by the observation that mobile devices are significantly more widespread than desktop and laptop computers. We developed a pragmatic and cost-efficient solution: instructors give their presentations the same way as usual, and the developed technology takes care of broadcasting the content to the students' mobile devices. The class sessions are broadcast live, which allows for synchronous participation by the remote students using different modes of interaction specifically designed for easy usability with a mobile device.

The system was evaluated in two classes, with about 1,000 students each. The feedback was mostly positive. Students appreciated the additional way of access to the lectures. Since the students used the system throughout several weeks, the positive results cannot be attributed to a novelty effect.

However, several technical problems currently inhibit the mainstream usage of the MLVLS at the Online-SJTU. If too many users access the broadcasting server simultaneously, the server, a single computer, becomes overloaded, and thus, unresponsive. This is a software problem that can be resolved by implementing a distributed architecture. In such an approach, incoming users are automatically assigned to one of several broadcasting servers, thus, reducing the load on each server. Other problems due to the GPRS network, however, are outside of our control:

  • No/broken connection. GPRS connections are fragile and often break down or cannot be established at all.
  • Limited/changing bandwidth. Even though in principle, the bandwidth of the GPRS network is sufficient for our system, in practice the real bandwidth is often too low. Additionally, the bandwidth is not stable. Sudden changes in the connection result in dropped frames and distorted video and audio that can last several seconds.
  • Bandwidth decreases with the number of users. Mobile phones are connected to the network via base transceiver stations. These stations create a wireless network split into different cells. The more GPRS users are in the same cell, the lower the GPRS bandwidth.
  • Phone calls decrease the connection speed. Since the phone calls and the GPRS connections are handled over the same mobile network, the GPRS bandwidth is affected by the mobile phone calls.

These technical shortcomings are inherent in the GPRS network. In practical usage, however, these theoretical shortcomings had little effect. On average, stress tests showed that the connection was severely affected about once per hour. In addition, these shortcomings will disappear once the network itself is upgraded. China is currently deploying the next generation mobile network (3G), which allows much higher connection speed and bandwidth. Once this network is available, the MLVLS will become an integral part of Online-SJTU.

Extending the system to other countries or regions that do not have the same sophisticated infrastructure would require effort but appears doable. Depending on the type of network available, different streams could be broadcast, for instance, audio only for limited connections. Currently, at Online-SJTU, two satellites broadcast a TV stream of the lectures to several universities in western China. In such a setting, video and sound are transmitted via TV, while the mobile device could be used for interaction with the teacher. This is not yet implemented, however.

Furthermore, we are planning to enable free or very inexpensive public access to the lectures. In this way, even larger parts of society will profit from the educational materials. However, this will raise new questions that have to be solved beforehand, especially regarding interactivity. For instance, will the new audience be able to directly interact with the teacher the same way the students of the college do, or will there be different types of interactions? Additional questions we plan to investigate in further work concern new types of student-student interaction that become possible by enabling students to communicate with each other directly.


The authors wish to thank Xiaoyan Pan, Wanping Gao, Minjuan Wang, and Jie Shen for their help with this work and Inge de Waard for pointing them to related work. This project was supported by the China Postdoctoral Science Foundation (No. 20080430656).


About the Authors

Carsten Ullrich received the PhD degree in 2007 with a thesis on "Course Generation Based on HTN-Planning." Since March 2007, he has been a researcher at Shanghai Jiao Tong University (SJTU). From 2003 to 2007, he was a researcher at the DFKI (German Research Center for Artificial Intelligence) and co-project leader of the European FP6 project LeActiveMath. His research interests include technology-supported e-learning with a focus on personalization and learner-support. He has published numerous papers on adaptivity and Semantic-Web-based e-learning. Currently, he is the project leader in the EU FP7 project ROLE at SJTU.
Ruimin Shen has been a distinguished professor of computer science and engineering at Shanghai Jiao Tong University (SJTU) since 1991. He is the founder and director of the SJTU E-Learning Lab and the founding dean of Online-SJTU. Currently, he serves as a member of the Ministry of Education's Expert Committee on Long-Distance Education, where he is in charge of the regulation of Chinese E-Learning Technology Standards. In 2006, Premier Wen Jiabao awarded Professor Shen the National Award for Science & Technology Progress.
Ren Tong received the master's degree in 2005 from Shanghai Jiao Tong University (SJTU). He is a researcher and a senior software engineer at SJTU. His focus lies on mobile applications, for which he has won numerous awards. He is a Forum Nokia Champion and an Accredited Symbian/S60 Developer, and was awarded the third prize in the Forum Nokia OpenC Challenge 2007. His current interests include mobile learning and multimedia.
Xiaohong Tan is a distinguished lecturer at Shanghai Jiao Tong University (SJTU) and NEC-SJTU who won an "Excellent Teacher" award in 2006 and a national award for excellent courseware in 2007. She started working toward the PhD degree at SJTU in 2007. She designed the new e-learning environment used at Online-SJTU and is doing research on mobile learning.