Amazon and Academic Researchers Develop ExtraSensory App to Better Identify What You’re Doing Where. Their Biggest Dataset Yet Could Advance Health Tech

By Lori Cameron and Michael Martinez
Published 12/12/2017

Amazon and academic researchers have devised a data collection app for Apple and Android devices that analyzes walking, sitting, running, music listening, and TV watching (the context of human behavior “in the wild”), and its findings could help improve health monitoring.

“To the best of our knowledge, this dataset, which is publicly available, is far larger in scale than others collected in the field” of in-the-wild data collection, write the Amazon and University of California, San Diego, authors of “Recognizing Detailed Human Context in the Wild from Smartphones and Smartwatches,” a peer-reviewed research paper that appears in the October-December 2017 issue of IEEE Pervasive Computing.

The researchers call their app ExtraSensory. Both the app (with full source code and a user-researcher guide) and the ExtraSensory Dataset are publicly available for free.

“In this work, we used smartphone and smartwatch sensors to collect labeled data from over 300,000 minutes from 60 subjects. Every minute has multisensor measurements and is annotated with relevant context labels,” the authors write. They are Yonatan Vaizman, a doctoral candidate at the University of California, San Diego; Amazon research scientist Katherine Ellis, who was based at UC San Diego when co-writing the article; and Gert Lanckriet, principal applied scientist at Amazon and a professor of electrical and computer engineering at UC San Diego.

The technology could have broad applications, especially for the automated monitoring of an individual’s health.

“The ability to automatically recognize a person’s behavioral context (including where they are, what they’re doing, and who they’re with) is greatly beneficial in many domains. Health monitoring applications have traditionally been based on manual, subjective reporting, sometimes using end-of-day recalling. These applications could be improved with the automatic (frequent, effortless, and objective) detection of behaviors, especially behaviors related to exercise, diet, sleep, social interaction, and mental states (such as stress),” the authors write.

A ‘novel dataset’

The research shows how multimodal sensors can help address the challenges presented by in-the-wild conditions.

“Our novel dataset reveals behavioral variability in the wild that was underrepresented in controlled studies, but we demonstrate how sensing modalities can complement each other and how fusing them helps resolve contexts that arise with uncontrolled behavior,” they write.

Not all data collection apps are easy to use.

Some require that patients enter data at the end of the day or at random times. Others require that patients always carry their phone in their pocket. Still others require the use of a completely different device that users might find awkward and unfamiliar.

How the context recognition system works

To solve these problems, the researchers developed ExtraSensory, which is convenient and unobtrusive yet interprets behavior and context seamlessly, all from your smartphone.

Our context-recognition system. (a) While a person is engaged in natural behavior, the system uses sensor measurements from the smartphone and smartwatch to automatically recognize the person’s detailed context. (b) Single-sensor classifiers: Appropriate features are extracted from each sensor. For a given context label, classification can be done based on each sensor independently. (c) In early fusion (EF), features from multiple sensors are concatenated to form a long feature vector. (d) Late fusion using averaging (LFA) simply averages the output probabilities of the single-sensor classifiers. (e) Late fusion with learned weights (LFL) learns how much to “listen” to each sensor when making the final classification.
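
The fusion strategies named in the caption can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper or the ExtraSensory code: the synthetic features, the sensor names (watch_acc, phone_acc, audio), the small gradient-descent logistic regression, and the use of a second logistic regression as the "learned weights" stage. For brevity the models are fit and scored on the same synthetic data, which a real evaluation would not do.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, steps=1000):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        err = sigmoid(X @ w + b) - y          # gradient of the log loss w.r.t. the logits
        w -= lr * (X.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b

def predict_proba(model, X):
    w, b = model
    return sigmoid(X @ w + b)

def accuracy(p, y):
    return float(np.mean((p >= 0.5) == (y == 1)))

# Synthetic stand-in for one context label (1 = label applies in that minute).
rng = np.random.default_rng(0)
n = 400
y = rng.integers(0, 2, n).astype(float)

# Hypothetical per-sensor feature matrices with different amounts of signal.
sensors = {
    "watch_acc": y[:, None] * 2.0 + rng.normal(size=(n, 3)),
    "phone_acc": y[:, None] * 0.3 + rng.normal(size=(n, 3)),
    "audio":     y[:, None] * 1.0 + rng.normal(size=(n, 4)),
}

# (b) Single-sensor classifiers: one model per sensor, each used independently.
single = {name: fit_logistic(X, y) for name, X in sensors.items()}
probs = np.column_stack(
    [predict_proba(single[name], X) for name, X in sensors.items()]
)

# (c) Early fusion (EF): concatenate all sensors' features into one long vector.
X_ef = np.hstack(list(sensors.values()))
p_ef = predict_proba(fit_logistic(X_ef, y), X_ef)

# (d) Late fusion using averaging (LFA): average the single-sensor probabilities.
p_lfa = probs.mean(axis=1)

# (e) Late fusion with learned weights (LFL): a second-stage model learns how
# much to "listen" to each sensor's output probability (assumed form).
p_lfl = predict_proba(fit_logistic(probs, y), probs)

for name, p in [("EF", p_ef), ("LFA", p_lfa), ("LFL", p_lfl)]:
    print(name, "accuracy on the synthetic data:", accuracy(p, y))
```

The design difference is where the sensors meet: early fusion lets one model see every feature at once, while both late-fusion variants combine only the per-sensor output probabilities.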

“Aging-care programs could use automated logging of older adults’ behavior to detect early signs of cognitive impairment, monitor functional independence, and support aging at home,” the authors write.

How the ExtraSensory data collection app looks on iPhone

The research method had four distinguishing characteristics:

  • Naturally used devices: subjects used their own personal phones and a smartwatch.
  • Unconstrained device placement: subjects were free to carry their phones in any way convenient to them.
  • Natural environment: subjects collected data in their own regular environment for approximately one week.
  • Natural behavioral content: no script or tasks were given, and no specific set of activities was targeted. Instead, the context labels came from the data: the subjects engaged in their routines and selected any relevant labels (from a large menu) that fit what they were doing.

Screenshots from the ExtraSensory App (iPhone version): (a) the history tab, which shows a daily log of activities and contexts as well as predictions (shown with question marks); (b) the label selection menu, indexed by topics and a “frequently used” link; (c) the “active feedback” page, which lets the user report that he or she will be engaged in a specific context, starting immediately and valid for a selected time period; and (d) periodic notifications, which remind the user to provide labels.

Unfortunately, the unscripted behavior and unconstrained phone usage resulted in situations that were harder to recognize.

How the ExtraSensory app detects sleep, showering, or a meeting

The authors show how combining multimodal sensors can fix the problem.

“The performance of single-sensor classifiers on selected labels demonstrates the advantage of having sensors of different modalities. As expected, for detecting sleep, the watch was more informative than the phone’s motion sensors (accelerometer and gyroscope), because the phone might be lying motionless on a nightstand, whereas the watch could record wrist movements. Similarly, contexts such as ‘shower’ or ‘in a meeting’ have unique acoustic signatures (running water, voices) that allowed the audio-based classifier to perform well. When showering, the phone would often be in a different room, leaving the watch as an important indicator of activity,” say the authors.

The sensors in the dataset. “Core” represents examples that have measurements from all six core sensors analyzed in this article (shown here in bold and italic font).

The authors hope other researchers will use the dataset to make even more improvements to data collection systems.

“The public dataset we collected provides a platform to develop and evaluate these methods, as well as explore feature extraction, inter-label interaction, time-series modeling, and other directions that will improve context recognition,” say the authors.


About Lori Cameron

Lori Cameron is a Senior Writer for the IEEE Computer Society and currently writes regular features for Computer magazine, Computing Edge, and the Computing Now and Magazine Roundup websites. Follow her on LinkedIn.

About Michael Martinez

Michael Martinez, the editor of the Computer Society’s Computer.Org website and its social media, has covered technology as well as global events while on the staff at CNN, Tribune Co. (based at the Los Angeles Times), and the Washington Post. He welcomes email feedback, and you can also follow him on LinkedIn.