The sheer number of registrants maxed out convention center space, forcing organizers to close on-site registration.
Nearly half of all attendees came from industry, while the remaining attendees hailed from academia and elsewhere. The top three fields represented were manufacturing, services, and education. Nearly a third of all attendees came from companies with 10,000 employees or more. A whopping 63 percent of attendees were between the ages of 18 and 34.
It’s not surprising then that the first two days of the conference saw an abundance of packed rooms for the nearly 120 tutorials scheduled for CVPR. Topics covered a broad range of cutting-edge innovations and developments in computational imaging, visual recognition, image analysis, and deep learning.
The tutorial sessions revealed some of the top fields using computer vision technology today—robotics, virtual and augmented reality, healthcare, and autonomous vehicles. Among the industry heavy-hitters was Baidu, which boasts the largest open-source autonomous vehicle platform in the world.
Check out the highlights below.
“Apollo: Open Autonomous Driving Platform”
Dr. Tae Eun Choe, who leads the perception team at Baidu, gave the audience a glimpse into Baidu’s Apollo, the largest open autonomous driving platform in the world, with a full stack of hardware and software developed by the autonomous driving community.
Baidu attracted applicants to its three competitions at CVPR 2019 with prize money totaling 6,300 USD. Competitors were asked to predict, as accurately as possible, the presence of "traffic agents" (bicycles, pedestrians, and other vehicles) using three different datasets.
“Perception at Magic Leap”
Magic Leap provided listeners with a deep dive into the four main ingredients in creating an immersive spatial computing platform: head pose tracking, world reconstruction, eye tracking, and hand tracking.
“We blend tech, biology, and creativity to reveal new worlds within our world,” says Magic Leap.
“Learning Representations via Graph-structured Networks”
Recent years have seen a dramatic rise in the adoption of powerful convolutional neural networks for heavy-duty computer vision tasks. However, these networks don't adequately model several properties needed for more difficult AI tasks: pairwise relations, global context, and the processing of irregular data beyond spatial grids.
The answer, the researchers say, is to organize the data into graphs suited to the task at hand, then construct network modules that relate and propagate information across the visual elements within those graphs.
“We call these networks with such propagation modules graph-structured networks. We introduce a series of effective graph-structured networks, including non-local neural networks, spatial propagation networks, sparse high-dimensional CNNs, and scene graph networks,” the researchers say.
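The core idea—propagating information along graph edges rather than a fixed spatial grid—can be illustrated with a toy sketch. This is not the tutorial's own code; it is a minimal, plain-Python approximation of one propagation step, where each node averages its features with those of its graph neighbors. The node names and feature values are purely illustrative.

```python
from collections import defaultdict

def propagate(features, edges, steps=1):
    """One round of simple message passing over a graph.

    features: {node: [float, ...]} feature vector per visual element.
    edges: iterable of (u, v) undirected edges relating elements.
    Each step, every node's features become the average of its own
    features and its neighbors' — information flows along graph
    structure instead of a regular spatial grid.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    for _ in range(steps):
        updated = {}
        for node, feat in features.items():
            group = [feat] + [features[n] for n in neighbors[node]]
            updated[node] = [sum(vals) / len(group) for vals in zip(*group)]
        features = updated
    return features

# Toy "scene graph": three regions detected in an image (hypothetical).
feats = {"person": [1.0, 0.0], "bicycle": [0.0, 1.0], "road": [0.0, 0.0]}
edges = [("person", "bicycle"), ("bicycle", "road")]
out = propagate(feats, edges)
print(out["bicycle"])  # averages person, bicycle, and road features
```

Real graph-structured networks replace the plain average with learned, attention-style weights (as in non-local neural networks) or learned propagation directions (as in spatial propagation networks), but the relate-then-propagate pattern is the same.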
(Pictured above: the seven featured speakers for the “Learning Representations via Graph-structured Networks” tutorial.)
“Towards Relightable Volumetric Performance Capture of Humans”
“Detect, Reconstruct, Track and Parameterize Humans” was just one of the talks held in morning and afternoon sessions that walked the audience through the ins and outs of building, from the ground up, a volumetric capture pipeline for reconstructing, tracking, and texturing humans.
“Volumetric (4D) performance capture is fundamental for AR/VR content generation,” says the research team.
Lori Cameron is Senior Writer for IEEE Computer Society publications and digital media platforms with over 20 years of technical writing experience. She is a part-time English professor and winner of two 2018 LA Press Club Awards. Contact her at email@example.com. Follow her on LinkedIn.
About Michael Martinez
Michael Martinez, the editor of the Computer Society’s Computer.Org website and its social media, has covered technology as well as global events while on the staff at CNN, Tribune Co. (based at the Los Angeles Times), and the Washington Post. He welcomes email feedback, and you can also follow him on LinkedIn.