2019 Annual Report

This research project aims to endow robots with the ability to visually sense the geometry of a potentially uncertain and changing environment and to recognise objects and regions – for example, to build a high-level, dynamic map of the world. The ability to form this kind of high-level representation of the world is fundamental to safe and effective interaction of a robot within its environment.

Team Members

Niko Sünderhauf

Queensland University of Technology (QUT), Australia

Dr Niko Sünderhauf is a Chief Investigator of the Australian Centre for Robotic Vision and a Senior Lecturer at Queensland University of Technology (QUT) in Brisbane, Australia (a Senior Lecturer is roughly equivalent to a junior Associate Professor in the US system).

Niko conducts research in robotic vision, at the intersection of robotics, computer vision, and machine learning. His research interests focus on scene understanding and how robots can learn to perform complex tasks that require navigation and interaction with objects, the environment, and with humans.

Visit Profile

Tat-Jun Chin

University of Adelaide, Australia

Tat-Jun Chin received his PhD in computer systems engineering from Monash University in 2007, supported by the Endeavour Australia-Asia Award. He is currently an Associate Professor at The University of Adelaide, a Chief Investigator of the Australian Centre for Robotic Vision (ACRV), and Director of Machine Learning for Space at the Australian Institute for Machine Learning (AIML). Tat-Jun is an Associate Editor of the IPSJ Transactions on Computer Vision and Applications (TCVA) and the Journal of Imaging (J. Imaging). His research interest lies in optimisation for computer vision and machine learning, and their application to robotic vision, space engineering and smart cities. He has published more than 90 research articles on the subject and has won several awards for his research, including a CVPR award (2015), a BMVC award (2018), two DST Awards (2015, 2017), and a Best of ECCV (2018) special issue invitation.

Visit Profile

Yasir Latif

University of Adelaide, Australia

Yasir Latif completed his bachelor's degree at the Ghulam Ishaq Khan Institute of Engineering Science and Technology in Topi, Pakistan, and his master's in Communication Engineering at the Technical University of Munich (TUM), Germany. He then pursued his PhD at the University of Zaragoza, Spain, under the supervision of Prof. Jose Neira, with short research stays at Imperial College London and the Massachusetts Institute of Technology during that period. The main theme of his doctoral thesis was reliable loop closure detection and verification for the Simultaneous Localization and Mapping (SLAM) problem. His interests include SLAM, computer vision and looking for the ultimate question.

Visit Profile

Saroj Weerasekera

University of Adelaide, Australia

Saroj started as a PhD researcher at the University of Adelaide, supervised by Chief Investigator Ian Reid and Research Fellow Ravi Garg. His research interests lie at the intersection of visual 3D reconstruction, semantic scene understanding, and deep learning. His PhD research primarily explored the benefits of deep learning on top of standard geometric models for visual 3D reconstruction. He has continued this work since 2018 as a Research Fellow based at the University of Adelaide.

Visit Profile

Pulak Purkait

University of Adelaide, Australia

Pulak received his PhD in computer science from the Indian Statistical Institute (ISI), Kolkata, India, in 2014. He was a postdoctoral researcher at the University of Adelaide from September 2013 to February 2016, spent two years (2016–2018) at Toshiba Research Europe in Cambridge, UK, and returned to the University of Adelaide in September 2018. His research interests include image processing, computer vision and machine learning. He joined the Centre in 2018 and is currently leading a project on 3D scene graph generation.

Visit Profile

Ravi Garg

University of Adelaide, Australia

Ravi Garg is an Associated Research Fellow with our Centre and has been part of the Australian Centre for Visual Technologies at The University of Adelaide as a senior research associate since April 2014. He works with Prof. Ian Reid on Reid's Laureate Fellowship project, “Lifelong Computer Vision Systems”. Prior to joining the University of Adelaide, he completed his PhD at Queen Mary University of London under the supervision of Professor Lourdes Agapito, where he worked on dense motion capture of deformable surfaces from monocular video.

His current research interest lies in building learnable systems with little or no supervision that can reason about scene geometry as well as semantics. He is exploring how far visual geometry concepts can help current deep neural network frameworks in scene understanding.

Visit Profile

Kejie ‘Nic’ Li

University of Adelaide, Australia

Kejie graduated from ANU with a Bachelor of Advanced Computing (Honours) with first class honours in 2016, during which time he mainly worked on single-view depth estimation. He joined the Centre in 2017 to work on semantic scene understanding and the intriguing yet challenging task of building robots that can interact with the world more effectively through a sound understanding of objects and their environment.

Visit Profile

Huangying Zhan

University of Adelaide, Australia

Huangying is currently a PhD student at the University of Adelaide, affiliated with the Australian Centre for Robotic Vision and advised by Prof. Ian Reid and Prof. Gustavo Carneiro. His research interests include deep learning and its applications in robotic vision. Previously, Huangying received his B.Eng degree in Electronic Engineering (first class honours) from The Chinese University of Hong Kong (CUHK), where he was advised by Prof. Xiaogang Wang. He was also a visiting student in the Unmanned Systems Research Group at the National University of Singapore, where he worked with Prof. Ben M. Chen.

Visit Profile

Lachlan Nicholson

Queensland University of Technology (QUT), Australia

Lachlan graduated from QUT in 2016 with First Class Honours in a Bachelor of Electrical Engineering. While completing his degree, he was appointed by the Centre to continue the mechanical and software upgrade of the SummitXL mobile robot as a summer research task. He also worked with the ACRV on his undergraduate thesis, which focused on navigation, object detection, and mobile manipulation within an office environment. With a team from the Centre he competed in the 2016 Amazon Picking Challenge, achieving 6th place in the final demonstration held in Leipzig, Germany. Lachlan is currently pursuing his PhD with the Centre; his research focuses on scene understanding via deep learning, semantics and SLAM.

Visit Profile

Mina Henein

Australian National University (ANU), Australia

Mina joined the Australian National University and the Australian Centre of Excellence for Robotic Vision in March 2016 as a PhD candidate to work on SLAM in dynamic environments. He is doing research under the supervision of Viorela Ila and Robert Mahony. His research interests include graph-based SLAM, dynamic SLAM and object SLAM besides kinematics and optimization techniques.

Mina received a B.Sc. in Engineering and Materials Science with Honours, majoring in Mechatronics, from the German University in Cairo (GUC), Egypt, in 2012. He then worked in the business sector for a multinational FMCG company for one year as a Near-East demand manager before pursuing his masters in Advanced Robotics. He received a double M.Sc. degree through the European Masters of Advanced Robotics (EMARo) programme from Universita degli Studi di Genova, Italy, and Ecole Centrale de Nantes, France.

Visit Profile

Jun Zhang

Australian National University (ANU), Australia

Jun received his Master of Engineering and Bachelor of Engineering degrees from the School of Aeronautics at Northwestern Polytechnical University, China. During his Masters degree, Jun spent one and a half years as a visiting researcher at the Institute of Computer Science and Technology, Peking University. His research interests include non-static visual SLAM, 3D shape analysis and retrieval, and deep learning.

Visit Profile

Natalie Jablonsky

Queensland University of Technology (QUT), Australia

Natalie completed her Bachelor of Information Technology, majoring in Computer Science, at Deakin University in 2017. She started her PhD with the Centre at QUT under the supervision of Niko Sünderhauf and Michael Milford. Natalie’s research focused on modelling spatial relationships for semantic SLAM.

In 2019, Natalie left QUT to take up the role of Principal Data and Platforms Engineer (Enterprise Data) at BHP in Brisbane.

Visit Profile

Mehdi Hosseinzadeh

University of Adelaide, Australia

Mehdi was a PhD researcher in computer vision at the University of Adelaide under the supervision of Chief Investigators Ian Reid and Anton van den Hengel. He obtained his bachelor's degree in Electrical Engineering in 2009 and his master's degree in Control Systems in 2013. His research interests are semantic visual SLAM, probabilistic graphical models, and machine learning in robotic vision applications.

Mehdi completed his PhD in 2019.

Visit Profile

Sourav Garg

Queensland University of Technology (QUT), Australia

Sourav obtained his Bachelors in Electronics and Communication from Thapar University, India, in 2012. After graduating, he worked in the robotics research group of Tata Consultancy Services (TCS) for three years, conducting research in robotic and computer vision. In particular, he was involved in projects such as human and object tracking, product counting in a retail shop environment, and a tea-serving robot in an office environment.

Motivated to delve deeper into robotic vision, Sourav commenced his PhD at QUT in 2015. His thesis, titled “Robust Visual Place Recognition under Simultaneous Viewpoint and Appearance Variations”, explored ways to exploit visual semantic information, 3D geometry, and deep-learnt CNNs for visual place recognition. It was supervised by Professor Michael Milford (Principal) and Dr Niko Sünderhauf (Associate).

Visit Profile

Shin Fang Ch’ng

University of Adelaide, Australia

Shin joined the Centre as a PhD researcher in 2017 under the supervision of Tat-Jun Chin and Alireza Khosravian. She graduated from Sheffield Hallam University, UK, with first class honours in Electronics Engineering in 2012. Shin’s research interest lies in computer vision and its applications.

Visit Profile

Jiawang Bian

University of Adelaide, Australia

Jiawang is currently a PhD researcher at the University of Adelaide and an Associated PhD researcher with the Centre. He is advised by Prof. Ian Reid and Prof. Chunhua Shen. His research interests lie in the field of computer vision, machine learning, and robotics. Jiawang received his B.Eng degree from Nankai University, where he was advised by Prof. Ming-Ming Cheng. He was a research assistant at the Singapore University of Technology and Design (SUTD), where he worked with Prof. Sai-Kit Yeung. Jiawang also worked as a trainee research engineer at the Advanced Digital Sciences Center in Singapore (ADSC), Huawei Technologies Co., Ltd, and Tusimple.

Visit Profile

Anh-Dzung Doan

University of Adelaide, Australia

Visit Profile

Project Aim

As a robot moves around it needs to develop an understanding of its environment – the geometry, the objects present, the free-space, and the potential ways it can interact with objects and other elements of the environment (affordances).


The project uses visual sensing (cameras) to create geometric and semantic models and representations of the environment. This enables a robot to reason about a scene so that it can plan actions effectively.

The project combines work on mapping an environment geometrically with work on understanding images and video in terms of their constituent semantic parts – the objects and regions in the scene. This has been tackled within the project in two main ways: developing new methods for performing visual localisation and mapping (building a map of the environment and working out where the robot is within that environment); and developing methods that leverage the power of deep learning to understand images and video.


Key Results

In 2019 Centre PhD Researcher Mehdi Hosseinzadeh and Research Fellow Yasir Latif created a real-time system for performing “Object-based SLAM”, in which the system builds a map of the environment that comprises a sparse point-cloud (like much other work) but augments this with objects and surfaces. This built on earlier work that Dr Hosseinzadeh and fellow Centre PhD Researcher Lachlan Nicholson had done in representing and tracking 3D objects. A focus in 2020 is to develop this system further to incorporate dynamic objects (moving cars or people), using ideas worked on by Centre PhD Researchers Mina Henein and Jun Zhang.

The project team has developed world-leading methods that fuse geometry with deep learning. For example, PhD Researcher Huangying Zhan and Research Fellow Saroj Weerasekera developed a method that performs very accurate visual odometry (estimating how far and how fast the robot has moved). They achieved this by combining deep learning, used to establish the motion flow field between a pair of images in a video, with traditional geometric methods that extract the relative motion from the flow field.
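The division of labour described above — learned correspondence, classical pose recovery — can be illustrated with a small sketch. The code below is an illustrative reconstruction, not the Centre's implementation: it takes point correspondences between two calibrated views (which in the team's system would come from a learned flow field) and recovers the relative camera motion with the textbook eight-point algorithm plus a cheirality check; all function names are our own.

```python
import numpy as np

def eight_point_essential(x1, x2):
    """Estimate the essential matrix from N >= 8 correspondences.
    x1, x2: (N, 2) arrays of normalized (calibrated) image coordinates."""
    n = x1.shape[0]
    h1 = np.hstack([x1, np.ones((n, 1))])  # homogeneous coordinates, view 1
    h2 = np.hstack([x2, np.ones((n, 1))])  # homogeneous coordinates, view 2
    # Each correspondence gives one linear constraint h2^T E h1 = 0.
    A = np.einsum('ni,nj->nij', h2, h1).reshape(n, 9)
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)
    # Project onto the essential manifold: two equal singular values, one zero.
    U, _, Vt = np.linalg.svd(E)
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt

def _triangulate(R, t, u1, u2):
    # Linear (DLT) triangulation with camera matrices P1 = [I|0], P2 = [R|t].
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = np.hstack([R, t.reshape(3, 1)])
    A = np.array([u1[0] * P1[2] - P1[0], u1[1] * P1[2] - P1[1],
                  u2[0] * P2[2] - P2[0], u2[1] * P2[2] - P2[1]])
    X = np.linalg.svd(A)[2][-1]
    return X[:3] / X[3]

def decompose_essential(E, x1, x2):
    """Recover (R, t) from E, resolving the four-fold ambiguity by choosing the
    candidate that places the most triangulated points in front of both cameras."""
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    candidates = [(U @ W @ Vt, U[:, 2]), (U @ W @ Vt, -U[:, 2]),
                  (U @ W.T @ Vt, U[:, 2]), (U @ W.T @ Vt, -U[:, 2])]

    def in_front(R, t):
        pts = (_triangulate(R, t, u1, u2) for u1, u2 in zip(x1, x2))
        return sum(1 for X in pts if X[2] > 0 and (R @ X + t)[2] > 0)

    return max(candidates, key=lambda Rt: in_front(*Rt))
```

Note that the translation is recovered only up to scale (a unit direction), which is why monocular visual odometry needs some external scale cue — in the team's approach, the learned depth provides it.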

The team has also shown how we can use geometry to self-supervise a method for predicting the depth of a scene from a single view. This was integrated into an end-to-end system that self-learns to compute a 3D map and 6D trajectory of a robot.
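Self-supervision by geometry works through view synthesis: a predicted depth map is plausible exactly when target pixels, back-projected with that depth and reprojected into a second camera, land on photometrically consistent content in the source image. The sketch below is a minimal numpy illustration of that warping step, not the team's code — real systems use differentiable bilinear sampling inside the training loop, and the function names here are our own.

```python
import numpy as np

def reproject(depth, K, R, t):
    """For every pixel of a target view with predicted depth, compute where it
    lands in a source view related by rotation R and translation t."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x HW
    # Back-project target pixels to 3D points using the predicted depth ...
    X = (np.linalg.inv(K) @ pix) * depth.reshape(1, -1)
    # ... then transform into the source camera and project back to pixels.
    p = K @ (R @ X + t.reshape(3, 1))
    return (p[0] / p[2]).reshape(H, W), (p[1] / p[2]).reshape(H, W)

def photometric_loss(target, source, depth, K, R, t):
    """Mean L1 error between the target image and the source image sampled at
    the reprojected coordinates (nearest-neighbour sampling for brevity)."""
    us, vs = reproject(depth, K, R, t)
    H, W = depth.shape
    ui = np.clip(np.round(us).astype(int), 0, W - 1)
    vi = np.clip(np.round(vs).astype(int), 0, H - 1)
    return np.abs(source[vi, ui] - target).mean()
```

A depth (and pose) network trained to minimise this loss over many video frames learns to predict depth without any ground-truth labels — the geometry itself is the supervisory signal.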


Activity Plan for 2020

  • Incorporate ideas from the Learning research project on representing uncertainty in deep models, to enable fusion of local dense geometric maps.
  • Integrate dynamic models of motion into the project’s Object-based SLAM system, and use this for effective planning in the face of dynamic change.
  • Develop two open-source demonstrators of the project’s geometric and semantic scene understanding capability: 1) object-based SLAM in a dynamic environment; and 2) end-to-end self-trained visual odometry and dense mapping SLAM.