Touchless interface works with a standard camera

Technology News
By Nick Flaherty

Researchers in London have combined computer vision and machine learning to create a touchless interface that works with a standard camera.

MotionInput V3 was developed by academics and students at University College London’s (UCL) Computer Science department, in collaboration with Intel, Microsoft and IBM.

The software, available under a non-commercial license, uses a common webcam to let a user control a PC by gesturing with their hands, head, face or full body, or by speaking. The software analyzes these interactions and converts them into mouse, keyboard and joystick signals, so existing software can be driven without modification.
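The core conversion step, turning a tracked body landmark into a synthetic pointer position, can be sketched as a simple mapping from normalized camera coordinates to screen pixels. The function below is an illustrative approximation, not MotionInput's actual code; the smoothing step (an exponential moving average) is a common way to damp tracker jitter.

```python
def landmark_to_cursor(x_norm, y_norm, screen_w, screen_h,
                       prev=None, smoothing=0.3):
    """Map a normalized camera-space landmark (0..1) to screen pixels.

    `prev` is the previous cursor position; `smoothing` blends toward it
    to damp frame-to-frame jitter. All names here are illustrative.
    """
    # Clamp in case the tracker reports slightly out-of-range values.
    x_norm = min(max(x_norm, 0.0), 1.0)
    y_norm = min(max(y_norm, 0.0), 1.0)
    x, y = x_norm * screen_w, y_norm * screen_h
    if prev is not None:
        px, py = prev
        x = smoothing * px + (1.0 - smoothing) * x
        y = smoothing * py + (1.0 - smoothing) * y
    return (x, y)
```

The resulting coordinates would then be injected as ordinary mouse events, which is why unmodified applications still work.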

The initial focus is on healthcare applications, using hands or eyes together with speech. Games can be made accessible, patient movements can be recorded during physical therapy, and, in a hospital setting, surgeons can take notes through hand gestures and speech without having to touch a computer. The software requires no connectivity or cloud service, making it easier to deploy.

The software employs a mix of machine learning and computer vision models to allow for responsive interaction and is customizable by the user with a variety of modules.
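One way such user-selectable modules could be organized is a small registry that maps module names (hands, facial navigation, gaze, speech) to their handlers, enabling only the ones a given user has chosen. This is a hypothetical sketch of the architecture described above, not MotionInput's actual design.

```python
class ModuleRegistry:
    """Hypothetical registry for interaction modules that a user can
    enable or disable individually."""

    def __init__(self):
        self._modules = {}

    def register(self, name, handler):
        # `handler` is a callable that turns sensor input into events.
        self._modules[name] = handler

    def enabled(self, names):
        """Return the handlers for the user's chosen module names,
        silently skipping names that are not registered."""
        return [self._modules[n] for n in names if n in self._modules]
```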

The facial navigation module lets the user steer the cursor with their nose or eyes and trigger actions such as mouse clicks with facial expressions, or with speech by saying "click." A selection of hand gestures can be recognized and mapped to specific keyboard commands and shortcuts, mouse movements, native multitouch sensing, and in-air digital pens with depth.
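Turning a continuous expression measure (say, a mouth-open ratio from facial landmarks) into a single clean click typically needs hysteresis, so small fluctuations around the threshold do not fire repeated clicks. The class below is an illustrative sketch under that assumption; the thresholds and the expression measure are placeholders, not MotionInput's actual values.

```python
class ExpressionTrigger:
    """Convert a continuous expression measure into discrete click events,
    with hysteresis: fire once when the measure rises past `on_thresh`,
    and re-arm only after it falls back below `off_thresh`."""

    def __init__(self, on_thresh=0.6, off_thresh=0.4):
        self.on_thresh = on_thresh
        self.off_thresh = off_thresh
        self.active = False

    def update(self, value):
        """Return True exactly once per activation (on the rising edge)."""
        if not self.active and value >= self.on_thresh:
            self.active = True
            return True          # fire a click
        if self.active and value <= self.off_thresh:
            self.active = False  # expression released: re-arm
        return False
```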

There is also an auto-calibration method for eye tracking that performs gaze estimation, with both a grid mode and a magnetic mode, and there is even full-body tracking so that users can set physical exercises and tag regions in their surrounding space to play existing computer games.
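A grid-style calibration can be approximated as follows: the user fixates on known grid points on screen, and a per-axis linear mapping from the gaze feature to screen coordinates is fitted by least squares. This is a simplified stand-in for whatever MotionInput's grid mode actually does; the function names and the linear model are assumptions for illustration.

```python
def fit_axis(gaze_vals, screen_vals):
    """Least-squares fit of screen = a * gaze + b from paired samples."""
    n = len(gaze_vals)
    mg = sum(gaze_vals) / n
    ms = sum(screen_vals) / n
    num = sum((g - mg) * (s - ms) for g, s in zip(gaze_vals, screen_vals))
    den = sum((g - mg) ** 2 for g in gaze_vals)
    a = num / den
    b = ms - a * mg
    return a, b

def calibrate(samples, targets):
    """samples: [(gx, gy)] gaze features captured while the user looked at
    targets: [(sx, sy)] known grid points. Returns a gaze-to-screen mapper."""
    ax, bx = fit_axis([g[0] for g in samples], [t[0] for t in targets])
    ay, by = fit_axis([g[1] for g in samples], [t[1] for t in targets])
    return lambda gx, gy: (ax * gx + bx, ay * gy + by)
```

With the four corners of a 1920x1080 screen as targets, the fitted mapper sends the center of the gaze range to the center of the screen.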

Speech hotkeys and live captions let users interact with the computer through a set of voice commands, live captioning and speech-triggered keyboard shortcuts, and users can play games with the usual ABXY joypad buttons in the air, with analog trigger controls.
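At its simplest, a speech-hotkey layer is a lookup from a normalized transcript to the key or button events to synthesize. The table and key names below are assumptions for illustration, not MotionInput's actual command set.

```python
# Illustrative phrase-to-event table; phrases and key names are placeholders.
HOTKEYS = {
    "click": ["mouse_left"],
    "double click": ["mouse_left", "mouse_left"],
    "copy": ["ctrl", "c"],
    "paste": ["ctrl", "v"],
}

def dispatch(transcript):
    """Map a recognized utterance to the events to synthesize.
    Returns None for unrecognized phrases so they can be ignored safely."""
    phrase = transcript.strip().lower()
    return HOTKEYS.get(phrase)
```

Normalizing case and whitespace before the lookup keeps the mapping robust to small variations in what the recognizer returns.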

“This idea was first proposed by the UCL team in the summer of 2020 as a series of UCL Computer Science IXN [Industry Exchange Network] student projects and stemmed from the need to help healthcare workers during COVID-19 when it was necessary to keep shared computers clean and germ-free,” said Phillippa Chick, global account director, Health and Life Sciences, Intel UK.

“It has great opportunity to positively impact the lives of people with chronic conditions that affect movement.”

“What makes this software so special is that it is fully accessible,” she said. “The code does not require expensive equipment to run. It works with any standard webcam, including the one in your laptop. It’s just a case of downloading and you are ready to go.”

Intel provides UCL with the OpenVINO toolkit to ease the development of AI-based applications. The pre-trained models provided by OpenVINO enabled faster development of the various components and features of MotionInput, allowing students to move forward without training their own models.

The software engineering development and architecture for V3 was led by UCL students Sinead Tattan and Carmen Meinson, who together led more than 50 UCL students across various UCL Computer Science courses to build on the work. The team also worked with mentors from Microsoft and IBM, notably Prof. Lee Stott and Prof. John McNamara.

“The project will continue and is looking to collaborate with industry sectors. The academics and mentors are looking into what can be done to expand use cases and continuously improve the user experience,” said Chick. “We love working with the students and teaching staff at UCL, as it’s inspiring to see what they can do with technology.”
