Intuitive Controls with AR-based Gesture Recognition

The emergence of AR technology has allowed us to interact with our devices in new and unexpected ways. Smart devices themselves, from PCs to mobile phones and beyond, have become dramatically simpler to operate: interactions have been streamlined to the point where only swipes and taps are required, and even children as young as two or three can use them.

Rather than having to rely on tools like keyboards, mice, and touchscreens, we can now control devices in a refreshingly natural and easy way. Traditional interactions with smart devices have tended to be cumbersome and unintuitive, and there is a hunger for new, engaging methods, particularly among young people. Many developers have taken heed of this, building practical yet exhilarating AR features into their apps. For example, during live streams, or when shooting videos or images, AR-based apps let users add stickers and special effects with newfound ease, simply by striking a pose. In smart home scenarios, users can make specific gestures to turn smart home appliances on and off, or switch settings, without any screen operations at all. And when dancing with a video game console, the dancer can raise a palm to pause or resume the game at any time, or swipe left or right to switch between settings, without having to touch the console itself.

So what is the technology behind these groundbreaking interactions between humans and devices?

HMS Core AR Engine is a preferred choice among AR app developers. Its SDK provides AR-based capabilities that streamline the development process. The SDK recognizes specific gestures with a high level of accuracy, outputs the recognition result, and provides the screen coordinates of the palm detection box; both the left and right hands can be recognized. Note, however, that when there are multiple hands within an image, only the recognition result and coordinates of the hand that has been most clearly captured, with the highest degree of confidence, are returned to your app. You can switch freely between the front and rear cameras during recognition.

Gesture recognition allows you to place virtual objects in the user's hand and trigger certain statuses based on changes to hand gestures, providing a wealth of fun interactions within your AR app; a rough sketch of how the per-frame result can be read follows below.
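
The following is a minimal sketch, not taken from the official sample, of how the gesture recognition result might be read each frame. It assumes the ARHand accessors getGestureType() and getGestureHandBox() and the ARTrackable tracking-state check; verify the exact names and signatures against the AR Engine SDK reference. handleGesture() is a hypothetical handler in your app.

Collection<ARHand> hands = mArSession.getAllTrackables(ARHand.class);
for (ARHand hand : hands) {
    // Skip hands that are not currently being tracked.
    if (hand.getTrackingState() != ARTrackable.TrackingState.TRACKING) {
        continue;
    }
    // The recognized gesture, returned as an integer gesture type ID.
    int gestureType = hand.getGestureType();
    // Screen coordinates of the palm detection box.
    float[] gestureBox = hand.getGestureHandBox();
    handleGesture(gestureType, gestureBox); // hypothetical app-side handler
}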

The hand skeleton tracking capability works by detecting and tracking the positions and postures of up to 21 hand joints in real time, and generating true-to-life hand skeleton models with attributes like fingertip endpoints and palm orientation, as well as the hand skeleton itself.

AR Engine detects the hand skeleton in a precise manner, allowing your app to superimpose virtual objects on the hand with a high degree of accuracy, including on the fingertips or palm. You can also perform more precise operations on virtual hands and objects, enriching your AR app with fun new experiences and interactions.
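
As an illustration of how those joint coordinates might be consumed, here is a minimal sketch. It assumes the array returned by getHandskeletonArray() is packed as x, y, z triplets, one per joint (up to 21 joints); placeVirtualObjectAt() is a hypothetical placement helper in your app.

float[] handSkeletons = hand.getHandskeletonArray();
int jointCount = handSkeletons.length / 3; // assumption: 3 coordinates per joint
for (int i = 0; i < jointCount; i++) {
    float x = handSkeletons[3 * i];
    float y = handSkeletons[3 * i + 1];
    float z = handSkeletons[3 * i + 2];
    // Anchor a virtual object (for example, a ring or a particle effect) on this joint.
    placeVirtualObjectAt(x, y, z); // hypothetical helper
}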

App Development

i. Check whether AR Engine has been installed on the current device. Your app can run properly only on devices with AR Engine installed. If it is not installed, you need to prompt the user to download and install AR Engine, for example, by redirecting the user to AppGallery. The sample code is as follows:

boolean isInstallArEngineApk = AREnginesApk.isAREngineApkReady(this);
if (!isInstallArEngineApk) {
    // ConnectAppMarketActivity.class is the activity for redirecting users to AppGallery.
    startActivity(new Intent(this, com.huawei.arengine.demos.common.ConnectAppMarketActivity.class));
    isRemindInstall = true;
}

ii. Initialize an AR scene. AR Engine supports the following five scenes: motion tracking (ARWorldTrackingConfig), face tracking (ARFaceTrackingConfig), hand recognition (ARHandTrackingConfig), human body tracking (ARBodyTrackingConfig), and image recognition (ARImageTrackingConfig).

Create an ARHandTrackingConfig object to initialize the hand recognition scene.

mArSession = new ARSession(context);
ARHandTrackingConfig config = new ARHandTrackingConfig(mArSession);

iii. After obtaining the ARHandTrackingConfig object, you can set the front or rear camera as follows.

config.setCameraLensFacing(ARConfigBase.CameraLensFacing.FRONT);

iv. After obtaining config, configure the ARSession with it, and start hand recognition.

mArSession.configure(config);
mArSession.resume();
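
For completeness, the session also needs to track the activity lifecycle so that the camera is released when the app goes into the background. Below is a minimal sketch, assuming standard Android lifecycle callbacks and the ARSession pause() and stop() methods; verify the method names against the SDK reference.

@Override
protected void onPause() {
    super.onPause();
    if (mArSession != null) {
        // Pause hand recognition and release the camera while in the background.
        mArSession.pause();
    }
}

@Override
protected void onDestroy() {
    super.onDestroy();
    if (mArSession != null) {
        // Stop the session and release its resources.
        mArSession.stop();
        mArSession = null;
    }
}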

v. Initialize the HandSkeletonLineDisplay class, which draws the hand skeleton based on the coordinates of the hand skeleton points.

class HandSkeletonLineDisplay implements HandRelatedDisplay {
    // Methods used in this class are as follows:
    // Initialization method.
    public void init() {
    }

    // Method for drawing the hand skeleton. When calling this method, you need to pass the ARHand objects to obtain data.
    public void onDrawFrame(Collection<ARHand> hands) {
        for (ARHand hand : hands) {
            // Call the getHandskeletonArray() method to obtain the coordinates of hand skeleton points.
            float[] handSkeletons = hand.getHandskeletonArray();

            // Pass handSkeletons to the method for updating data in real time.
            updateHandSkeletonLinesData(handSkeletons);
        }
    }

    // Method for updating the hand skeleton point connection data. Call this method when any frame is updated.
    public void updateHandSkeletonLinesData(float[] handSkeletons) {
        // Create and initialize the data stored in the buffer object.
        GLES20.glBufferData(..., mVboSize, ...);

        // Update the data in the buffer object.
        GLES20.glBufferSubData(..., mPointsNum, ...);
    }
}

vi. Initialize the HandRenderManager class, which is used to render the data obtained from AR Engine.

public class HandRenderManager implements GLSurfaceView.Renderer {

    // Set the ARSession object to obtain the latest data in the onDrawFrame method.
    public void setArSession(ARSession arSession) {
        mSession = arSession;
    }
}
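
The renderer typically also holds the display classes from step v and forwards each frame's hand data to them. The following wiring is an illustrative sketch, not the exact sample code; it assumes the HandRelatedDisplay interface shown above.

private final List<HandRelatedDisplay> mHandRelatedDisplays = new ArrayList<>();

// Register the display classes once, for example when the renderer is created.
public void initDisplays() {
    mHandRelatedDisplays.add(new HandSkeletonLineDisplay());
}

// Called from onDrawFrame() after the tracked hands are obtained.
private void drawHandDisplays(Collection<ARHand> hands) {
    for (HandRelatedDisplay display : mHandRelatedDisplays) {
        display.onDrawFrame(hands);
    }
}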

vii. Implement the onDrawFrame() method in the HandRenderManager class.

public void onDrawFrame(GL10 gl) {
    // In this method, call methods such as setCameraTextureName() and update() to update the calculation result of AR Engine.
    // Call this API when the latest data is obtained.
    mSession.setCameraTextureName(mTextureId); // mTextureId: the OpenGL ES texture that receives the camera preview
    ARFrame arFrame = mSession.update();
    ARCamera arCamera = arFrame.getCamera();
    // Obtain the tracking result returned during hand tracking.
    Collection<ARHand> hands = mSession.getAllTrackables(ARHand.class);
    // Pass each obtained hand object to the method that updates the gesture recognition information for processing.
    for (ARHand hand : hands) {
        updateMessageData(hand);
    }
}

viii. On the HandActivity page, set a renderer for the SurfaceView.

mSurfaceView.setRenderer(mHandRenderManager);
// Set the rendering mode.
mSurfaceView.setRenderMode(GLSurfaceView.RENDERMODE_CONTINUOUSLY);
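
The renderer setup above typically follows a standard GLSurfaceView configuration for OpenGL ES 2.0, matching the GLES20 calls used earlier. These are stock Android GLSurfaceView APIs rather than AR Engine ones; a minimal sketch:

// Keep the EGL context across pauses and request an OpenGL ES 2.0 context.
mSurfaceView.setPreserveEGLContextOnPause(true);
mSurfaceView.setEGLContextClientVersion(2);
// RGBA8888 color buffer with a 16-bit depth buffer.
mSurfaceView.setEGLConfigChooser(8, 8, 8, 8, 16, 0);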

Conclusion

Physical controls and gesture-based interactions come with unique advantages and disadvantages. For example, gestures are unable to provide the tactile feedback provided by keys, especially crucial for shooting games, in which pulling the trigger is an essential operation; but in simulation games and social networking, gesture-based interactions provide a high level of versatility.

Gestures are unable to replace physical controls in situations that require tactile feedback, and physical controls are unable to naturally reproduce the effects of hand movements and complex hand gestures, but there is no doubt that gestures will become indispensable to future smart device interactions.

Many somatosensory games, smart home appliances, and camera-dependent games now use AR to offer a diverse range of smart, convenient features. Common gestures include eye movements, pinches, taps, swipes, and shakes, which users can perform without any additional learning. These gestures are captured and identified by mobile devices, and used to implement specific functions for users. When developing an AR-based mobile app, you will first need to enable your app to identify these gestures. AR Engine helps by dramatically streamlining the development process. Integrate the SDK to equip your app with the capability to accurately identify common user gestures and trigger the corresponding operations. Try out the toolkit for yourself to explore a treasure trove of powerful, interesting AR features.