Rear-screen and kinesthetic vision 3D manipulator
© The Author(s). 2017
Received: 16 June 2016
Accepted: 22 May 2017
Published: 15 June 2017
Effective manipulation, comprehension, and control of 3D objects on computers are long-standing problems. They involve a display aspect, a control aspect, and the spatial coupling between control input and visual output, which remains a debated issue. Most existing control interfaces are located in front of the display. This requires users to imagine that manipulated objects, which appear behind the display, exist in front of it.
In this research, a Rear-Screen and Kinesthetic Vision 3D Manipulator is proposed for manipulating models on laptops. In contrast to the front-screen setup of a motion controller, it tracks the user’s hand motion behind the screen, coupling the actual interactive space with the perceived visual space. In addition, Kinesthetic Vision tracks the position of the user’s head and renders objects from a dynamic perspective that follows the user’s line of sight, providing depth perception through the “motion parallax” effect.
To evaluate the performance of “rear-screen interaction” and Kinesthetic Vision, an experiment was conducted to compare the front-screen setup, the rear-screen setup with Kinesthetic Vision, and the rear-screen setup without it. In each trial, subjects were asked to grasp a cube and move it from a fixed starting location to a target location. There were 20 designated target locations scattered in the interactive space. The moving time and distance were recorded during the experiments. In each setup, subjects completed five blocks of 20 trials each. A repeated measures ANOVA shows significant differences in moving efficiency among the setups.
The Rear-Screen and Kinesthetic Vision setup gives rise to better performance, especially in the depth direction of movements, where path length is reduced by 24%.
3D computer graphics technology allows people to display 3D models on computers. As the technology has advanced, it has become widely used in various industries, including animation, gaming, and computer-aided design. However, the limitations of display and control devices still introduce difficulties in comprehending and interacting with 3D models. Further, the spatial coupling between the perceived visual location of a model and the location where it is manipulated is still a debated issue.
The first issue is the two-dimensional limitation of display devices. Although models are three-dimensional, it still takes effort to present them stereoscopically. To make models “pop out” of screens, 3D viewers commonly present two offset images separately to each eye, which requires extra head-worn devices (Eckmann 1990). Another way to enhance stereoscopic perception is to use the “motion parallax” effect, the relative displacement of viewed models as the observer’s position changes (Rogers and Graham 1979). Alternatively, the Projection Augmented Model uses a physical model onto which computer images are projected. This method presents 3D models realistically. However, it requires a pre-defined geometric shape and high precision in object tracking and projection (Raskar et al. 1998).
The second issue is the limitation of control devices. The dominant 2D input devices, which allow fine control of two-dimensional motion, are poorly suited to 3D manipulation because of their limited number of degrees-of-freedom (DoF). As a result, the use of a mouse in conjunction with virtual controllers for 3D manipulation has been discussed and evaluated in several previous studies (Chen et al. 1988) (Khan et al. 2008). To overcome the limited DoF, controllers with three or more DoF have also been developed to enhance usability in 3D interactions (Hand 1997).
The last issue is the coupling between the control input and visual output spaces. Humans use visual cues from the eyes and proprioception from the hands to guide hand movements when reaching for and grasping objects; this is called eye-hand coordination (Johansson et al. 2001). Good eye-hand coordination can reduce the mental burden during manipulation. However, most motion controllers decouple the perceived visual space of models, which is behind the display, from the interactive space, which is in front of the display (referred to as “front-screen” in the following sections). Some consider that, although this arrangement follows the usual way of using a computer, it may disrupt eye-hand coordination. Users’ brains need to make a semi-permanent adjustment to the spatial coupling between these spaces (Groen and Werkhoven 1998). This adaptation leads to negative after-effects on eye-hand coordination (Bedford 1989). To discuss these issues, related work on spatial coupling problems is reviewed in the next section.
In previous research, there have been two kinds of interaction methods to solve the problem of spatial coupling.
Head-mounted displays (HMDs) immerse users in the virtual environment. As a result, all visual perception of space is virtual, and the coupling problem no longer exists. HMDs are widely used in virtual environment navigation. Newton et al. proposed the Situation Engine, which combines a simulated environment with an HMD and gestural control to provide a hyper-immersive construction experience (Newton et al. 2013). However, HMDs are relatively expensive and are not appropriate for extended use, because they can cause dizziness and require coordination between the virtual space and the real input space (Hall 1997). They also focus on large-scale 3D environment exploration rather than the manipulation of models.
Another method is to “partially” bring users into a virtual environment. The method combines Augmented Reality (AR) technologies, which fuse virtuality and reality, with the rear-screen setup, which lets users visually enter the fused, interactive environment by placing it behind the display. Kleindienst invented a viewing system for object manipulation in which the manipulation space and the real and virtual spaces coincide in the viewing device (Kleindienst 2009). Holodesk combines an optically transparent display with a Kinect camera for sensing hand motion, letting users interact with 3D graphics directly (Hilliges et al. 2012). Using the same concept, SpaceTop is a desktop workspace with a transparent OLED display that makes it easy for users to interact with floating elements behind the screen (Lee et al. 2013). The rear-screen idea has also been applied to touch-screen devices to prevent fat-finger problems (Baudisch and Chu 2009).
In this contribution, we emulate a “rear-screen” setup using a laptop and a motion controller, which requires no special devices and can be set up simply. We then compare “rear-screen” and “front-screen” tasks to test whether the spatial coupling makes the rear-screen setup superior in terms of efficiency and fatigue.
In this research, we propose the rear-screen and kinesthetic vision 3D manipulator with a simple physical setup. Users are able to manipulate 3D models behind computer screens. With the proposed method, “Real Space Virtual Reality” makes the perceived virtual space and the real interactive space coincide. We introduce the details of the research in this section, which is divided into the input and output modules: Rear-Screen Interaction and Kinesthetic Vision.
(xV, yV, zV) is the position of the virtual eyes and (xA, yA, zA) is the position of the actual eyes. The coordinate origin is at the center of the screen and of the near plane. WV is the width of the near plane, and WA is the width of the screen view. HV is the height of the near plane, and HA is the height of the screen view. DV is the distance from the origin of the virtual eye coordinates to the near plane center, and DA is the distance from the origin of the actual eye coordinates to the screen center.
Rear-Screen Interaction and Kinesthetic Vision are further introduced in this section in three parts: physical setup, software setup, and demonstration.
The game environment is built with the Unity game engine and developed in C#. OpenCV libraries are used to implement the marker-tracking function and are integrated with the Leap Motion API.
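As an illustration of how these quantities could relate, the following is a minimal sketch that assumes a simple proportional mapping: each coordinate of the actual eye position is scaled by the ratio between the virtual near-plane dimension and the corresponding physical screen dimension. The mapping itself, the `ViewGeometry` class, and the function name are assumptions made for this example and are not taken from the system described here, which is implemented in Unity and C#.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViewGeometry:
    """Physical screen view and virtual near plane (hypothetical container)."""
    screen_width: float     # WA: width of the screen view
    screen_height: float    # HA: height of the screen view
    screen_distance: float  # DA: actual eye origin to screen center
    near_width: float       # WV: width of the near plane
    near_height: float      # HV: height of the near plane
    near_distance: float    # DV: virtual eye origin to near plane center


def actual_to_virtual_eye(actual_eye, geom):
    """Map (xA, yA, zA) to (xV, yV, zV) by per-axis scaling.

    Assumption: the virtual eye offset is proportional to the actual eye
    offset, with ratios WV/WA, HV/HA and DV/DA on x, y and z respectively.
    """
    x_a, y_a, z_a = actual_eye
    return (
        x_a * geom.near_width / geom.screen_width,
        y_a * geom.near_height / geom.screen_height,
        z_a * geom.near_distance / geom.screen_distance,
    )
```

Under this assumed mapping, an eye located at the right edge of the screen view maps to a virtual eye at the right edge of the near plane, so the rendered perspective shifts in step with head movement and produces the motion parallax cue.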
This section will introduce the experimental method for performance evaluation, including experiment procedures, participants, and performance measurement methods.
We set three conditions to compare the performance of our rear-screen setup with that of the standard setup: Rear-Screen Interaction with Kinesthetic Vision (RIK), Rear-Screen Interaction (RI), and Front-Screen Interaction (FI) (Fig. 6). By comparing RIK and RI, we attempt to ascertain whether the motion parallax effect is effective for depth perception. Likewise, RI is compared with FI to test whether the rear-screen setup is superior to the front-screen setup in eye-hand coordination.
We recruited 12 participants for the experiments. All participants were male and ranged from 22 to 25 years of age. The participants were right-handed and had normal vision. They were also required to have at least 6 months’ experience using software with 3D model manipulation functions, such as SketchUp, Revit, and Unity3D.
Phase I: Introduction and Preliminary Practice
First, participants are given an overview of the experiment, including the physical setup and the software setup. Then, they are required to practice the grab, release, and move actions. The most important aim of this phase is to familiarize the user with the setup and control device, reducing the influence of subjective factors.
Phase II: Formal Test: Moving Objects
In each trial, users are asked to grab a green cube (starting position) and move it to a red cube (target position) (Fig. 7). The interaction depth is about 60 cm. Starting and target positions are paired beforehand, so that randomizing the trial order does not introduce within-condition variance. Five yellow cubes appear in random positions to prevent users from relying on short-term memory of positions.
Each user has to complete three sets of tasks, one for each of the three aforementioned conditions. Each set of tasks is divided into 5 blocks, and each block contains 20 trials, giving 100 trials per condition and 300 trials per user.
Phase III: Formal Test: NASA-TLX
Finally, participants complete the NASA Task Load Index (NASA-TLX) (Hart and Staveland 1988), together with the fatigue scale and the overall scale, after each set.
Each condition takes about 30 min, including rest time between blocks to prevent fatigue. After the quantitative test, we interview users about their impressions to obtain qualitative results.
Speed: The task completion time is divided into two periods: the object acquisition time and the object moving time. The acquisition time starts once the virtual hand is visualized and ends once the user grabs the object. The moving time starts once the user grabs the object and ends once the object reaches the target location and the space bar is subsequently pressed.
Accuracy: When the user presses the space bar, the distance between the centers of the object and the target is measured.
Ease of learning: To evaluate whether the user improves with practice, we compare performance across blocks of trials by measuring the slope of the regression line fitted over the blocks.
Fatigue: We rate fatigue on a scale modeled on the NASA-TLX scales.
Coordination: The ratio between actual trajectory length and the most efficient trajectory length is measured. In our design, the most efficient trajectory is the straight-line distance between two objects. The lengths in the x, y, and z-directions are also recorded.
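The coordination and ease-of-learning measures can be sketched in a few lines. This is a minimal illustration, assuming the hand trajectory is logged as a sequence of (x, y, z) samples and that performance is summarized as one mean value per block; the function names and the ordinary least-squares fit are choices made for this example, not the authors' analysis code.

```python
import math


def path_length(points):
    """Sum of straight segment lengths along a sampled 3D trajectory."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def axis_lengths(points):
    """Distance travelled along x, y and z separately (sum of absolute steps)."""
    return tuple(
        sum(abs(q[i] - p[i]) for p, q in zip(points, points[1:]))
        for i in range(3)
    )


def coordination_ratio(points, start, target):
    """Actual path length divided by the straight-line start-to-target distance."""
    ideal = math.dist(start, target)
    if ideal == 0:
        raise ValueError("start and target must differ")
    return path_length(points) / ideal


def learning_slope(block_means):
    """Least-squares slope of a per-block mean against block index 1..n."""
    n = len(block_means)
    if n < 2:
        raise ValueError("need at least two blocks")
    xs = range(1, n + 1)
    x_bar = sum(xs) / n
    y_bar = sum(block_means) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, block_means))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx
```

A ratio of 1 corresponds to a perfectly straight movement, and larger values indicate a less efficient trajectory. For a time-based measure, a negative slope across the five blocks would indicate that the user gets faster with practice.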
Significance of usability aspects by repeated measures ANOVA
Ease of Learning
Figure 8. Grab time for the Front-Screen Interaction (FI), Rear-Screen Interaction (RI), and Rear-Screen Interaction with Kinesthetic Vision (RIK) variants of the rear-screen and kinesthetic vision 3D manipulator. Error bars represent +/- SEM (standard error of the mean).
Figure 9a and b. Coordination ratios across the Front-Screen Interaction (FI), Rear-Screen Interaction (RI), and Rear-Screen Interaction with Kinesthetic Vision (RIK) variants of the rear-screen and kinesthetic vision 3D manipulator: (a) coordination in all directions; (b) coordination in the z-direction. Error bars represent +/- SEM.
Surprisingly, the object moving time shows no significant difference among the three conditions (p > 0.01). We observed that movement speed varies according to personal habits.
From users’ feedback in the interviews, we learned that users tend to be distracted by the virtual and actual hands in the FI setup. As a result, they find it difficult to explore in the depth direction, which leads to less efficient trajectories.
Design Review (DR) is a critical control point in the product development process, at which a design is evaluated against its requirements. By combining CAD and VR techniques, digital or virtual prototyping allows decisions to be brought forward to the early review phase, saving time and cost (Bullinger et al. 2000). Reviewing digital models requires several rounds of 3D manipulation in order to comprehend a design in sufficient detail. As a result, depth perception and eye-hand coordination are crucial for efficient exploration of a 3D virtual environment.
Eye-hand coordination, i.e. visuomotor coordination, plays an important role in playing video or computer games (Spence and Feng 2010). Players must respond accurately and quickly to visual information. Coupling the virtual and real spaces reduces the extra effort required for spatial adaptation, enhancing the user experience in gaming.
Because our design builds on users’ eye-hand coordination ability, the setup has the potential to be developed into training or testing tools. In previous research, a VR-based surgical simulator was validated as being able to differentiate between levels of eye-hand coordination skill (Yamaguchi et al. 2007).
We propose a rear-screen and kinesthetic vision 3D manipulator, a novel 3D object manipulation method with a simple setup. Users can interact with a virtual object directly behind the screen. The components of the rear-screen and kinesthetic vision 3D manipulator are described and implemented in this research, and experiments are conducted to evaluate the design.
The experimental results show a significant difference in coordination in the z-direction between FI, RI, and RIK. Objects whose trajectory lies in the depth direction are therefore manipulated more efficiently with the rear-screen and kinesthetic vision 3D manipulator than with the standard setup. In general terms, the kinesthetic sense improves users’ depth perception. The findings show the possibility and value of installing sensors for use in the design review and gaming domains.
No funding to declare.
HWY, THW and CCY did the literature review and drafted the manuscript together. CCY developed the system, implemented and analyzed the validation experiment. SCK was the adviser and proof-read the article. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.