3Claws

Project title: 3Claws, the 3-finger mouse controller
Context: Projets de spécialité

Supervisors: François Bérard, Laurence Nigay, Céline Coutrix (tutor)

Students: Danh-Chieu-Phu Huynh, Sébastien Riou

Subject

Introduction

Most people use a touchpad or a mouse almost daily. These small devices connected to our computers are really useful, but one problem remains when they are used together with the keyboard: the homing time between these devices makes us waste a lot of time. The 3Claws project was created for the AHCI MOSIG 2018-2019 Students' Project by Danh-Chieu-Phu Huynh and Sébastien Riou. It is a new interaction technique for moving the cursor on a computer screen using only a camera.

Objectives

The main goal of this project is to remove as much as possible of the homing time between the keyboard and the mouse/touchpad. The idea is quite simple: instead of moving the hand from one device to another, everything now happens on the keyboard, thanks to a camera. We decided to use the camera to detect the locations of 3 fingers (thumb, index, and middle finger).

The objective of this project is to provide a user-friendly mouse-keyboard device that helps the user work faster, for example when writing documents or filling in forms, by avoiding the homing time of usual mouse devices. In today's society, most people have to use a computer for administrative tasks, writing documents, and so on. Saving even a quarter of a second on each action would be a real improvement, because most of these tasks are highly repetitive and boring.

State of the art

A lot of projects have already tried to create a mouse controlled by hand gestures using a camera. The work of Hojoon Park [4], itself inspired by the work of Chu-Feng Lien [2], implements a method for controlling the mouse using a real-time camera. Unfortunately, besides the fact that the final prototype was not stable because of poor hand detection and assumptions, this idea cannot be applied in our context, because it requires the user to remove his hand from the keyboard; the homing time would then differ from the one between the keyboard and a usual mouse, but it would still exist. In order to make our prototype more stable, we adopted the colored-tape idea of Abhik Banerjee, Abhirup Ghosh, Koustuvmoni Bharadwaj, and Hemanta Saikia [3]; even if lighting remains a problem for future improvement, it is a reasonable trade-off for testing our human-centered technology. We also reused the idea, presented by Kamran Niyazi, Vikram Kumar, Swapnil Mahe, and Swapnil Vyawahare [5], of using thresholds on the distances between fingers to trigger click events.

All these works use as their first reference the "Computer vision based mouse" study [1].

Creation of the prototype

Prototype installation: (A) top-left webcam, (B) 3Claws interaction area

Hardware

To create the prototype of this project, we chose a low-resolution webcam in order to keep image frame processing as fast as possible. The camera is placed on the top left of the screen and oriented to see the keyboard from above.

  • The top view is used to determine the x and y positions of the 3 fingers (thumb, index, middle finger) above the keyboard.
  • The left offset makes the clicks easier to detect, through changes in the distances between fingers.


We also used colored tape to make finger detection easier.


Tape used for finger detection


Software

In order to make the mouse move and click, we decided to code in Visual Studio using the C++ language. The detection of the colored tape on the fingers is done with the OpenCV library. The following pseudo-code presents the skeleton of our algorithm:

3Claws_Action() {
    if (ActualDistance_Thumb_Index <= Thumb_Index_Threshold &&
        ActualDistance_Index_MiddleFinger <= Index_MiddleFinger_Threshold &&
        ActualDistance_MiddleFinger_Thumb <= MiddleFinger_Thumb_Threshold)
    {
        // All 3 fingers pinched together: activate (or keep) 3Claws
        if (!is_3Claws_Activated())
        {
            set_3Claws_Activated(true);
        }

        // Move the cursor
    }
    else if (ActualDistance_Thumb_Index <= Thumb_Index_Threshold)
    {
        // Thumb and index still together, middle finger released
        if (is_3Claws_Activated())
        {
            // Emulate Left Click
        }
    }
    else if (ActualDistance_MiddleFinger_Thumb <= MiddleFinger_Thumb_Threshold)
    {
        // Thumb and middle finger still together, index released
        if (is_3Claws_Activated())
        {
            // Emulate Right Click
        }
    }
    else {
        set_3Claws_Activated(false);
    }
}
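
To make this concrete, here is a hedged sketch of how one camera frame could be processed with OpenCV: each piece of tape is isolated by an HSV color threshold and its centroid is taken as the finger position. The function names and HSV bounds are illustrative placeholders, not our exact calibrated values:

#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <cmath>

// Find the centroid of the tape of a given color in the frame.
// lower/upper are HSV bounds for that color (placeholder values that would
// need calibration for the actual tape and lighting).
cv::Point2f findTapeCentroid(const cv::Mat& frameBgr,
                             const cv::Scalar& lower, const cv::Scalar& upper)
{
    cv::Mat hsv, mask;
    cv::cvtColor(frameBgr, hsv, cv::COLOR_BGR2HSV);
    cv::inRange(hsv, lower, upper, mask);          // keep only the tape color

    cv::Moments m = cv::moments(mask, true);       // centroid of the binary mask
    if (m.m00 == 0) return {-1, -1};               // tape not visible this frame
    return {static_cast<float>(m.m10 / m.m00),
            static_cast<float>(m.m01 / m.m00)};
}

// Euclidean distance between two detected fingers, to be compared against
// the thresholds used in the skeleton above.
float fingerDistance(const cv::Point2f& a, const cv::Point2f& b)
{
    return std::hypot(a.x - b.x, a.y - b.y);
}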

Implemented features

Enable/Disable 3Claws

In order to activate the 3Claws option, we used a movement that feels « natural » for humans but is uncommon when typing on a keyboard, so that it does not disturb the user but is still easy to recognize. The 3 tracked fingers activate 3Claws when they are close enough, i.e., when each pairwise distance drops below a given threshold, as you can see here:

3Claws enabled

3Claws is then disabled as soon as one of these fingers goes beyond this threshold and becomes too far from another one:

3Claws disabled

Move Cursor with 3Claws

When 3Claws is activated, we can move the cursor simply by keeping the fingers in the same position and moving the whole hand. The translation from the movement of the hand on the keyboard to the screen is done in a direct (absolute) space, because we think it should improve time performance, and also because, as the previously quoted papers showed, the detection signal can easily be lost, which would be disturbing in another way. If the detection were perfect, a relative space would also be worth testing, even if it might require deactivating and reactivating 3Claws several times (as with a touchpad), because it could be more intuitive and could avoid disturbing large movements.


Activation
Move after activation
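
As an illustration of this direct mapping, here is a minimal C++ sketch for Windows. The 640x480 frame size is an assumption; in the real prototype the camera resolution would be queried at runtime:

#include <windows.h>
#include <opencv2/core.hpp>

// Direct (absolute) mapping: each camera-frame position corresponds to a
// fixed screen position. A sketch, assuming a 640x480 camera frame.
void moveCursorDirect(const cv::Point2f& fingerCentroid)
{
    const float frameWidth = 640.0f, frameHeight = 480.0f;
    const int screenWidth = GetSystemMetrics(SM_CXSCREEN);
    const int screenHeight = GetSystemMetrics(SM_CYSCREEN);

    // Scale the detected finger position to screen coordinates.
    int x = static_cast<int>(fingerCentroid.x / frameWidth * screenWidth);
    int y = static_cast<int>(fingerCentroid.y / frameHeight * screenHeight);

    SetCursorPos(x, y); // Win32 call that places the cursor absolutely
}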

Click with 3Claws

For the click, we tried to stick to the mouse and touchpad metaphor, to stay user-friendly and easy to understand. In order to detect clicking events with 3Claws, we compute the increase of the distance between fingers. For a right click, the basic idea is to keep the thumb and middle finger together and to release the index finger.

Activated mode | Right click
Top view: 3Claws Active.JPG | 3Claws Right.JPG
Side view: 3Claws ActiveSide.JPG | 3Claws RightSide.JPG

On the contrary, if we want to use the left click, we need to keep the index and the thumb together and simply release the middle finger.

Activated mode | Left click
Top view: 3Claws Active.JPG | 3Claws Left.JPG
Side view: 3Claws ActiveSide.JPG | 3Claws LeftSide.JPG

The idea behind placing the camera on the top left side was to be sure that the click was really intended by the user. With a top-only position, it would have been hard to tell the difference between a click and fast typing.

User test cases

Assumptions

Thanks to this project, the homing time between typing something on the keyboard and interacting with the cursor on the screen should be significantly reduced. We are aware that our prototype is not perfect yet, so the computations between finger recognition and cursor actions can introduce errors in the final results of our tests.

We also expect the click to be a bit hard to assimilate for the user. With a mouse or touchpad, we are used to putting a finger down to click; with 3Claws, we have to lift it up. Unfortunately, a downward gesture is not possible from our base position, so we hope it will not disturb the user.

In our opinion, a perfect 3Claws implementation should clearly beat the touchpad and mouse when the task involves a homing time, but it should be less efficient as a substitute for tasks that are done with the cursor only. In fact, the gesture required by 3Claws can be tiring compared to a mouse, which is why the hand would have to return to a normal position quite often.

We think the final results should be better than touchpads, which are sometimes hard to use when a large distance separates the cursor from the target, but the fact that actual mice are really fast and accurate makes us think the final results will hardly beat mouse ones.

User test population

For our tests, we relied on people who use a computer daily. Among the 6 participants, 5 were master's students in computer science and 1 was a master's student in history. We decided to stop after 6 user test cases because the results were not relevant enough, due to the instability of our solution.

Presentation of the tests

Test window

All the tests are based on the same basic window; only the way we use this window changes depending on the test we want to run. The window of this app contains 3 elements:

  • a validation button
  • an empty text field
  • a text label giving the word that has to be entered.

Each of the following tests has to be done with the mouse, the touchpad, and finally 3Claws, and the window moves 5 times (= 5 validations in the charts).

The order of the tests and of the devices varies between participants, to make sure this order has no impact on the results.


Small example of test 1

Test 1: Random Input and Validation

The first test is designed to evaluate our project in "real" conditions. The basic idea is to ask the user to go to the window, enter the requested text in the input field, and finally validate by clicking the button. The window then moves to another place and the user has to repeat the operation. The goal is to compare the action time between the 3 tested devices. This first test is the one most similar to our target use case of filling forms or writing documents.

Test 2: Fixed Input and Validation

In test 2, to remove the thinking-time errors that an unexpected word could cause, the same word has to be entered in the input field during the whole test. Before testing, the user types it a few times, to be sure that he/she learns the word before the test part and not during it.

Test 3: Validation only

The third test consists only in clicking the validate button, without typing any text. It gives the opportunity to test the system without ever disabling it, so that the 3Claws system stays activated from the beginning to the end of the test. This test is run to check our assumption that "cursor-only" actions should be less efficient with 3Claws.

Other Test suggestions

We thought about 2 other kinds of tests that we did not have time to implement.

  • The first one would be to go from one window to another without clicking (to test the move only).
  • The other one would be to test the 2 different clicks only (for example, using the right or left click depending on the color of the button).

Also, as mentioned at the beginning, it would be interesting to compare the mouse and touchpad not only with 3Claws in direct space but also with 3Claws in relative space.

Test results

In all the following results, the error bars show the standard deviation of our data. The x-axis displays the 5 steps of the test (the 5 positions of the window). The y-axis shows the time in ms. Each bar displays the time from the beginning of the test to the corresponding validation click.

Mouse

3Claws MouseTest01.jpg
3Claws MouseTest02.jpg
3Claws MouseTest03.jpg

Touchpad

3Claws TouchPadTest01.jpg
3Claws TouchPadTest02.jpg
3Claws TouchPadTest03.jpg

3Claws

Due to the instability of our solution, test 1 took too much time, and it was really hard for most of the users to finish it, because the camera loses the fingers, or because it could be a bit painful for some users to use it for too long without relaxing the hand by resting it on the keyboard. We therefore decided not to make them take the 2nd test, because the thinking time needed to understand the word to write would have absolutely no impact between tests 1 and 2 for our solution.

3Claws 3ClawsTest1.jpg
3Claws 3ClawsTest3.jpg

Comparison of the 3 Devices

Test 1: Mouse and Touchpad comparison
Test 2: Mouse and Touchpad comparison
Test 3: Mouse and Touchpad comparison

Test 1: Mouse, Touchpad and 3Claws comparison
Test 3: Mouse, Touchpad and 3Claws comparison

Analysis of the results

Global Analysis

Test 1 shows that for basic keyboard use, the touchpad is slightly better than the mouse. To confirm that the touchpad is more useful when used in addition to the keyboard, we can look at the Test 2 results. The time difference between touchpad and mouse is not that significant, but the mouse tends to become more efficient, because the user can focus on the "moving" part of the action, since the word asked by the window is always the same. We can then analyze it in combination with Test 3, which gives more detailed information.

Regarding Test 3, we now clearly see that for tasks that require interacting with the cursor only (that is to say, no keyboard interaction or thinking at all), the mouse is obviously much faster.

What we can assume from the Test 1 and Test 2 results is that:

  • The homing time between the keyboard and the touchpad is smaller than the one between the keyboard and the mouse.
  • The time needed to acquire the mouse is longer than the time needed to acquire the touchpad, probably because the touchpad does not move and does not have to be grabbed.

Thanks to these small time reductions, the touchpad stays competitive with the mouse, but only when it is used in association with the keyboard.

Unfortunately, because of the error rate due to the instability of our solution (mostly on the clicking part), the 3Claws test results are hard to compare with the mouse and touchpad results.

What we can say is that most of the tested users said that, with better accuracy and stabilization, it could be really useful for writing documents, for example, or for people using a keyboard a lot:

  • 1 tester said that she would not use it because she is used to a mouse; she does not like using a touchpad either, but she told us that for people using a touchpad it could be a good improvement.
  • The tester from the history master's said that it would be really useful for people who have to write a thesis or big documents, as she does. She also said that a lot of people tend to be lazy, and this method could be perfect for them.
  • Another tester said that with better accuracy the system could be useful, but only if the hand can relax after some movements, because it can be a bit painful.

The instability comes from the many saccades in the pointer movement when moving with 3Claws. Another problem, reported by some users, was that it was easy to lose the pointer because of these unexpected movements, and that it was hard to control the click, because clicking made the pointer move a bit.

The camera-based solution was probably not the best one to implement. We therefore thought about how to improve it; you can find our ideas for improvement in the rest of this study.

Perfect implementation assumption

We tried to estimate what the results should look like if our proposition were perfect. Here is the reasoning:

Let's say that the time for tests 1 and 2 is composed of these steps:

  • Move to the window
  • Click (on the text field)
  • Homing time
  • Tap text
  • Homing time
  • Click (on the ok button)

The average time of these tests is around 4100 ms, so we have:

  • time1_2 = move + 2×click + 2×homing + tap text

For the click-only time (test 3), the steps are:

  • Move to the window
  • Click (on the ok button)

and it takes around 1200 ms, so:

  • time3 = move + click

Then we have:

  • time1_2 - time3 = click + 2×homing + tap text = 4100 - 1200 = 2900 ms

Assuming one click takes around 200 ms, we have:

  • 2×homing + tap text = 2700 ms

For the words used here, typing the text takes around 1500 ms, so:

  • 2×homing = 1200 ms (the total homing time per action)

This estimate looks plausible. We can consider that this total homing time contains a target acquisition time (around 100 ms to 300 ms), so let's say that the "real" total homing time is around 900 ms per action; from this we can subtract the time needed to activate 3Claws (around 400 ms). It means that for each action we could save around 500 ms.
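
Putting the whole estimation together (all numbers are the rough averages quoted above):

\begin{aligned}
t_{1,2} &= t_{move} + 2\,t_{click} + 2\,t_{homing} + t_{tap} \approx 4100~\mathrm{ms}\\
t_{3} &= t_{move} + t_{click} \approx 1200~\mathrm{ms}\\
2\,t_{homing} &= (t_{1,2} - t_{3}) - t_{click} - t_{tap} \approx 2900 - 200 - 1500 = 1200~\mathrm{ms}\\
t_{saved} &\approx 1200 - 300 - 400 = 500~\mathrm{ms~per~action}
\end{aligned}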

With these assumptions, and taking the standard deviation and errors into account, a "perfect" implementation of 3Claws should give something like this:

Assumption of perfect test results

In this case, we can see that this solution could greatly improve the efficiency of people doing similar actions repetitively. Each action could be 0.5 s faster, which would already be a huge improvement.

Conclusion

As explained before, our study does not really allow us to validate or refute our hypothesis, because the current version of our prototype is not stable enough.

With the data we have, we can simply deduce that, for the actual prototype, the instability and inaccuracy of our solution make it far less efficient than the mouse or touchpad (6 times less efficient, actually).

During our project, we had lots of ideas, from ourselves and from the testers, about how to improve this prototype, but they generally require changing the whole software and hardware parts. We created, for example, the LED glove to avoid the tape color changing with the ambient light, but it was finally not that much more stable, because the LEDs were too small. Using infrared LEDs instead of RGB LEDs could be a better idea.

Thanks to this study, we noticed that the touchpad is already an improvement over the mouse for text-editing tasks (writing documents, filling forms, coding, etc.). The facts that the touchpad sits in a fixed position known by the user, and that it is nearer to the keyboard, make it more effective for these kinds of tasks. The data and the debriefing with the testers make us believe that our idea could be an improvement over the touchpad (and therefore over the mouse) for the same kind of tasks, if implemented more accurately. In fact, regarding our main idea (fusing the keyboard and the cursor device to reduce the homing time), we can suppose that the results should be even better than the touchpad with a good implementation, because it puts the pointing device nearer to the keyboard (on the keyboard, actually) and it is available at any moment, so the user always knows its position (because it is in his/her hand).

Using a camera was not a good idea for our project: it was not stable enough, as some studies presented in the state-of-the-art section already said, even for a prototype. To obtain interesting results, we should find a better way to build our solution, and maybe other gestures if it is still painful for some users.

How to enhance our solution

Hardware Improvement

More accurate camera

The prototype is based on a low-resolution camera, so finger and movement detection could be more accurate and precise with a better resolution. For now, we only use tape to detect the different fingers, and the color of the tape we want to detect changes depending on the light, for example. The best improvement would be to completely avoid the use of tape and to detect the fingers automatically, but this would also require more computation.

Night vision

Another problem is that it is hard to detect the fingers under low light. A good improvement, for example, would be to use the same technology as the Kinect 2.0 depth sensors and then compute the position of each finger from the depth data. Computation would certainly be more complex, but the data should be acquired faster and we should not have the saccade problem.

LED glove used for finger detection

We also had the idea of adding a white light to the webcam; it would improve finger detection by applying the same ambient light to all the fingers, so that the color of the tape stays fairly homogeneous, but it would still not be perfect.

Another way would be to create a hand device with a colored LED on each finger and a filter on the camera, so that the color is always the same under any kind of light, remains visible at night, and can go through the filter. We could then detect the fingers without taking the rest of the environment into account, thanks to the filter. You can find the picture of the glove right here. We tried this solution, but the light was not bright enough and the LEDs were too small. Using an infrared camera and infrared LEDs instead would make it even easier to detect the fingers, without stabilization problems. In fact, by adjusting the blinking frequency of each of the 3 LEDs, we can differentiate them easily, and there would be no problem with varying ambient light.

Our last idea is to use a DataGlove to record the different events, but we would have to use it in addition to one of the previous solutions to be able to detect the hand movement.

Software improvements

Drag n Drop (or Selection) features

The drag n' drop (or selection) could be implemented easily by changing the "click" interaction into a "press & release" interaction instead. The general idea would be:

  • Press: put the finger up
  • Release: put it down
  • Click: put it up then down

The problem is that it would require totally different kinds of tests, and it is not the main goal of our improvement for now.
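
As a minimal sketch of how press & release could be emulated on Windows with the Win32 SendInput API (the mapping from finger state to these calls is the part that would remain to be built):

#include <windows.h>

// Send a left-button press or release event.
void emulateLeftButton(bool pressed)
{
    INPUT input = {};
    input.type = INPUT_MOUSE;
    input.mi.dwFlags = pressed ? MOUSEEVENTF_LEFTDOWN : MOUSEEVENTF_LEFTUP;
    SendInput(1, &input, sizeof(INPUT));
}

// With this split, a click is a press followed by a release, and
// drag-and-drop keeps the button pressed while the 3Claws cursor moves:
//   finger lifted -> emulateLeftButton(true);   // press
//   finger back   -> emulateLeftButton(false);  // release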

Slider/Wheel Emulator feature

Something that could be useful when filling forms or writing documents would be a slider/wheel feature. Unfortunately, it would also take too much time to test this feature in this project, so we decided to focus on the move and click parts, which were not so stable. The idea we proposed was to keep the index and middle finger together and to move them away from the thumb. It can be compared to a smart gesture present on some touchpads, which allows scrolling a window with 2 fingers.

Here is an idea of how to extend the skeleton in order to implement this feature:

3Claws_Action() {
    if (ActualDistance_Thumb_Index <= Thumb_Index_Threshold &&
        ActualDistance_Index_MiddleFinger <= Index_MiddleFinger_Threshold &&
        ActualDistance_MiddleFinger_Thumb <= MiddleFinger_Thumb_Threshold)
    {
        // All 3 fingers pinched together: activate (or keep) 3Claws
        if (!is_3Claws_Activated())
        {
            set_3Claws_Activated(true);
        }

        // Move the cursor
    }
    // Wheel Emulator tests
    else if (ActualDistance_Index_MiddleFinger <= Index_MiddleFinger_Threshold)
    {
        // Index and middle finger together, away from the thumb
        if (is_3Claws_Activated())
        {
            // Emulate Wheel
        }
    }
    // End of Wheel Emulator tests
    else if (ActualDistance_Thumb_Index <= Thumb_Index_Threshold)
    {
        // Thumb and index still together, middle finger released
        if (is_3Claws_Activated())
        {
            // Emulate Left Click
        }
    }
    else if (ActualDistance_MiddleFinger_Thumb <= MiddleFinger_Thumb_Threshold)
    {
        // Thumb and middle finger still together, index released
        if (is_3Claws_Activated())
        {
            // Emulate Right Click
        }
    }
    else {
        set_3Claws_Activated(false);
    }
}
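
On Windows, the wheel emulation itself could look like this hedged sketch, where the scroll amount is derived from the vertical movement of the joined index and middle fingers. The deltaY input and the /20 scale factor are assumptions that would have to be tuned against the tracker:

#include <windows.h>

// Emulate a mouse wheel event. deltaY is the vertical finger movement in
// pixels since the last frame (assumed to come from the finger tracker);
// WHEEL_DELTA (120) is one standard wheel notch.
void emulateWheel(int deltaY)
{
    INPUT input = {};
    input.type = INPUT_MOUSE;
    input.mi.dwFlags = MOUSEEVENTF_WHEEL;
    // Negative because moving the fingers down should scroll the page down.
    input.mi.mouseData = static_cast<DWORD>(-deltaY * WHEEL_DELTA / 20);
    SendInput(1, &input, sizeof(INPUT));
}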

Improvement of the mapping function

We finally realized that the direct space mapping was a bit confusing, because the mouse pointer can be lost quite easily. A way to improve this feature would be to add a fast animation from the previous position of the pointer to the new one, or, more simply, to use a well-adapted relative space.
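
A simple way to obtain that animation is to interpolate the cursor towards the newly detected target instead of jumping to it. A sketch, where the smoothing factor 0.3 is an assumed value to be tuned:

#include <windows.h>

// Smooth the cursor movement by moving only a fraction of the way towards
// the target each frame (exponential smoothing). Called once per camera frame.
void moveCursorSmoothed(int targetX, int targetY)
{
    POINT current;
    GetCursorPos(&current);

    // 0.3 is an assumed factor: higher is snappier, lower is smoother.
    const float alpha = 0.3f;
    int x = current.x + static_cast<int>(alpha * (targetX - current.x));
    int y = current.y + static_cast<int>(alpha * (targetY - current.y));

    SetCursorPos(x, y);
}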

Another problem is that the camera image frames are mapped directly to the screen, even though there is a perspective distortion. An idea would be to compute the rotation and translation of the keyboard plane in the camera space, in order to map it to the screen in a better way.

Initial frame without perspective modification
Same frame with perspective modification
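
With OpenCV, this correction can be sketched as a homography between the four keyboard corners as seen by the camera and the screen rectangle. The corner coordinates below are hypothetical values that would have to be calibrated once for the actual setup:

#include <opencv2/imgproc.hpp>
#include <vector>

// Map a finger position from the distorted camera view of the keyboard to
// screen coordinates, using a homography.
cv::Point2f cameraToScreen(const cv::Point2f& finger)
{
    // Keyboard corners in the camera frame (made-up calibration values).
    std::vector<cv::Point2f> keyboardInCamera = {
        {80, 120}, {560, 100}, {620, 400}, {40, 420}
    };
    // Corresponding screen corners, assuming a 1920x1080 display.
    std::vector<cv::Point2f> screenCorners = {
        {0, 0}, {1920, 0}, {1920, 1080}, {0, 1080}
    };

    cv::Mat H = cv::getPerspectiveTransform(keyboardInCamera, screenCorners);

    std::vector<cv::Point2f> in = {finger}, out;
    cv::perspectiveTransform(in, out, H);
    return out[0];
}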


References

[1] A. Erdem, E. Yardimci, V. Atalay, A. E. Cetin: "Computer vision based mouse", Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2002.

[2] Chu-Feng Lien: "Portable Vision-Based HCI – A Real-time Hand Mouse System on Handheld Devices", Computer Science and Information Engineering Department, National Taiwan University.

[3] Abhik Banerjee, Abhirup Ghosh, Koustuvmoni Bharadwaj, Hemanta Saikia: "Mouse Control using a Web Camera based on Colour Detection", Department of Electronics & Communication Engineering, Sikkim Manipal Institute of Technology, East Sikkim, India.

[4] Hojoon Park: "A Method for Controlling Mouse Movement using a Real-Time Camera", Department of Computer Science, Brown University, Providence, RI, USA.

[5] Kamran Niyazi, Vikram Kumar, Swapnil Mahe, Swapnil Vyawahare: "Mouse Simulation Using Two Coloured Tapes", Department of Computer Engineering, AISSMS COE, University of Pune, India.