GuitAR Learning Assistant in-Augmented Reality

Project title: GuitAR - Learning Assistant in-Augmented Reality
Context: Projets de spécialité
Main page: GuitAR Learning Assistant in-Augmented Reality

Supervisor: François Bérard




Guitar music exists in many forms: folk, classical, or even electric. But as with every instrument, the beginning is hard. Along the long and tedious road of learning the basics, a beginner needs to learn what chords, barre chords and power chords are, among other things, and be able to play them by heart. Today, people usually either ask a music teacher, who explains music theory and shows how to position the fingers on the guitar, or learn on their own by watching other people play or by trying to decipher tabs. The former can be very expensive and out of reach for some people, while the latter can be very frustrating: the learning curve is steep when everything has to be discovered alone along the way.

Our goal is to smooth the learning curve of guitar playing, specifically for chords, by providing novice musicians with a system they can use autonomously.

State of the Art

Many solutions exist to help beginners. The first difficulty new players face is reading sheet music: it can be relatively complex to translate a note on the staff into a finger position on the frets. Tabs make reading easier, because every note (crotchet, quaver, etc.) is expressed directly as a finger position on the guitar.

Many applications, such as Gibson Learn & Master Guitar or Jamstar, display a diagram of a guitar with the finger positions highlighted on the frets for every chord.

Figure 1 - Jamstar application Screenshot

However, all of these applications and notations share the same issue: the user loses sight of the guitar and of his/her hand while looking at the tab or the chord diagram.

Other systems have tried to use Augmented Reality to avoid this issue.

Figure 2 - Video 1 Screenshot
Figure 2 - Video 2 Screenshot

These screenshots come from Video 1 and Video 2.

The first video proposes a system in which the finger positions on the frets are displayed above the neck. The idea is interesting, but the system does not show which finger, specifically, should go where. The second video displays the finger positions on the screen of a mobile phone. It seems rather difficult to place the phone so that the user can see the screen while still playing the guitar freely, which makes the system hard to use in practice.

The approach

To improve the previous solutions, we decided to use Augmented Reality to show the position of the fingers directly on the guitar. With this method, the user can learn a new chord while still keeping sight of the guitar.

To accomplish this, we developed a prototype that works in the following steps:

  • Detection of the neck of the guitar
  • Computation of the homography matrix
  • Display of the chords in the real world

The homography matrix maps a point from the frame of the guitar to the frame of the display. To compute it, we need 4 detected points.

Prototype choices

Some choices have been made for the production of the prototype.

  • The display

Our first idea was to use glasses, such as HoloLens or Google Glass, to display information directly in the user's field of view. However, the available devices were not usable, for two reasons:

- First, they were not stable enough on the head (the headset fell off when the user tilted his/her head).

- Second, the display was either much darker than reality or colorless, so no acceptable vision was possible.

Our solution was to use indirect vision, like a mirror. Through a real-time webcam video feed, we display information on the computer screen.

  • The markers

To find the position of the neck, we tried several solutions. The first one was to use black and white markers, detected as regions with high contrast. This kind of marker is usually replaced by a 3D model in the rendered image. In our case, the markers must be very small to fit on top of the guitar without interfering with its sound. However, detecting the high contrast properly requires a minimum marker size, and that minimum size does not satisfy the previous requirement.

Due to this constraint, we tried a second type of marker: reflective markers, which reflect all the light that reaches them. The idea was to detect saturated pixels in the webcam image. However, some strings of the guitar, the fret wires and the metal fittings of the guitar produce a lot of specular reflection, which interferes with the detection of the 4 actual markers.

As such, we finally chose to use 4 green stickers. We opted for green simply because it is the least present color component in human skin.

Implementation of the system

Our system is implemented with Processing.

Detection of the 4 markers

We take a group of 9 pixels around one location and check whether all of them satisfy the detection condition: at least a certain amount of green, and less than a certain combined amount of red and blue, in the RGB components. These thresholds need to be calibrated before each experiment, because they depend on the lighting of the room.

 return green(px) > 160 && red(px) + blue(px) < 300;

All detected points are then colored in red.
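The detection test can be sketched outside Processing as follows. This is a minimal Python illustration, not the project's code: it assumes the "9 pixels" are the 3x3 neighbourhood centred on the location, represents the image as a list of rows of (r, g, b) tuples, and uses hypothetical names (`is_green`, `is_marker_at`, `detect_points`). The thresholds are the ones quoted in the condition above and would need recalibration for each room.

```python
# Thresholds copied from the condition in the text; recalibrate per room.
GREEN_MIN = 160
RED_BLUE_MAX = 300


def is_green(px):
    """Return True if an (r, g, b) pixel passes the green test."""
    r, g, b = px
    return g > GREEN_MIN and r + b < RED_BLUE_MAX


def is_marker_at(image, x, y):
    """True if the pixel at (x, y) and its 8 neighbours all pass the test.

    Border locations are rejected because their 3x3 neighbourhood is
    incomplete (an assumption of this sketch).
    """
    height, width = len(image), len(image[0])
    if x < 1 or y < 1 or x >= width - 1 or y >= height - 1:
        return False
    return all(
        is_green(image[y + dy][x + dx])
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
    )


def detect_points(image):
    """Return the (x, y) locations whose whole 3x3 neighbourhood is green."""
    return [
        (x, y)
        for y in range(len(image))
        for x in range(len(image[0]))
        if is_marker_at(image, x, y)
    ]
```

Requiring all 9 pixels to pass, rather than a single one, already filters out isolated green pixels before any clustering takes place.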

Then, we create a list of detected clusters. A point is a pair of coordinates (x, y). A cluster is a group of points that are close to each other; in the code, a cluster is represented by a single point together with the number of points found around it. We do not take the mean of the points of the cluster, to save computation and limit latency.

If the list is empty, the first detected point is added to the list and forms the first cluster. Then, for each new detected point, if its distance to every previously detected cluster is more than 30 pixels, we add the point to the list as a new cluster (the 30-pixel distance is arbitrary). If the distance is less than 30 pixels, the point is close to an already registered cluster and is considered to belong to it. We keep track of how many points have been detected in each cluster.

This guarantees that the clusters in the list are at least 30 pixels apart.

We then discard all clusters with fewer than 3 points. The purpose is to remove noise, i.e. clusters that do not correspond to a real marker.
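The clustering described above can be sketched as follows. This is a hedged Python illustration with hypothetical names (`build_clusters`, `remove_noise`); it assumes that a point exactly 30 pixels away starts a new cluster and that a point is assigned to the first cluster found within range, two details the text does not specify.

```python
import math

CLUSTER_RADIUS = 30   # pixels, the arbitrary distance quoted in the text
MIN_CLUSTER_SIZE = 3  # clusters with fewer points are treated as noise


def build_clusters(points):
    """Group detected points into clusters.

    Each cluster is stored as [x, y, count], where (x, y) is the first point
    that created the cluster (no mean is computed) and count is the number
    of points assigned to it.
    """
    clusters = []
    for px, py in points:
        for cluster in clusters:
            if math.hypot(px - cluster[0], py - cluster[1]) < CLUSTER_RADIUS:
                cluster[2] += 1
                break
        else:
            # Far from every existing cluster, or the list is still empty.
            clusters.append([px, py, 1])
    return clusters


def remove_noise(clusters):
    """Keep only clusters that gathered at least MIN_CLUSTER_SIZE points."""
    return [c for c in clusters if c[2] >= MIN_CLUSTER_SIZE]
```

Keeping the first point as the cluster position avoids updating a running mean for every detected pixel, which matches the stated goal of limiting latency, at the cost of a small bias toward the corner of the marker that is scanned first.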

If exactly 4 clusters remain in the list, we assume that they are our markers and nothing else. We then sort the list to identify the four markers in the right order.

Sorted markers

To improve the stability of the system, we have implemented two features:

  • If the detection is lost, i.e. the markers are no longer seen by the system, we use the last detected position of the markers to compute the position of the chords. However, if the detection has been lost for more than 8 cycles (an arbitrary value), we consider it lost for good and no chord is displayed anymore.
  • If a marker has moved by less than 3 pixels, we consider that it has not moved, in order to avoid jitter.
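The two stabilization rules above can be sketched as a small state holder. This is an illustrative Python sketch under assumptions: the class name `MarkerTracker` and its `update` method are hypothetical, a failed detection is passed as `None`, and "more than 8 cycles" is read as "the 9th consecutive missed frame".

```python
import math

MAX_LOST_CYCLES = 8   # cycles tolerated without detection (value from the text)
JITTER_THRESHOLD = 3  # pixels below which a marker is considered still


def _distance(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


class MarkerTracker:
    """Keeps the last known marker positions between frames."""

    def __init__(self):
        self.markers = None   # last accepted list of (x, y) positions
        self.lost_cycles = 0  # consecutive frames without detection

    def update(self, detected):
        """Feed one frame; detected is a list of (x, y) tuples or None.

        Returns the positions to use for drawing, or None when nothing
        should be displayed.
        """
        if detected is None:
            self.lost_cycles += 1
            if self.lost_cycles > MAX_LOST_CYCLES:
                self.markers = None  # lost for good: stop displaying chords
            return self.markers
        self.lost_cycles = 0
        if self.markers is None:
            self.markers = list(detected)
        else:
            # Ignore moves smaller than the jitter threshold.
            self.markers = [
                old if _distance(old, new) < JITTER_THRESHOLD else new
                for old, new in zip(self.markers, detected)
            ]
        return self.markers
```

In a per-frame loop, the drawing step would simply be skipped whenever `update` returns `None`.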

Computation of the homography matrix

We know the positions of the four markers in the frame of the guitar and their positions on the screen. We can thus compute the homography matrix from the 8 equations (2 per point). We obtain the following:

 M = \begin{pmatrix}
  P_2.x - P_1.x & P_3.x - P_1.x & P_1.x \\
  P_2.y - P_1.y & P_3.y - P_1.y & P_1.y \\
  0 & 0 & 1
 \end{pmatrix}

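The fourth point is not used in M because we consider it to lie in the same plane as the 3 others (the guitar is treated as a plane with z = 1). With the last row fixed to (0, 0, 1), M is an affine map, which P1, P2 and P3 fully determine.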
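As a quick check of the matrix M given above, the following Python sketch builds it from three screen positions and verifies that the corners of the normalized guitar frame land on the markers. The function names (`build_matrix`, `apply`) and the marker coordinates are hypothetical values chosen for illustration.

```python
def build_matrix(p1, p2, p3):
    """Return M as a 3x3 list of rows, following the formula in the text.

    p1, p2, p3 are the screen positions (x, y) of markers P1, P2 and P3.
    """
    return [
        [p2[0] - p1[0], p3[0] - p1[0], p1[0]],
        [p2[1] - p1[1], p3[1] - p1[1], p1[1]],
        [0, 0, 1],
    ]


def apply(matrix, point):
    """Multiply M by the homogeneous point (x, y, 1) and return (x', y')."""
    x, y = point
    vec = (x, y, 1)
    sx, sy, sz = (sum(row[i] * vec[i] for i in range(3)) for row in matrix)
    return (sx / sz, sy / sz)


# Hypothetical screen positions of the three markers, in pixels.
P1, P2, P3 = (100, 200), (400, 220), (110, 260)
M = build_matrix(P1, P2, P3)

# Guitar-frame origin maps to P1, (1, 0) to P2 and (0, 1) to P3.
origin_on_screen = apply(M, (0, 0))
end_of_length_on_screen = apply(M, (1, 0))
end_of_width_on_screen = apply(M, (0, 1))
```

Because the third row is constant, the division by `sz` is always a division by 1 here; it is kept only so that the same `apply` function would still work with a full homography.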

Drawing of the chords

We chose to draw three different chords: E minor (Mi minor), A minor (La minor) and G (Sol). To change the displayed chord, we press key 1 (E minor), 2 (A minor) or 3 (G).

To draw a chord, there are several steps:

  • Convert the coordinates of each finger from the real-world frame to the guitar frame.

The area of the guitar delimited by the markers is 3 cm wide (between P1 and P3) and 14.5 cm long (between P1 and P2). P1 is the origin of both the guitar frame and the world frame.

For example, measuring from P1, the index finger of the A minor (La minor) chord is at 2 cm along the length and 0.5 cm along the width. So, in the real-world frame, we have P_{world}(2, 0.5, 1). In the guitar frame, coordinates are normalized between 0 and 1, so we have P_{guitar}(2/14.5, 0.5/3, 1).

  • Convert the coordinates of the finger position from the guitar frame to the screen frame.

We use the homography matrix: P_{screen} = M * P_{guitar}. So finally, we have

P_x^{screen} = M^0_0 * P_x^{guitar} + M^0_1 * P_y^{guitar} + M^0_2

P_y^{screen} = M^1_0 * P_x^{guitar} + M^1_1 * P_y^{guitar} + M^1_2

P_z^{screen} = M^2_0 * P_x^{guitar} + M^2_1 * P_y^{guitar} + M^2_2

  • Draw the chord

We can finally draw the chord. We draw a diamond around the position P_{screen}.
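The three drawing steps above can be sketched end to end. This Python sketch is illustrative only: the function names, the example matrix `m` and the diamond size are assumptions, while the 14.5 cm and 3 cm dimensions and the (2, 0.5) finger position come from the text.

```python
GUITAR_LENGTH_CM = 14.5  # distance between P1 and P2, from the text
GUITAR_WIDTH_CM = 3.0    # distance between P1 and P3, from the text


def world_to_guitar(length_cm, width_cm):
    """Normalize a position measured from P1 (in cm) to the range 0..1."""
    return (length_cm / GUITAR_LENGTH_CM, width_cm / GUITAR_WIDTH_CM)


def guitar_to_screen(m, p_guitar):
    """Apply the first two rows of M to the point (x, y, 1)."""
    gx, gy = p_guitar
    sx = m[0][0] * gx + m[0][1] * gy + m[0][2]
    sy = m[1][0] * gx + m[1][1] * gy + m[1][2]
    return (sx, sy)


def diamond(center, half_size=5):
    """Return the 4 vertices of a diamond around center (top, right, bottom, left)."""
    cx, cy = center
    return [
        (cx, cy - half_size),
        (cx + half_size, cy),
        (cx, cy + half_size),
        (cx - half_size, cy),
    ]


# Hypothetical matrix: P1 at (100, 200), neck 290 px long and 60 px wide,
# aligned with the screen axes.
m = [[290, 0, 100], [0, 60, 200], [0, 0, 1]]

# Index finger of the A minor chord: 2 cm along the length, 0.5 cm in width.
index_guitar = world_to_guitar(2, 0.5)
index_screen = guitar_to_screen(m, index_guitar)
index_shape = diamond(index_screen)
```

Storing chord positions in centimetres and normalizing at draw time keeps the chord definitions independent of where the guitar appears in the camera image.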

In every chord, each finger is associated with a color: the index finger is red, the middle finger is green and the ring finger is blue.


Experiment Protocol

For the experiments, we looked for users with no experience whatsoever in playing the guitar, so that they would need to look at the screen to see where to place their fingers, rather than being proficient enough to do it almost without looking. We used a webcam to film the neck of the guitar, and we displayed the video on a computer screen.

We organized our experiment protocol according to the following steps:

1) Present a sheet of paper (or a window on the screen) showing several chords as tabs, and explain how to read them

2) Ask the user to look at the chords and try to play them at his/her own pace, with 5 min to practice

3) Once the user feels comfortable, ask him/her to play the following chords in this order, 3 times: G, E-min, A-min (Sol, Mi-min, La-min). Time the sequence and count the number of errors, i.e. misplaced fingers

4) Ask the user how he/she felt using this system

5) Present our system and show the user how it works

6) Using the keys we bound beforehand, change the chord displayed on the screen together with the user, and leave him/her 5 min to practice

7) Repeat the same chord sequence as with the sheet of paper, timing it again and counting the number of errors

8) Ask the user how he/she felt about this system compared to the previous one

Figure 4 - Chords Tab sheet
Figure 4 - Users with our system

To limit order bias, half of the users did steps 1-4 first, while the other half did steps 5-8 first.

Interpretation and evaluation of the data

Quantitative Evaluation

During our user study, we encountered a large number of problems because our system is sensitive to its environment: since we use green surfaces as reference markers, any green detected by the webcam other than the stickers on the guitar completely disrupts the calculations. Even improvements such as discarding clusters that are too small, or keeping the last detected position of the markers when detection is lost, did not fully stabilize the prototype.

We ran the experiment with 9 users, all beginners on the guitar.

Results of the experiment

We can see that, on average, people needed more time to complete the task with our system than with the traditional tool. We interpret this result as a consequence of technical issues: the latency of the camera and the occasional loss of detection slowed down the users.

We also analysed how effective the users were with both systems by counting the number of errors they made, i.e. by checking whether each chord was played with the fingers in the right position.

Number of errors

The number of errors was greater with our system than with the tabs. We attribute this to a lack of precision in the display of the chord, caused by the approximate detection of the markers.

Qualitative (system usability scale) evaluation

When we asked users which system they thought was better, the overall opinions were mixed.

One aspect on which they all agreed is that the conventional method can be very frustrating, since you constantly shift your gaze between the screen where the tab is and the frets of the guitar. Moreover, the task we gave to the users is quite simple because we only consider chord strumming, so the user does not even need to look at what his/her right hand is doing. If we increased the difficulty of the task and asked the user to play individual notes, he/she would need to watch not only the screen and the frets, but also the right hand, to know precisely which string to pluck. Given this, users found it hard to follow the chords and to place their fingers with the conventional tools. Furthermore, the tabs given to the users were vertical, whereas the guitar is held horizontally. An additional effort was required to transpose a position on the tab onto the real guitar, whereas this is almost immediate with our system.

However, when it comes to our system, opinions diverged a bit. Some users said that they liked the idea because you do not need to constantly look at your own guitar, since there is a real-time video feed of yourself on the computer screen. Others complained about the camera resolution, which should be higher, and said that the colors associated with each finger should be more vivid so that they are easier to notice on the screen; they complained mostly about the choice of black. Users also complained that the glove was not very comfortable and should be changed.

Overall, users liked the idea of our system a lot, and in some cases they preferred it to the tab approach, even though most of them took more time with our method than with the tabs. Naturally, as mentioned earlier, some users said they preferred the conventional tool, mostly because of our technical limitation: we search for green spots, and any green in the environment greatly disturbs the calculations and makes the icons that indicate where to put the fingers flicker a lot or appear at the wrong place on the guitar.

One advantage of the traditional system is that the player can anticipate the movement of his/her fingers by reading one note ahead. This feature is not implemented in our system, which may also explain the difference in completion time.

As a last note, our system helps the user learn the correct way to place the fingers, just as conventional tools do; the only difference is that conventional tools use numbers to identify the fingers whereas we use a color scheme. This is an important feature that should be maintained, since certain chord transitions can be performed much more easily and quickly if the fingers are in the right position. An example of this is the transition from A-min to C-maj. With the standard numbering, where the first string is the one with the highest note and the sixth string the one with the lowest note, the strings pressed for the A-min chord are: index finger on the 1st fret of the 2nd string, middle finger on the 2nd fret of the 4th string and ring finger on the 2nd fret of the 3rd string. Transitioning to C-maj from this position is very simple and quick: the index and middle fingers do not move, and the ring finger simply moves from the 2nd fret of the 3rd string to the 3rd fret of the 5th string. Thanks to this specific fingering, the user only has to move one finger, which would not be the case with any other fingering.
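The A-min to C-maj example can be checked mechanically. The short Python sketch below encodes the two fingerings exactly as described in the paragraph above, as (fret, string) pairs per finger, and lists the fingers that change position; the dictionary layout and the function name `fingers_to_move` are illustrative assumptions, not part of the prototype.

```python
# (fret, string) for each finger; string 1 is the highest-pitched string.
A_MINOR = {"index": (1, 2), "middle": (2, 4), "ring": (2, 3)}
C_MAJOR = {"index": (1, 2), "middle": (2, 4), "ring": (3, 5)}


def fingers_to_move(chord_from, chord_to):
    """Return the sorted names of the fingers whose position changes."""
    return sorted(
        finger
        for finger in chord_to
        if chord_from.get(finger) != chord_to[finger]
    )


moving = fingers_to_move(A_MINOR, C_MAJOR)
```

With these fingerings, `moving` contains only the ring finger, which is the point made in the text: a consistent finger assignment keeps transitions short.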

To summarize, the pros and cons of our system are the following:

  • Positive points:

- It is not necessary to look at the guitar (better positioning)

- No teacher is needed (the exact position of each finger is shown)

- More intuitive (no extra mental step to rotate the tab)

  • Negative points:

- Latency

- Resolution of the camera

- Approximate detection

- Lack of precision

- Environment dependent

- Cannot anticipate the next note

Conclusion and Future Work

To sum up, we believe that our approach has some potential, but still needs further work and study. Most of the negative points of our system come from technical limitations and not from the idea itself: a number of users took more time to complete the task with our system, but still liked it more than the conventional tab sheet. We should find a better way to detect the guitar frets, one that is less noisy than the current version and does not require a specific environment to work. A camera with a higher resolution would also greatly help the user see where the icons are, and therefore where he/she should place the fingers. The colors of the fingers on the glove should be reconsidered as well.

Even if we cannot conclude that our system is a better way to learn guitar than tabs, the prototype and the experiments gave us leads to improve the system, and even revealed points (positive and negative) that we did not foresee, such as the anticipation of notes. We suspect that this factor might improve the speed of execution; however, further testing is needed.


Videos of the use of Augmented reality:

Video 1

Video 2.

Helpers to learn guitar:

Tutorial piano example


Source code and pictures:

GitHub of the source code

Inspirational source :