Mapping Arbitrary Points to Arbitrary Words in Text

The Problem

One of the major issues I ran into while working on OpenBKZ (an ebook reader written in C++ using the Qt framework) was converting (x, y) screen coordinates into the words the user highlighted or searched over. I needed this for two (in my opinion) key features:

Notes – It is important that a user is able to highlight a section of text and write a note about it.
Search – I want a person to be able to swipe over a word or phrase and open it in either Google or a dictionary from a menu (the menu is still a work in progress).

It was a reasonably difficult problem to solve. There are more efficient solutions, but they imposed restrictions I did not want. If it were not for those constraints and the fact that I allow the font size to change, this would have been much simpler. I hope some people find this interesting and useful. I spent 3 to 4 hours getting it to work and would rather save someone else those hours.

Solution

First, we must identify the x-axis and y-axis. The picture below has a red line representing the different dimensions:

This is pretty straightforward, but it makes clear that the y values will be used to determine the line and the x values will be used to determine the words.

Vertical Translation

Obtaining the line number is not exceptionally difficult. We simply need to obtain the starting point and ending point of the gray rectangle in the image above, which Qt provides:

The next step is to convert the starting and ending y positions to line numbers. To do that I used the following:

Each line occupies the font size plus 3 (there are 3 points of space between lines), so to convert a y position to a line number we simply divide the starting and ending y positions by the font size plus three.
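As a rough illustration of that division, here is a minimal sketch in plain C++ (no Qt, so it compiles on its own). The names lineFromY, linesFromSelection, LineRange and kLineSpacing are made up for this example and are not taken from OpenBKZ; it also assumes y positions are measured from the top of the text in the same units as the font size.

```cpp
#include <algorithm>
#include <cassert>

// Assumed constant: the 3 points of spacing between lines described above.
constexpr int kLineSpacing = 3;

// Convert a y position to a zero-based line index.
int lineFromY(int ypos, int fontSize) {
    assert(fontSize > 0 && ypos >= 0);
    return ypos / (fontSize + kLineSpacing);
}

// Hypothetical helper type: first and last line covered by a selection.
struct LineRange {
    int first;
    int last;
};

// Convert the start and end y positions of a selection to a line range,
// ordering them so an upward drag still yields first <= last.
LineRange linesFromSelection(int startY, int endY, int fontSize) {
    const int a = lineFromY(startY, fontSize);
    const int b = lineFromY(endY, fontSize);
    return {std::min(a, b), std::max(a, b)};
}
```

With a font size of 12 each line is 15 units tall, so y = 14 is still on line 0 and y = 15 is the start of line 1.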

All well and good. That took about ten minutes to work out, and I am sure most people reading this could do the same. Now for the interesting part: converting arbitrary horizontal positions to words, the challenge being that the font size can change (everything must be dynamic).

Horizontal Translation

To achieve this task I had to do a bit of research and I found this image[1]:

The standard width-to-height ratio is w : h = 3 : 5 = 0.6 for each letter, and each space is 1/3 the size of a letter. This alone would be enough to map a horizontal position to words, but the issue with this is really fourfold:

The way the application scaled the text did not ensure the text was always aligned with the leftmost margin (though I could potentially fix this).
Anything that adjusts for the above must also adjust for smaller or larger text.
Varying word lengths can create an issue.
The user will often swipe too far, so there needs to be a reasonable margin of error.
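For reference, the naive mapping that the 3 : 5 ratio suggests, before any of the issues above are accounted for, might look like the sketch below. It assumes every character is 3/5 of the font size wide, that the text starts at x = 0, and it ignores the narrower spaces; naiveCharIndex is a made-up name for illustration.

```cpp
#include <cassert>

// Naive mapping: each character is assumed to be 3/5 of the font size wide,
// so index = x / (3/5 * fontSize), written as 5x / (3 * fontSize) to stay
// in integer arithmetic.
int naiveCharIndex(int xpos, int fontSize) {
    assert(fontSize > 0 && xpos >= 0);
    return (5 * xpos) / (3 * fontSize);
}
```

At a font size of 10 each character is assumed to be 6 units wide, so x = 30 lands on character index 5.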

After some reflection, it is possible to relate all of these issues to the same thing. They all stem from translating a start point to an end point without knowing precisely which word sits at each position. Therefore, there needs to be a method to translate the start point and end point based on two factors:

Size – a combination of word length and font size.
Error – allowing for a margin of error for both the position to word translation and user input.

Allowing for the error is the most important aspect of the solution, because it lets us approximate rather than translate two points directly to words.

The first step in my solution is translating the start and end point to indexes in an array for the given line:

The above code gives you two variables (start_search and end_search) which represent two indexes into the line array. The variable “ypos” is the y position, which we translate to a given line by dividing by fontsize + 3 (refer to the vertical translation above). You may have noticed that the index variables are generated by taking the horizontal position and dividing it by half the font size. I do this because the width : height ratio of each character is 3 : 5, but to minimize error I divide by half the font size, that is, I act as if the ratio is 1 : 2.
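A minimal sketch of that index calculation, written against std::string instead of QString so it compiles without Qt, could look like this. Only start_search, end_search, ypos and fontsize come from the description above; SearchRange, estimateRange, startX and endX are hypothetical names, and the clamping to the line bounds is my own addition.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical result type: the line that was hit and the character span to search.
struct SearchRange {
    int line;
    int start_search;
    int end_search;
};

SearchRange estimateRange(const std::vector<std::string>& lines,
                          int startX, int endX, int ypos, int fontsize) {
    assert(fontsize > 0 && !lines.empty());

    // Vertical translation: y position divided by fontsize + 3, clamped to a valid line.
    const int lastLine = static_cast<int>(lines.size()) - 1;
    const int line = std::min(std::max(ypos / (fontsize + 3), 0), lastLine);
    const int len = static_cast<int>(lines[line].size());

    // Accept swipes in either direction.
    const int left = std::min(startX, endX);
    const int right = std::max(startX, endX);

    // Horizontal translation: x divided by half the font size (the 1 : 2
    // approximation), written as 2x / fontsize and clamped to the line length.
    const int start = std::min(std::max(2 * left / fontsize, 0), len);
    const int end = std::min(std::max(2 * right / fontsize, 0), len);
    return {line, start, end};
}
```

For a font size of 12, a swipe from x = 60 to x = 120 at y = 20 maps to line 1, characters 10 through 20.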

Once the proper line has been found, an errorAdjustment is performed:

The output is a QString containing the words you highlighted. In about 50 attempts at various font sizes and with wild swipes I got two errors, so it seems to convert the swiped term correctly more than 90% of the time.

Dissecting the code:

Variable “space” represents the number of characters in a given line divided by twice the font size. I use this to compensate for the previous adjustment, which assumes w : h is 1 : 2 (as opposed to the actual 3 : 5).
I then remove the characters after the last index I am interested in (end_search) plus “space”, which slightly increases the number of characters kept. This adjustment is made mostly because assuming the 1 : 2 ratio reduces the assumed width of every preceding character, which means the error accumulates and is larger on the right side of the line. Slightly confused? Here’s a picture of what I mean:
Following that I remove the characters before the first index I am interested in (start_search) minus “space” divided by two. This keeps only a few extra characters; I do it to capture terms such as ad hoc or R. Walton, where a short word would otherwise be missed, and to improve the search (perhaps a poor assumption).
I then compare the various “terms” derived from this method with the actual words in the line. If a word in the QString array “term” is contained in an actual word in the line we are searching, and that term is over half the length of the actual word (our naive margin of error), we add it to our search string.
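The four steps above can be sketched roughly as follows. This is my reading of the description in plain C++ with std::string rather than the original Qt code: the splitWords helper is made up, adding the full word from the line (rather than the clipped term) to the result is an assumption, and so is stopping at the first word that matches a term.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split text on whitespace.
std::vector<std::string> splitWords(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> words;
    for (std::string w; in >> w;) words.push_back(w);
    return words;
}

std::string errorAdjustment(const std::string& line,
                            int start_search, int end_search, int fontsize) {
    assert(fontsize > 0 && start_search <= end_search);
    const int len = static_cast<int>(line.size());

    // Step 1: "space" is the line length divided by twice the font size.
    const int space = len / (2 * fontsize);

    // Steps 2 and 3: drop everything after end_search + space and
    // everything before start_search - space / 2.
    const int from = std::max(0, start_search - space / 2);
    const int to = std::min(len, end_search + space);
    if (from >= to) return "";

    const std::vector<std::string> terms = splitWords(line.substr(from, to - from));
    const std::vector<std::string> words = splitWords(line);

    // Step 4: keep a word when a term is contained in it and the term is
    // over half the word's length.
    std::string result;
    for (const std::string& term : terms) {
        for (const std::string& word : words) {
            const bool contained = word.find(term) != std::string::npos;
            const bool longEnough = 2 * term.size() > word.size();
            if (contained && longEnough) {
                if (!result.empty()) result += ' ';
                result += word;
                break;  // assumption: the first matching word wins
            }
        }
    }
    return result;
}
```

For the line "the quick brown fox jumps over" at font size 12, indexes 10 to 20 keep the clipped text "brown fox j". The stray "j" is less than half of "jumps" and is dropped, so the result is "brown fox".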

Now, let’s see it in action:
