Consistency As The Strength Of AI In Keywording

The Finnish Food Authority’s Hyrrä transaction service can be used to apply for several different rural development subsidies. During processing, received project applications are classified with the help of, among other things, keywords. The project handler assigns each application one or more keywords describing the content of the project. Successful subject indexing makes later information searches possible, because all project applications attached to the same keyword can be found quickly.

Headai’s goal was to find out how artificial intelligence can be used to support the application process and, above all, its uniformity. Does artificial intelligence recognize as many words as a person? Are the keywords chosen by people and artificial intelligence similar?

“Artificial intelligence offers support for smoother and more uniform processing of project applications and the preparation of applications. Headai clearly understood the customer’s background and needs, which is not a given.” – Tuomas Metsäniemi, Network Expert, Finnish Food Authority

Better results through teaching

The text data consisting of Hyrrä’s project applications was imported into Headai’s system, and the database used in official subject indexing was connected to Headai’s computational ontology. An ontology can be understood as a machine’s understanding of the concepts of the surrounding world and the relationships between those concepts. The ontology is built on a pre-trained language model based on gigabytes of text data from various fields, such as science, culture and the labor market. At first, our AI performed the subject indexing using only this prior understanding.

After this, time was spent teaching the artificial intelligence. Headai went through 180 individual project descriptions, comparing human and machine subject indexing and noting both the words the machine had missed and the words given by people that were imprecise for one reason or another. The index terms the machine had missed were then taught to it to improve the quality of its work.

“In total, the artificial intelligence chose an average of 4.9 index terms for one application, which is slightly more than the number of words chosen by a person.” – Jessica Nielsen, Environmental Scientist, Headai

After teaching the AI, the machine performed the subject indexing again and the results were evaluated. People had given an average of 3.9 index terms per application. Of the terms given by the machine, an average of 1.3 per application matched the terms given by people and an average of 3.6 were different. In total, the artificial intelligence gave an average of 4.9 words per application, which is slightly more than the number of words given by people.
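The per-application averages above can be reproduced with a simple set comparison. The following is only a minimal sketch, not Headai's actual evaluation code: the function name `compare_indexing`, the sample keywords, and the assumption that a "similar" term means an exact match after lowercasing are all illustrative.

```python
from statistics import mean


def compare_indexing(human_terms, machine_terms):
    """Return per-application averages for human vs. machine index terms.

    Both arguments are lists of keyword lists, one inner list per application.
    Assumption: two terms count as the same if they match exactly after
    lowercasing and stripping whitespace.
    """
    if not human_terms or len(human_terms) != len(machine_terms):
        raise ValueError("need the same non-zero number of applications")

    def normalize(terms):
        return {t.strip().lower() for t in terms}

    pairs = [(normalize(h), normalize(m)) for h, m in zip(human_terms, machine_terms)]
    return {
        "human_avg": mean(len(h) for h, _ in pairs),
        "shared_avg": mean(len(h & m) for h, m in pairs),        # chosen by both
        "machine_only_avg": mean(len(m - h) for h, m in pairs),  # machine only
        "machine_avg": mean(len(m) for _, m in pairs),
    }


# Made-up example data for two applications (not from the pilot).
human = [["youth", "culture", "equality"], ["tourism", "nature"]]
machine = [["culture", "art", "history", "youth"], ["tourism", "hiking", "trails"]]

summary = compare_indexing(human, machine)
print(summary)
```

As in the figures reported above, the shared and machine-only averages always add up to the machine's overall average, because every machine term is either also chosen by the person or not.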

A second summary of the number of index terms was made, this time using a limited data set. Only 66% of the written applications were included in the calculation (17% was filtered out at each extreme of the dispersion). The number of index terms found by the machine increased to 5.7 words on average. Of these, an average of 1.7 words corresponded to terms chosen by people and 4.0 words were different.
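Filtering out both extremes in this way can be sketched as below. The article does not say which quantity the applications were ordered by, so the `word_count` field, the function name `trim_extremes`, and the generated sample data are purely hypothetical.

```python
def trim_extremes(items, key, percent=17):
    """Drop `percent` % of items from each end after sorting by `key`.

    Integer arithmetic is used for the cut so that, for example,
    17 % of 100 items is exactly 17 items.
    """
    if not 0 <= percent < 50:
        raise ValueError("percent must be in the range [0, 50)")
    ordered = sorted(items, key=key)
    cut = len(ordered) * percent // 100
    # Slice with an explicit end index so that cut == 0 keeps everything.
    return ordered[cut:len(ordered) - cut]


# Hypothetical applications; word_count stands in for the unknown sort quantity.
applications = [{"id": i, "word_count": 50 + 10 * i} for i in range(100)]

kept = trim_extremes(applications, key=lambda a: a["word_count"])
print(len(kept))  # 66 of the 100 applications remain
```

With 100 applications and 17% removed from each end, the middle 66 remain, which matches the 66% share mentioned above.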

The effect of teaching on the result was therefore clear. When the number of words found before the teaching was compared with the number found after it, the average effect of the teaching on the index terms chosen by the artificial intelligence was 1.1 words. However, it is worth noting that even without teaching, the artificial intelligence found a larger number of relevant keywords in the applications than a human had found.

Index terms open to interpretation

During the training of the machine, it became apparent that a surprisingly large proportion of the words given by humans were imprecise in one way or another relative to the project description. At the same time, the words given by the machine turned out to be of more uniform quality than the original ones, because the machine cannot intuitively add meanings that are not found directly in the project description.

A person knows the context around the project under consideration and can bring in relevant things from outside the description. On the other hand, people classify things not only through the text but also through their own worldview. This can lead to a situation where a project presenting cultural history and art might have “women” as its index term, although the intended target group would in reality be wider.

Imprecise terms, or terms open to interpretation, may have been chosen for many reasons. An imprecise index term may be based on information held by the project handler that is not recorded in the project description. It is also possible that the index term is based on a conscious or unconscious presupposition that the project handler has about the project’s audience, implementation or result. Even if a person’s intuitive choices are very apt and true, they can still be problematic for the uniformity of the subject indexing.

“It seems to be essential for the uniformity of the subject indexing that there is a shared understanding of how broadly or narrowly the words are understood in this context.” – Essi Helander, Linguistics Scientist, Headai

A particularly interesting discussion with the client was sparked by an example in which the scope of the word used as an index term had not been precisely defined, which led to imprecise results. This got us wondering: does every person handling the project applications have the same point of view on, for example, equality? Is it clear what equality means as the Finnish Food Authority’s index term, or can everyone attach their own meanings to it? It seems to be essential for the uniformity of the subject indexing that there is a shared understanding of how broadly or narrowly the words are understood in this context.

In terms of uniformity, the support of artificial intelligence can be very significant, as it always selects index terms unambiguously and reproducibly based on the description, and does not know how to look for broader meanings outside the description. The machine does not offer far-fetched index terms, but chooses based on exactly what is mentioned in the application.

Artificial intelligence to support subject indexing

The machine has no preconceived notions and cannot make intuitive decisions. Could it help a person to index more evenly and accurately? At Headai, we believe that it can. With the help of artificial intelligence, the person handling applications gets ready-made keyword suggestions as a basis for their work, which they can accept or reject based on the project description.

Tuomas Metsäniemi, Network Expert at the Finnish Food Authority, also believes in the possibilities of artificial intelligence:

“The results of the AI pilot ordered from Headai were really interesting and raised a lot of ideas about the use of artificial intelligence.”

Subject indexing is just one example of how artificial intelligence can be used to support project application processing at the Finnish Food Authority. The purpose of the pilot was also to start a discussion about lightening the administrative burden and enabling more uniform and faster work.

Machines cannot compete with humans in intuitive reasoning, but the regularity, repeatability and uniformity of their operation could be a significant aid in human decision-making. Fortunately, a person is not a machine, but with the help of artificial intelligence, we can use the good features of the machine to aid our work.

Finnish Food Authority

The Finnish Food Authority works for the good of humans, animals and plants, supports the vitality of the agricultural sector, and develops and maintains information systems. The Finnish Food Authority began its operations on 1 January 2019 when the Finnish Food Safety Authority, the Agency for Rural Affairs and part of the IT services of the National Land Survey of Finland were merged into one single Authority. The Authority operates under the Ministry of Agriculture and Forestry, and its head office is located in Seinäjoki. The Finnish Food Authority’s activities cover the entirety of Finland, and the Authority employs almost one thousand experts and professionals in nearly 20 locations.

About Headai

Headai is a Finnish technology company developing a cognitive AI engine powering economic growth. We help organizations succeed in a rapidly changing future by helping them find answers in large amounts of data that they can’t otherwise see.

Our algorithms enable seeing the big picture in scattered data by revealing unknown connections and even explaining why they exist.

Our technology is 100% Headai IP, based on over 20 years of experience in the cognitive and computational sciences.