Whether you are sitting on a bench with your laptop, at home at your desktop, or out on a walk with your smartphone, chances are you will be searching the internet. Search is part of both personal and business life, which is one of the reasons why companies such as Yahoo! and Google are now among the wealthiest in the world.

Billions of websites exist, and they all want to be found on the main search engines. To ensure people actually find what they are looking for, search engines have created a number of rules, and the sites that best meet those rules are placed at the top. Every search engine has its own set of rules, but they all tend to focus on keywords or keyphrases and how frequently these appear in a piece of content. This is where polysemy becomes a problem: a word with multiple meanings, such as "bank", can match pages about entirely different topics. Synonymy, where the same idea is described using different words, is a similar problem, because a page can be relevant without ever containing the exact keyword.
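To make the keyword problem concrete, here is a minimal sketch of frequency-based scoring. It is not how any particular search engine works; the function `keyword_score` and the three sample pages are made up for illustration.

```python
import re
from collections import Counter

def keyword_score(query, text):
    """Share of the words in `text` that are query keywords (hypothetical scoring rule)."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    terms = re.findall(r"[a-z]+", query.lower())
    return sum(counts[t] for t in terms) / len(words)

# Invented sample pages, used only to show the two failure modes.
pages = {
    "finance": "The bank raised interest rates and the bank approved the loan",
    "river": "We walked along the river bank and fished from the bank",
    "cars": "The automobile dealer sold a used automobile",
}

# Polysemy: both pages score the same for "bank", although they mean different things.
bank_scores = {name: keyword_score("bank", text) for name, text in pages.items()}

# Synonymy: a page about automobiles scores zero for the query "car".
car_score = keyword_score("car", pages["cars"])

print(bank_scores)
print(car_score)
```

A pure frequency count cannot tell the financial "bank" from the river "bank", and it misses the automobile page entirely when the user types "car".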

How Search Engines Separate Information

To make sure useful information is shown at the top, search engines look at three key characteristics:

  1. That the website has as much relevant information as possible.
  2. That the website has as little irrelevant information as possible.
  3. That ranking is meaningful.

Some search engines, including Google, also assign popularity based on how many people visit a page and how many links point to it. The more links and visitors a site has, the more important it is considered to be. Despite all these controls, irrelevant sites still regularly appear. Professionals like Abhishek Gattani have found two solutions for that: Latent Semantic Indexing (LSI) and Semantic Web Documents (SWD).
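The popularity idea can be sketched as a simple count of inbound links combined with visits. This is a deliberately naive illustration, not Google's actual algorithm; the link graph, the visit numbers, and the weights `link_weight` and `visit_weight` are all assumptions.

```python
from collections import Counter

# Hypothetical link graph: (source page, target page).
links = [("a", "c"), ("b", "c"), ("c", "a"), ("d", "c"), ("d", "a")]

# Hypothetical visit counts per page.
visits = {"a": 120, "b": 300, "c": 80, "d": 10}

# Number of links pointing to each page.
inbound = Counter(target for _source, target in links)

def popularity(page, link_weight=100, visit_weight=1):
    """Weighted sum of inbound links and visits (the weights are arbitrary)."""
    return link_weight * inbound[page] + visit_weight * visits.get(page, 0)

# Most popular page first.
ranking = sorted(visits, key=popularity, reverse=True)
print(ranking)
```

Note that nothing in this score looks at what a page actually says, which is one reason popular but irrelevant pages can still rank well.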

What Are LSI and SWD?

LSI is a method whereby information is retrieved and organized based on higher-order associations of words with text objects. It represents the strongest associative patterns in the data. Hence, rather than looking solely at how many keywords there are, it looks at the real content of the website. With semantic web documents, logical connections are created between different pieces of content.
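As a rough illustration of how LSI can pick up such associations, the sketch below reduces a tiny term-document matrix to two latent dimensions and compares a query with each document in that reduced space. LSI is usually computed with a truncated singular value decomposition from a numerical library; here the top singular directions are approximated with plain power iteration so that the example needs only the standard library. The vocabulary, the documents, and all function names are assumptions made for the example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def cosine(u, v):
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return dot(u, v) / (nu * nv) if nu and nv else 0.0

def latent_directions(term_doc, k, iterations=200):
    """Approximate the top-k left singular vectors of the term-document
    matrix A by power iteration on A * A^T, deflating after each one."""
    n = len(term_doc)
    # Term-term association matrix A * A^T.
    gram = [[dot(r1, r2) for r2 in term_doc] for r1 in term_doc]
    directions = []
    for _ in range(k):
        # Fixed start vector; a sketch-level choice that works for this data.
        v = normalize([1.0] * n)
        for _step in range(iterations):
            v = normalize([dot(row, v) for row in gram])
        eigenvalue = dot(v, [dot(row, v) for row in gram])
        directions.append(v)
        # Remove the direction just found so the next pass finds the next one.
        gram = [[gram[i][j] - eigenvalue * v[i] * v[j] for j in range(n)]
                for i in range(n)]
    return directions

def project(vec, directions):
    """Coordinates of a term-space vector in the latent space."""
    return [dot(vec, d) for d in directions]

terms = ["car", "automobile", "engine", "flower", "petal"]
docs = [
    [1, 0, 1, 0, 0],  # "car engine"
    [0, 1, 1, 0, 0],  # "automobile engine"
    [0, 0, 0, 1, 1],  # "flower petal"
]
query = [1, 0, 0, 0, 0]  # "car"

term_doc = [list(row) for row in zip(*docs)]  # rows are terms, columns are documents
directions = latent_directions(term_doc, k=2)

raw = [cosine(query, d) for d in docs]
latent = [cosine(project(query, directions), project(d, directions)) for d in docs]

print("keyword similarity:", raw)
print("latent similarity: ", latent)
```

With plain keyword matching, the query "car" has no overlap with the "automobile engine" document. In the latent space the two end up close together, because "car" and "automobile" both co-occur with "engine", while the "flower petal" document stays unrelated.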

Gattani has worked very hard at developing semantic search engines. His goal was never to do this for the wider market, however, since performance problems are very common if the SWD becomes too large. Instead, he has created networks of information in which only the right path to a certain solution is found.

In fact, Gattani was one of the first to be quite successful in this. Before then, search systems found it very difficult to understand what was being asked of them. Since Gattani became involved in the field, things have improved tremendously. Today, many organizations use their own semantic search engines, ensuring that people find only the information that is truly relevant to them instead of having to sift through millions of websites. Whether semantic search will ever become truly mainstream remains to be seen.