When it comes to chemistry terminology, one person’s sodium chloride is another’s salt. Image courtesy Snack/stock.exchng
Like any other language, the language of chemistry lacks uniformity.
New words are invented, old words fade out of use, styles of writing change and some writers suffer from less-than-perfect grammar.
What’s more, there is no single way of referring to a chemical: one person’s salt is another’s sodium chloride (and yet another’s NaCl). To search for a specific word in a chemistry text, a researcher must take into account every permutation of that word and every possible mistake in representing it.
This is highly inefficient, and with more sources of chemistry information becoming available every day, it’s not getting any easier to find relevant information.
But now, there’s OSCAR to help. Also known as “Open Source Chemistry Analysis Routines,” it is an open-source software package developed at Cambridge University for the semantic annotation of chemistry papers. It is closely integrated with OMII-UK, an organization that seeks software solutions for e-research.
“I see chemistry as a language,” explained Peter Murray-Rust, leader of the OSCAR research group at the university, “one that is communicated in natural language, graphics, formulae and equations.” This insight led to the development of OSCAR for reviewing literature to identify information relevant to chemistry research. The software has gained some prestigious advocates, including the Royal Society of Chemistry.
The software’s primary purpose is to recognize concepts in text that have a precise meaning. Murray-Rust says that it not only recognizes chemical names, adjectives and processes, but can also link them to their meanings using an ontology: a rigorous and exhaustive organization of a knowledge domain, usually hierarchical, that contains all the relevant relationships.
By using such a system, the researcher is freed from having to hunt for every permutation of a specific word, because OSCAR automatically links the word with its alternatives.
It can also enrich text searches by providing further information about the terms it identifies, such as chemical properties and molecular structure.
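To make the idea concrete, here is a minimal sketch in Python of what linking a word with its alternatives might look like. It is not OSCAR's code or API: the `SYNONYMS` and `PROPERTIES` tables and the `annotate` function are hypothetical names invented for this illustration, and a hand-written dictionary stands in for the far richer ontology the article describes.

```python
import re

# Hypothetical synonym table: canonical name -> ways it may appear in text.
# A real ontology would be far larger and hierarchical.
SYNONYMS = {
    "sodium chloride": ["salt", "sodium chloride", "NaCl"],
}

# Hypothetical extra information attached to each canonical name.
PROPERTIES = {
    "sodium chloride": {"formula": "NaCl"},
}


def build_pattern(variants):
    """Compile one case-insensitive, whole-word pattern for all variants."""
    # Try longer variants first so multi-word names are matched in full.
    ordered = sorted(variants, key=len, reverse=True)
    alternation = "|".join(re.escape(v) for v in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def annotate(text):
    """Return (position, matched text, canonical name, properties) tuples."""
    hits = []
    for canonical, variants in SYNONYMS.items():
        for match in build_pattern(variants).finditer(text):
            hits.append(
                (match.start(), match.group(0), canonical,
                 PROPERTIES.get(canonical, {}))
            )
    # Report hits in the order they appear in the text.
    return sorted(hits, key=lambda hit: hit[0])


if __name__ == "__main__":
    sample = "Dissolve the salt, then weigh the NaCl and sodium chloride."
    for position, found, canonical, info in annotate(sample):
        print(position, found, "->", canonical, info)
```

In this toy version, a search for any one of the three spellings resolves to the same canonical entry, which is the property that spares the researcher from listing every variant by hand.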