Daniverg’s Blog

May 16, 2009

Getting to know SARA and Xaira better…

Filed under: LR — Daniel Vergara @ 9:34 am
‘SARA (SGML Aware Retrieval Application) was developed specifically for access to the BNC in a Microsoft Windows environment. It is freely available to all BNC licensees and also for registered users of the BNC Subscription service hosted by the British Library. A copy of SARA is delivered with every copy of the BNC World corpus. You can also download the latest version here. The SARA webpage offers more information about SARA.

The Xaira program derives from SARA but has been developed further. It can be used on all well-formed corpora in XML. The BNC XML Edition, BNC Baby, and BNC Sampler corpora are delivered with a copy of Xaira. You can also download the latest version of XAIRA from SourceForge.net. More information about Xaira can be found on the Xaira webpage.’ Information retrieved at 10:43, May 16, 2009 from http://www.natcorp.ox.ac.uk/tools/index.xml.


SARA allows investigations on the content and structure of a corpus. The precise searches and enquiries possible on a given corpus will of course depend upon the nature and completeness of the markup applied to it. However, indicatively, SARA supports features such as the following:

- Searches on words, truncated words and phrases
- Searches on SGML tags and attributes
- Combinatorial Boolean operations
- Frequency counts
- Lexicon, to allow identification of similar words (e.g. gumboot, gum-boot, gum-boots)
- Storing searches
- Limiting the scope of queries
- Presentation of results:
  - With or without SGML markup
  - Page or concordance format
  - Optional use of colour to enhance display
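
To make two of these features more concrete (truncated-word searches and the concordance format), here is a minimal Python sketch. It is not SARA's code or its query syntax: the `concordance` function, the trailing `*` wildcard and the sample sentence are all my own assumptions, made up for illustration.

```python
import re

def concordance(text, pattern, width=20):
    """Return keyword-in-context lines for a word or a truncated word.

    A trailing '*' in `pattern` stands for any word ending (e.g. 'gum*').
    This wildcard convention is an assumption, not SARA's own syntax.
    """
    stem = re.escape(pattern.rstrip("*"))
    # Allow letters, digits and hyphens after the stem when truncated.
    suffix = r"[\w-]*" if pattern.endswith("*") else ""
    regex = re.compile(r"\b" + stem + suffix + r"\b", re.IGNORECASE)
    lines = []
    for m in regex.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{m.group()}] {right:<{width}}")
    return lines

# Made-up sample text, not taken from the BNC.
sample = ("He lost a gumboot in the mud. Her gum-boots were new. "
          "Chewing gum is banned here.")
hits = concordance(sample, "gum*")
for line in hits:
    print(line)
```

Each hit is printed with a little context on both sides, which is the basic idea behind a concordance view.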

By way of illustration, some of the markup in the BNC relates to the social class of the speaker (in the case of spoken words); markup is also used to signify parts of speech. Thus, in the case of the BNC, SARA can be used to formulate a query equivalent to: How often do speakers of social class C1 use the word “input” as a verb?
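
The kind of query described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how SARA or Xaira work internally. The tiny XML sample and its element and attribute names (`person`, `soc`, `u`, `who`, `w`, `pos`, `lemma`) are a simplified, hypothetical markup that I invented for the example; the real BNC markup is richer and may differ. The count is done by lemma, so inflected forms such as "inputting" are included.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified corpus: not the actual BNC schema or data.
SAMPLE = """
<corpus>
  <speakers>
    <person id="S1" soc="C1"/>
    <person id="S2" soc="AB"/>
  </speakers>
  <text>
    <u who="S1"><w pos="VERB" lemma="input">input</w> <w pos="SUBST" lemma="datum">data</w></u>
    <u who="S1"><w pos="SUBST" lemma="input">input</w></u>
    <u who="S2"><w pos="VERB" lemma="input">inputs</w></u>
    <u who="S1"><w pos="VERB" lemma="input">inputting</w></u>
  </text>
</corpus>
"""

def count_word(root, lemma, pos, social_class):
    """Count words with the given lemma and part of speech,
    restricted to utterances by speakers of one social class."""
    # Collect the ids of all speakers in the requested social class.
    speakers = {p.get("id") for p in root.iter("person")
                if p.get("soc") == social_class}
    count = 0
    for u in root.iter("u"):
        if u.get("who") not in speakers:
            continue  # utterance by a speaker outside the class
        for w in u.iter("w"):
            if w.get("lemma") == lemma and w.get("pos") == pos:
                count += 1
    return count

root = ET.fromstring(SAMPLE)
n = count_word(root, "input", "VERB", "C1")
print(n)
```

The point is that the answer depends entirely on the markup: without the speaker's social class and the part-of-speech tags in the corpus, this question could not be asked at all.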

Xaira is essentially SARA with more features: it is a more recent, modified version of SARA that can be used on any well-formed XML corpus, not only the BNC.

