The Investigation Software Company

Indexer

The problem of having valuable information that is accessible only by keyword search is well understood. This often arises because the information is held in unstructured narrative form, such as:

  • a text field captured as part of a larger structured record
  • documents such as PDFs, emails and forensic reports that may be attached to a record of interest
  • an assortment of general intelligence material captured from third-party sources

Xanalys Indexer extracts the rich content held in unstructured text and represents it in structured form, which can then be used for more powerful matching and searching, and as the source for advanced analytical techniques within your system.

Xanalys Indexer implements Xanalys’ patented PowerIndexing process: Information Extraction technology that automatically extracts entities (e.g. occurrences of people, addresses, events, organizations, etc.) and relationships (e.g. “<person> lives at <address>”, “<person> is married to <person>”, “<person> has telephone number <telephone>”) from unstructured narrative text.
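
To make the idea concrete, the structured result of such a process can be pictured as a set of typed entities plus the relationships that link them. The Python sketch below is illustrative only: the `Entity` and `Relationship` classes, their field names and the sample narrative are assumptions made for this example, not Indexer's actual output format or API.

```python
from dataclasses import dataclass

# Hypothetical data model, for illustration only (not the Indexer schema).
@dataclass(frozen=True)
class Entity:
    kind: str   # e.g. "person", "address", "telephone"
    text: str   # the text as it appears in the narrative

@dataclass(frozen=True)
class Relationship:
    subject: Entity
    label: str  # e.g. "lives at", "has telephone number"
    target: Entity

# Sample narrative: "John Smith lives at 12 High Street. His number is 555-0100."
john = Entity("person", "John Smith")
home = Entity("address", "12 High Street")
phone = Entity("telephone", "555-0100")

relationships = [
    Relationship(john, "lives at", home),
    Relationship(john, "has telephone number", phone),
]

def related_to(entity, rels):
    """Return (label, target text) pairs for relationships whose subject is `entity`."""
    return [(r.label, r.target.text) for r in rels if r.subject == entity]

print(related_to(john, relationships))
```

Once the narrative is held in a form like this, a question such as "what do we know about John Smith?" becomes a simple lookup rather than a keyword search.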

Xanalys has many years of experience in the Information Extraction (also known as text mining) field, and Indexer is an integral technology in a number of larger systems such as Xanalys PowerCase. The extraction technology has been tailored over time to extract “interesting” information from text in a way that supports the analytical process in diverse domains.

How Indexer helps you process your textual content:

  • Automatically identify real world entities. Unlike keyword search, which relies on you already knowing what to look for, PowerIndexer automatically finds people, places, events, etc. and the relevant relationships between them.
  • Provide a unified view of your data. Instead of simply identifying different textual occurrences of an entity in a document, PowerIndexer automatically merges the entities that refer to the same real world object to create a single view.
  • Turn unstructured content into a structured knowledge base. Once the data is in structured form, you can apply more powerful analytical techniques to process it further.
  • Embed in your own applications. Provide your users with the tools to manipulate and use the extracted structured data within a native environment.

Features

Extract

  • Identify entities that are important to the investigative process from within unstructured text.
  • Extract entity attributes, including metadata about their position in the text.
  • Identify relationships between entities based on narrative content and proximity.
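
As a rough illustration of what extraction with positional metadata means, the Python sketch below finds telephone-like strings and records where each one occurs. The regular expression, the `extract_phones` function and the dictionary keys are all assumptions made for this example; Indexer's own recognition rules are far richer and are not shown here.

```python
import re

# Toy pattern for numbers such as "555-0100"; assumed for illustration only.
PHONE = re.compile(r"\b\d{3}-\d{4}\b")

def extract_phones(text):
    """Return each match together with its character offsets in the source text."""
    return [
        {"kind": "telephone", "text": m.group(), "start": m.start(), "end": m.end()}
        for m in PHONE.finditer(text)
    ]

sample = "Call 555-0100 or 555-0199."
for mention in extract_phones(sample):
    # The offsets let a caller jump back to the exact place in the narrative.
    print(mention["kind"], mention["text"], mention["start"], mention["end"])
```

Keeping the offsets alongside each extracted value is what allows an application to highlight the original wording or to reason about how close two entities are in the text.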

Infer

  • Identify the referent of pronouns, enabling person attributes to be attached to the correct entity.
  • Infer the details of underspecified events based on the narrative text that precedes them.
  • Decide when two references within the text refer to the same entity and therefore merge (or unify) them.
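
The merging (or unification) step can be pictured with a deliberately naive rule. In the Python sketch below, two person mentions are treated as the same entity when they share a surname and a first initial. The `merge_key` and `unify` functions and the rule itself are assumptions made for this example and do not describe Indexer's actual matching rules; the sketch also assumes every mention is a non-empty name.

```python
from collections import defaultdict

def merge_key(mention):
    """Naive matching rule (assumed): same surname and same first initial."""
    tokens = mention.replace(".", "").split()
    return (tokens[-1].lower(), tokens[0][0].lower())

def unify(mentions):
    """Group mentions whose keys match; each group stands for one real world entity."""
    groups = defaultdict(list)
    for mention in mentions:
        groups[merge_key(mention)].append(mention)
    return list(groups.values())

names = ["John Smith", "J. Smith", "Jane Doe", "john smith"]
for group in unify(names):
    print(group)
```

A rule this simple would also merge "John Smith" with "James Smith", which is why being able to change the matching rules matters in practice.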

Customize

  • Augment Indexer’s word lists with domain-specific information to aid the recognition of entities.
  • Customize the recognition process by selecting groups of matching rules to use for a given input.
  • Change the matching rules used to identify entities that refer to the same real world object.

Integrate

  • Integrate Indexer into a larger solution through its comprehensive COM API.
  • Perform all customizations using API functions before indexing begins.
  • Choose between different methods of passing text in and getting structured results back.

Transform

  • Transform the input text to include embedded XML-like structured entity definitions.
  • Transform the entities and attributes that Indexer outputs.
  • Transform event details into database format.
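
As an illustration of the first of these transformations, the Python sketch below wraps known entity spans in XML-like tags inside the original text. The `entity` tag name, its `type` attribute and the `embed_entities` function are assumptions made for this example, not Indexer's actual markup, and the sketch assumes the spans do not overlap.

```python
from xml.sax.saxutils import escape

def embed_entities(text, spans):
    """Wrap each (start, end, kind) span in an XML-like tag.

    Assumes spans do not overlap; surrounding text is escaped so the
    result stays well formed.
    """
    out, cursor = [], 0
    for start, end, kind in sorted(spans):
        out.append(escape(text[cursor:start]))
        out.append(f'<entity type="{kind}">{escape(text[start:end])}</entity>')
        cursor = end
    out.append(escape(text[cursor:]))
    return "".join(out)

text = "John Smith lives at 12 High Street."
spans = [(0, 10, "person"), (20, 34, "address")]
print(embed_entities(text, spans))
```

Embedding the structure in the text itself keeps the narrative readable while making each entity addressable by downstream tools.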