Enterprise Search and Text Analytics for a Publication Giant



About the Client

The client is a scientific publication company.

Business need

The client asked ALTEN Calsoft Labs to build a solution that could:

  • Store a large volume of unstructured publication data
  • Retrieve relevant documents based on search terms

Solution delivered

ALTEN Calsoft Labs’ expert team delivered the following to meet the client’s requirements.

  • Developed search grammars that support AND, OR, NAND, 1-, 2- and 3-character search, phrase search and NEAR, to name a few.
  • Improved search performance
    • Response time depends on the query string
    • A query string that occurs in more documents takes longer to search
    • Search results return in roughly 1.5 to 10 seconds
    • Filters are generally fast, usually around 2 seconds
  • Built analysis views (Semantics and Stats pages) that analyze the data against preset KPIs or definitions
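
The search operators listed above can be illustrated with a small positional inverted index. This is only a minimal sketch in Python, not the delivered implementation: the function names, the sample documents, and the NAND semantics (every document that does not contain both terms) are assumptions made for illustration.

```python
from collections import defaultdict

def build_index(docs):
    """Map each term to {doc_id: [positions]} (a positional inverted index)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index

def docs_with(index, term):
    """Set of document ids that contain the term."""
    return set(index.get(term.lower(), {}))

def search_and(index, a, b):
    return docs_with(index, a) & docs_with(index, b)

def search_or(index, a, b):
    return docs_with(index, a) | docs_with(index, b)

def search_nand(index, all_ids, a, b):
    # Assumed semantics: every document that does NOT contain both terms.
    return set(all_ids) - search_and(index, a, b)

def search_near(index, a, b, max_gap):
    """Documents where a and b occur within max_gap positions of each other."""
    hits = set()
    for doc_id in search_and(index, a, b):
        pos_a = index[a.lower()][doc_id]
        pos_b = index[b.lower()][doc_id]
        if any(abs(i - j) <= max_gap for i in pos_a for j in pos_b):
            hits.add(doc_id)
    return hits

def search_phrase(index, phrase):
    """Documents where the terms of the phrase appear consecutively, in order."""
    terms = phrase.lower().split()
    if not terms:
        return set()
    candidates = set.intersection(*(docs_with(index, t) for t in terms))
    hits = set()
    for doc_id in candidates:
        for start in index[terms[0]][doc_id]:
            if all(start + k in index[t][doc_id] for k, t in enumerate(terms)):
                hits.add(doc_id)
                break
    return hits

# Hypothetical sample data, not taken from the client's corpus.
docs = {
    1: "protein folding in yeast cells",
    2: "yeast genome sequencing methods",
    3: "protein structure and genome analysis",
}
index = build_index(docs)

print(search_and(index, "protein", "yeast"))         # {1}
print(search_nand(index, docs, "protein", "yeast"))  # {2, 3}
print(search_phrase(index, "protein folding"))       # {1}
print(search_near(index, "protein", "genome", 3))    # {3}
```

Because AND and NEAR both start from the intersection of the terms' document sets, a term that occurs in many documents leaves more candidates to check, which is consistent with the longer search times noted above for such query strings.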

Business benefits

  • Easy discovery of relevant documents
  • Boolean Search Support

Technology stack

  • Platform: Hadoop
  • File System: Hadoop Distributed File System
  • Paradigm: MapReduce
  • Machine Learning Tools: Mahout
  • Language: R
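
To show how the MapReduce paradigm fits a search workload, the sketch below builds a term-to-documents index in three steps: map, shuffle, reduce. It is plain Python that only imitates the paradigm on one machine. It does not use Hadoop, and the function names and sample documents are hypothetical. The source does not state how the client's index was actually built.

```python
from collections import defaultdict

def map_phase(doc_id, text):
    """Mapper: emit one (term, doc_id) pair per distinct term in a document."""
    for term in set(text.lower().split()):
        yield term, doc_id

def shuffle(pairs):
    """Group mapper output by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(term, doc_ids):
    """Reducer: collapse the values for one term into a sorted posting list."""
    return term, sorted(set(doc_ids))

def build_inverted_index(docs):
    """Run map, shuffle and reduce in sequence over an in-memory corpus."""
    pairs = (pair for doc_id, text in docs.items()
             for pair in map_phase(doc_id, text))
    return dict(reduce_phase(term, ids) for term, ids in shuffle(pairs).items())

# Hypothetical sample data, not taken from the client's corpus.
sample_docs = {
    1: "protein folding in yeast cells",
    2: "yeast genome sequencing methods",
    3: "protein structure and genome analysis",
}
inverted = build_inverted_index(sample_docs)

print(inverted["yeast"])   # [1, 2]
print(inverted["genome"])  # [2, 3]
```

On a real cluster the mappers would read document splits from the distributed file system and the framework would perform the shuffle. Here both are simulated in memory to keep the example self-contained.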
