Data Processing

Processing technology and services that foster fast, efficient reviews

The production quality of electronic files is only as good as the tools used to process them. Consilio’s Data Processing service is designed for corporations and law firms that require fast, accurate metadata extraction, culling and deduplication. With over ten petabytes of data processed, we have the tools and best practices to process even the most complex data corpuses.

We use best-of-breed technology, augmented by our own innovative software, to process an array of nontraditional data types, including audio and multiparty chat transcripts such as Bloomberg® chat. Consilio’s Data Processing service is supported by our consultative, experienced project management and operations teams and can be delivered across the globe in our dedicated centres or on premises at our clients’ facilities.


  • Over ten petabytes of processed data, worldwide
  • Able to process over one terabyte per day, per project
  • Expertise in processing files in virtually any language
  • Support for audio, chat and custom, proprietary files

Data Processing Service Overview

Our Data Processing service is designed to facilitate an efficient review by homing in on the files most likely to be relevant and by trimming duplicate and extraneous files. The service includes:

  • Media Receipt and Verification: After data collection has occurred, Consilio will receive the media with a standardised client manifest in one of our processing facilities.
  • Cataloging: Object extraction will occur down to the lowest possible child level, along with creation of the data’s parent-child relationships and metadata extraction. The cataloging step includes deNISTing (filtering the extracted contents against known system and software files) and metadata culling.
  • Deduplication: The deduping protocol compares the unique DupeKey of each document either against the entire project corpus (global project-level deduping) or within the custodian corpus (custodian-level deduping). Unique documents will be flagged so they can be retained in the database. Duplicate documents will be withheld from promotion into the review software to reduce the total hosted volume, and each duplicate will keep a pointer to its master native document.
  • Reporting and Analysis: Clients get reports on processing exceptions as well as data composition, along with analysis and recommendations on how the data set may be further culled.
  • Import Into Review Software: Consilio will perform an import of the deduped, culled, processed data into the eDisclosure review platform of choice.
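
As a rough illustration of the deduplication step described above, the Python sketch below contrasts global project-level and custodian-level deduping. It is not Consilio's implementation: the Document class, the compute_dupe_key and deduplicate functions, and the choice of a SHA-256 content hash as the DupeKey are all assumptions made for this example, since the way DupeKeys are actually derived is not specified here.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Document:
    """Hypothetical record for one processed file."""
    doc_id: str
    custodian: str
    content: bytes
    dupe_key: str = ""
    master_id: Optional[str] = None  # None = unique; otherwise points to the master


def compute_dupe_key(content: bytes) -> str:
    # Assumption: the DupeKey is a SHA-256 hash of the raw bytes.
    # Real protocols often hash selected metadata fields as well.
    return hashlib.sha256(content).hexdigest()


def deduplicate(docs: List[Document], scope: str = "global") -> List[Document]:
    """Flag duplicates and return only the unique documents.

    scope="global" compares each document against the whole project corpus;
    scope="custodian" compares only within the same custodian's documents.
    """
    if scope not in ("global", "custodian"):
        raise ValueError("scope must be 'global' or 'custodian'")

    masters = {}  # comparison bucket -> doc_id of the first document seen
    for doc in docs:
        doc.dupe_key = compute_dupe_key(doc.content)
        bucket = doc.dupe_key if scope == "global" else (doc.custodian, doc.dupe_key)
        master = masters.get(bucket)
        if master is None:
            masters[bucket] = doc.doc_id  # first occurrence becomes the master
            doc.master_id = None
        else:
            doc.master_id = master  # duplicate keeps a pointer to its master
    # Only unique documents would be promoted into the review software.
    return [d for d in docs if d.master_id is None]


docs = [
    Document("A", "alice", b"quarterly report"),
    Document("B", "bob", b"quarterly report"),
    Document("C", "alice", b"quarterly report"),
    Document("D", "bob", b"meeting notes"),
]
global_unique = [d.doc_id for d in deduplicate(docs, scope="global")]
custodian_unique = [d.doc_id for d in deduplicate(docs, scope="custodian")]
```

In this sketch, global deduping keeps a single copy of the report across both custodians, while custodian-level deduping keeps one copy per custodian, which is why the hosted volume differs between the two options.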