The Database Dilemma in ESI Processing and Review
As both sides in discovery become more knowledgeable about electronically stored information (ESI), they also become more knowledgeable about which types of ESI often get overlooked during the EDRM identification, preservation and processing phases. Today we are going to talk about database (DB) files.
It is very common during a data collection to collect database files, even from a user’s laptop. These database files are typically not human-readable unless they are opened in their native application, and hence may be difficult to review or produce in any form.
WHAT TYPE OF DATABASE FILES MAY FALL TO THE FLOOR?
According to the folks at file-extensions.org, the most common databases are SQL databases such as Microsoft SQL Server, MySQL, Firebird, SQLite (very common on Apple iOS devices such as iPads), Oracle, IBM DB2, Microsoft Access, Microsoft Visual FoxPro, dBASE, and FileMaker.
If you send your data out to get “processed” without specific direction on what to do with database files, including popular database-related file extensions such as .db, .dbf, .accdb, .mdb, .mdf, .cdb, .fdb, .csv, and .sql, you could:
1. Get nothing back
2. Get an error report
3. Get some meaningless images
4. Get some slip sheets
5. Get some TIFF images that look like spreadsheets, making you feel like you got what would be expected (and be wrong)
6. Get charged a lot and not have a defensible log of what was done with your data
7. All of the above
So, what should you do? First, the Sedona Conference has published some industry standards on how to deal with DB files. http://electronicdiscovery.info/forum/e-discovery-news/27564-sedona-conferencea-publishes-database-principles-electronic-discovery.html
I believe the Sedona guidelines are a great starting point for you and your case team. They recommend:
1. Absent a specific showing of need or relevance, a requesting party is entitled only to database fields that contain relevant information, not the entire database in which the information resides or the underlying database application or database engine.
2. Due to differences in the way that information is stored or programmed into a database, not all information in a database may be equally accessible, and a party’s request for such information must be analyzed for relevance and proportionality.
So, you may not need to produce your databases. However, you need to know what you have and reach agreement with the other party on production format, cost shifting, etc.
If you do produce the database, you can’t just assume the spreadsheet-looking images your imaging tool generated are acceptable. The guidelines continue:
4. A responding party must use reasonable measures to validate ESI collected from database systems to ensure completeness and accuracy of the data acquisition.
5. Verifying information that has been correctly exported from a larger database or repository is a separate analysis from establishing the accuracy, authenticity, or admissibility of the substantive information contained within the data.
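One way to approach the completeness check in principle 4 is to compare the number of rows the source database reports against the number of rows that actually landed in the export, and to record a hash of the export for your processing log. The sketch below is only an illustration under stated assumptions: it uses a SQLite file as a stand-in for whatever database you collected, and the function name `validate_export` and the shape of its result are hypothetical, not part of the Sedona guidance or any particular tool.

```python
import csv
import hashlib
import sqlite3


def validate_export(db_path, count_query, csv_path):
    """Compare the source row count with the rows in a CSV export.

    Assumes the CSV has exactly one header row. count_query is a
    hypothetical COUNT(*) query matching the filter used for the export.
    """
    conn = sqlite3.connect(db_path)
    try:
        source_rows = conn.execute(count_query).fetchone()[0]
    finally:
        conn.close()

    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the header row
        exported_rows = sum(1 for _ in reader)

    # A hash of the export file gives the log a fixed reference point.
    with open(csv_path, "rb") as handle:
        digest = hashlib.sha256(handle.read()).hexdigest()

    return {
        "source_rows": source_rows,
        "exported_rows": exported_rows,
        "sha256": digest,
        "complete": source_rows == exported_rows,
    }
```

Matching counts only speak to the completeness of the export. As principle 5 notes, they say nothing about whether the underlying data is accurate, authentic, or admissible.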
To make any decisions about what to do with your databases, you need to know what you have. You should be able to use an early case assessment module to quickly and cheaply get a listing of all the file types in your collection (including anything contained in email or zip files). Then you can:
• Determine whether they are potentially relevant
• Present your proposed production approach to the other side
• Review the databases for responsiveness and privilege in your review tool
• Mark and produce the DBs as agreed to with the other side
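If you want a rough sanity check on the file-type listing before (or alongside) what your early case assessment tool reports, a short script can count extensions in a collection folder and flag the database-related ones. This is a minimal sketch, assuming the collection sits on a local file system; it only looks inside `.zip` archives (not email containers such as PST files, which need specialized tooling), and the names `inventory_extensions` and `database_candidates` are made up for illustration.

```python
import zipfile
from collections import Counter
from pathlib import Path, PurePosixPath

# Extensions named earlier in this post; extend the set for your own matter.
DB_EXTENSIONS = {".db", ".dbf", ".accdb", ".mdb", ".mdf", ".cdb", ".fdb", ".csv", ".sql"}


def inventory_extensions(root):
    """Count file extensions under root, looking one level into .zip archives."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        counts[path.suffix.lower()] += 1
        if path.suffix.lower() == ".zip" and zipfile.is_zipfile(path):
            with zipfile.ZipFile(path) as archive:
                for name in archive.namelist():
                    if not name.endswith("/"):  # skip directory entries
                        counts[PurePosixPath(name).suffix.lower()] += 1
    return counts


def database_candidates(counts):
    """Return only the extensions that look like database files."""
    return {ext: n for ext, n in counts.items() if ext in DB_EXTENSIONS}
```

Extension matching is only a first pass: a renamed or extensionless database file will slip through, so treat the result as a prompt for questions to your vendor rather than a definitive inventory.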
If your review tool cannot open the databases for review or querying, the tool should allow you to launch the native database application (e.g., SQL Server or Access) so you can run queries and export the results to a format such as CSV.
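For SQLite files, which commonly turn up in mobile-device collections, the query-and-export step can be sketched with Python's standard library alone. This is an illustration under assumptions, not a prescribed workflow: the function name `export_query_to_csv` is hypothetical, and the query, table, and column names would come from your own agreement with the other side about which fields are relevant.

```python
import csv
import sqlite3
from pathlib import Path


def export_query_to_csv(db_path, query, params, csv_path):
    """Run a query against a SQLite file opened read-only and write rows to CSV."""
    # mode=ro opens the collected file read-only so the export cannot alter it.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    row_count = 0
    try:
        cursor = conn.execute(query, params)
        headers = [column[0] for column in cursor.description]
        with open(csv_path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(headers)
            for row in cursor:
                writer.writerow(row)
                row_count += 1
    finally:
        conn.close()
    return row_count
```

Returning the row count makes it easy to note in your log how many records each query produced. Working from a copy of the collected file, rather than the original, is still the safer habit even with a read-only connection.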
WARNING. Some early case assessment tools and methodologies suppress these file types before the legal team has had an opportunity to determine if they are of interest.