Spreading Like Wildfire:
Emerging Data Sources Continue to Spark Governance, Compliance, Discovery Challenges
Concerns over emerging data sources have reached an all-time high.
After several years of steadily burning in the background of legal department challenges, the flames have caught on and grown, forcing in-house counsel and their law firms into fire-fighting mode across e-discovery, compliance monitoring, investigations, regulatory response and more. Over the last year, 62% of global general counsel confirmed they had experienced distinct new issues associated with collaboration tools, chat applications, file shares and other similar cloud-based systems, and nearly all (93%) expressed some level of concern about emerging data sources as a significant area of risk.
Glossary of Terms
- API: Application programming interface, typically created by technology companies to provide secure, consistent and trustworthy methods for extracting and exchanging data and developing solutions. APIs are often integral to collecting data in an investigation and converting it into a format that can be ingested and reviewed. In a forensic collection, APIs may be used, for example, to retrieve a native file and other content, capture specific metadata and/or retrieve versioning information.
- Audit logs: Documentation of activity within a system to record the occurrence of an event or change. Details contained within may be relevant to evidence discovery for disputes and investigations. For emerging data sources, audit logs may be difficult to obtain or may otherwise add complexity.
- Channel-hopping: Describes the behavior of starting a conversation in one venue (e.g., email) and continuing and/or concluding it in others (e.g., Slack, WhatsApp, etc.). This activity creates a complex web of interrelated data and messages dispersed among many disconnected platforms and devices. Increases data volumes and dilutes conversation context.
- Chat parsing: A highly technical, often customized, methodology used to convert short-form messages and other information and artifacts from emerging data sources into formats that are compatible with e-discovery and investigations tools and workflows, with enough context preserved that the content can be reviewed for relevance and meaning. May also be referred to as “documentization.”
- Cloud-specific metadata: Data that contains unique information about related files, such as timestamps, authors, file size, etc. It identifies properties of the file and may specify how the file should be handled when it’s accessed. For cloud-based data, metadata is often more detailed, complex and fluid.
- Connectors: Technology solutions that provide direct access to applications that contain data that needs to be preserved, collected, processed, analyzed and/or reviewed, and allow for data and documents to be accessed and extracted in their native form.
- Data spillage: Exposure, breach, or unauthorized or accidental sharing of sensitive information, which may occur from inappropriate or ungoverned channel-hopping and/or use of emerging data sources and off-channel communications.
- Digital whiteboarding: Virtual workflow for traditional whiteboarding processes, providing a dynamic online canvas for collaboration. Such applications are provided as integrations with common cloud-based productivity suites such as Microsoft 365 (Teams) and Google Workspace. Creates new forms of dynamic documents.
- Dynamic documents: Files that are hosted in cloud-based systems or other fluid environments, which allow for continuous and real-time editing, sharing and replication. Often shared between numerous individuals and groups with varying levels of access and permissions, ranging from viewer to contributor to editor to owner, etc.
- Ephemeral messaging: Mobile-to-mobile transmission of messages and media that are automatically erased upon receipt, usually immediately or after a short period of time.
- Linked content: Dynamic documents and other “live” files that are shared between parties as links rather than static attachments. Also referred to as “pointers” in the e-discovery industry.
- Off-channel communications: Any business-related communications, messaging and file sharing that take place via unsanctioned, unpreserved, unmonitored or “shadow IT” systems. Often used to refer to ephemeral messaging and other forms of mobile chat.
- Permissioning: The creation and multiplication of unique and widely varying shared access parameters for a dynamic, cloud-based document, across any number of users or groups. Permissions may vary between participants, active participants, inactive participants and other types of users.
- Reactions: Non-text responses to messages (e.g., thumbs up or heart reactions), typically in chat and collaboration tool environments, to acknowledge a message and indicate meaning. Reactions are logged in many emerging data sources or text records, and may be retrieved/collected as data artifacts during the digital forensics process. Processing and parsing them for review and to decipher meaning may require complex technical analysis.
- Shared access: Nuanced roles for any number of users relating to a dynamic, cloud-based document. Adds complexity to custodian identification and data authentication, while also increasing data volumes. Shared access may vary between participants, active participants, inactive participants and other types of users.
- Short-form message: Describes any message or string of messages, media and reactions shared between parties via an emerging data source or chat application. These are significantly different in format, tone, purpose, contents, structure, context and use from traditional message formats (e.g., email). Short-form messages require unique and technical workarounds to process and parse together in a manner compatible with e-discovery and investigations tools and workflows.
- Versioning: The proliferation of numerous and persistent versions of a single original file (stemming from dynamic document environments, shared access and permissioning), which may reflect countless changes and/or a wide range of varying access and permission parameters. Complicates the process of establishing a clear and accurate historical view of content.
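
To make the chat parsing ("documentization") entry above more concrete, the following is a minimal sketch of one possible approach: grouping short-form messages by channel and calendar day, then rendering each group, with its reactions, as a single reviewable text document. The `Message` fields, the `documentize` function and the one-document-per-channel-per-day grouping rule are all assumptions made for illustration; real platform exports and review tools structure this data differently.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Message:
    # Hypothetical export fields; real platforms name and structure these differently.
    channel: str
    author: str
    sent_at: datetime
    text: str
    # Reaction name -> list of users who applied it (assumed shape).
    reactions: dict = field(default_factory=dict)


def documentize(messages):
    """Group messages by (channel, calendar day) and render each group as one text document."""
    groups = defaultdict(list)
    for msg in messages:
        groups[(msg.channel, msg.sent_at.date())].append(msg)

    documents = {}
    for (channel, day), msgs in sorted(groups.items()):
        # Restore chronological order so the conversation reads top to bottom.
        msgs.sort(key=lambda m: m.sent_at)
        lines = [f"Channel: {channel}", f"Date: {day.isoformat()}", ""]
        for m in msgs:
            lines.append(f"[{m.sent_at:%H:%M}] {m.author}: {m.text}")
            # Keep reactions next to the message they respond to.
            for name, users in sorted(m.reactions.items()):
                lines.append(f"    reaction :{name}: from {', '.join(users)}")
        documents[(channel, day.isoformat())] = "\n".join(lines)
    return documents
```

Grouping by day is only one choice; a real workflow might instead split on periods of inactivity or on thread boundaries, depending on how reviewers need to see conversation context.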