Video: How to Collect Data Proportionally Without Blowing Budget | Duration: 2120s | Summary: How to Collect Data Proportionally Without Blowing Budget | Chapters: Welcome and Introduction (4.8s), Product Overview: Onna (241.7s), Data Collection Methods (577.7s), Onna Data Collection (903s), Data Source Prioritization (1627.8s), Modern Attachments Explained (1756.3s), Concluding Logikcull Discussion (1827.4s), Q&A and Conclusion (1878.3s)
Transcript for "How to Collect Data Proportionally Without Blowing Budget":
Hello everyone, and welcome to this webinar, How to Collect Data Proportionally Without Blowing Your Budget. This webinar is part of a whole series that Andrew and I are doing on internal investigations. Today we will focus on the collection part of internal investigations, where to get your data from, and on the products. This webinar is being recorded, so afterwards you can review it again or share it with your colleagues. On the right side of your screen you'll see a Q&A option where you can list your questions, and we will answer those after the presentation. You will also see a request-a-demo button on your screen; if you would like a personal demo of the solution we show today, click that button and leave your details, and we will contact you for a meeting to show you personally the information you would like to see. A quick introduction: my name is Daniel Schwing, director of product strategy at Reveal, primarily focused on product management. And with me again today is Andrew, who's going to be my webinar buddy.

Thanks, Daniel. Hi, I'm Andrew Punter. I've been at Reveal now for two and a bit years. I work in the strategic sales engineering department, helping show Reveal and all of our Reveal products to wonderful prospects like yourselves. So thanks, Daniel.

Yes, so let's get started. We'll have a quick introduction about Reveal and the products, and after that Andrew will do a more extensive demo of the collection options that we offer. We have covered this introductory section more often in this series, so I will not talk too much about it. But just to summarize: Reveal develops end-to-end solutions for eDiscovery.
So from start to finish we are able to help you with your eDiscovery cases, and eDiscovery cases can be litigation cases but also internal investigations, freedom of information requests, and other types of use cases where you need to investigate information and find the truth. As for the company timeline: Reveal has acquired several companies and technologies over the years that are integrated with each other. Just to name a few of the important ones: the Brainspace acquisition we did in 2021, for assisted review, was very important, and that product is integrated into our solutions. More recently there was the Onna acquisition, which allows us to connect to a multitude of data sources to collect information. That will also be one of the focus areas today, where we'll focus on the collection of data. If you look at the product overview that supports end-to-end eDiscovery: we have Onna, the product for connectivity to collect data, which can connect to various cloud sources directly. We have the solution for legal holds, focused more on the US market, to notify potential custodians and also do interviews and send out questionnaires. Then we have Logikcull, a very simple and easy-to-use solution for your eDiscovery needs, simple in review: you can quickly learn the product and get started with it. Then we have Reveal Review, the AI-based review platform, where we have a lot of features leveraging the latest AI technology. One important part there is Ask, Reveal's GenAI solution that lets you chat with your data or ask it questions in natural language; the GenAI system will provide you with answers. And last but not least there is TrialDirector, a tool that helps you present your case in court, so it helps with the presentation of your evidence.
Today's focus is on data collection, the data collection capabilities of Onna and Logikcull, and of course also the way these two products integrate with each other. How can we leverage Onna in combination with Logikcull, or use them separately? That's one of the questions we will answer today. A little bit more about Onna: Onna has a lot of data connection capabilities. We offer over 25 out-of-the-box connectors, for example for Google Workspace, Microsoft Office 365, Zoom, Dropbox, and other applications which store data, and our connectors help you collect the data from these systems more easily. After that, we also make sure that all the data is processed and made searchable, so you can quickly search through all the information that is collected. Then you can move the data into your review platform, for example Logikcull. Logikcull is a very easy-to-use solution. It integrates with Onna, of course, and it also allows you to do collections of data. Very importantly, it has an advanced processing engine to process all your data and make everything searchable, and it gives you a very easy-to-use interface to search, but also to apply labels and tags to your documents, and in the end to produce your data in the proper format to hand over to the parties that need the information. Like I said before, today we will discuss the data collection part of investigations, and there are a couple of challenges there that we want to address, which Andrew is going to show us. First, of course, the growing number of data sources: we see more and more systems that people use besides the standard office environment; products like Slack are used a lot, and many people struggle with getting data out of these types of sources.
So the various sources, and also mobile devices and others, can be an issue and are a challenge in many investigations. Another problem is that you can collect too much data. You may not want to collect, for example, a full mailbox, but only parts of a mailbox from a specific period, because if you simply collect all the mailboxes, that may lead to excessive costs. And of course, if you have a lot of data in your system, you also have to review or research more data. Another challenge is that if you focus too narrowly on specific data that you want to collect, maybe based on keywords, then you might miss key evidence. So there is a risk of missing out on data, and you do want to prevent that. And of course, various countries in Europe also have different regulatory requirements. Proportionality of data collection is one of them, but also the GDPR and rules on personal data collection; all these kinds of regulations can add complexity. Now, if we look at the types of collection that we see, we can divide them into targeted collections and broad collections. A targeted collection is where you really focus on specific data that you need to collect. I've seen some of our clients actually review the data first, before they add it to their eDiscovery system, so they are highly focused on adding only data that is relevant to the case. That is time-consuming upfront, but it may limit cost and speed up the processing, since it covers only the relevant data. But of course, with that strong focus, you may also miss certain parts of the evidence. So a targeted collection is not always the right approach, and in some cases a broader collection is preferred to ensure that you don't miss any of the data.
The downside there, of course, is that it can lead to a lot of collected data, with a lot of processing costs within your eDiscovery process, and that's something you would also like to prevent. Now, one of the things Andrew will show us later is how to combine these two: a proportional approach, starting broad and filtering down efficiently, with both Onna and Logikcull. And I think it's over to you, Andrew, for the demo.

Thanks, Daniel. So yes, we're going to be talking about data collection from two of our platforms, and towards the end I'll show you how we go from broad into very targeted and filtered, so that you get the safety of being able to do a broad collection, but the cost saving of just having a targeted one. We're going to start in Logikcull. On previous webinars we've shown you some of the functionality of Logikcull review; now we're going to talk a bit more about Logikcull uploads. We are currently in a project I've just made, and there's no data in this project yet. We are at the upload stage of Logikcull. By creating a new upload, I can do an automatic file upload; however, for collection purposes, I'm going to be using our cloud uploader. Our cloud uploader can connect into four of what we believe are the most prominent business tools within most organizations, and you can very easily target a specific user at a specific time. One of the more challenging data sources is Slack, and we commonly see requests like: I need to collect a person's IMs, on Slack and Teams, as well as their email accounts, from a certain period. Within Logikcull, I can simply do that. I can connect into Slack. I don't need to be in IT or tap my IT department; I can do this as a member of compliance, HR, or legal, assuming that the right permissions have been set within your company and within your policies.
Now, coming to our wizard, I can use a credential which gives me access to identify different people. Let me identify Steve Pritchett, and I can choose what type of his Slack data I want: direct messages, public channels, private channels, including attachments as well. I can assign my custodian to the upload, make sure I give it a name so I can refer back to it later, and apply a date range filter. Here I'm going to be looking at the first part of 2021. This is quite a common use case in internal investigations within companies and corporations, but I've instantly identified my date range and my person. This now goes into Logikcull's automatic processing, and we're going to upload directly from here. Now, if I want to continue adding new data sources, I can come in here and, for example, identify Steve Pritchett's mailbox. Again, I can identify the date range that I want to look at; maybe I'm only focusing on that specific month. I can identify my custodian and, again, name it as before. One nice thing, before I hit go, is that I can see how many emails are actually going to come across into my Logikcull review tool, making the import for emails incredibly quick and also proportionate: you can see how much data is going into your system before we do it. This will now go into processing and uploading before I can go further into review and load that data in. Now, the limitation here is that I can only look at targeted people and times. If I have a huge number of custodians and data sources, this can become quite laborious, which is where Onna can help you out. Onna offers solutions towards the left of the EDRM. I have the ability to connect to multiple data sources across my business and search them all the same way. Onna will continue to build indices over specific data sources, so it will mirror what is in your production environment.
It will also follow retention policies, so that you're not collecting data that no longer needs to be retained. I'm going to come into my Gold project and see the different data sources I am already connected to. Now, if I wanted to add more, I can do so easily by clicking the new data source button. We can see the list of all the different low-code connectors I can connect into. Maybe I'm after more emails; maybe my company has multiple different tenants of Microsoft 365. This is quite a common problem, because if I had this in Logikcull, I would have to do multiple different connections and collections for a single custodian if they had multiple tenants. Within Onna, I can create indices over all of these tenants and very quickly search for a specific person. If I come into the Slack Enterprise connector, I can see the two different types of collection, or synchronization, I can do. I can do a one-time sync: a snapshot of data between one point in time and another, perfect if I'm after a targeted collection over a specific period. The other option is auto-sync and archive. This will essentially take your Slack and continue to build an ongoing archive; I can grow it incrementally, meaning my Slack archive mirrors what is currently in production. I don't need to continually re-upload, which means incredibly fast collection, search, and ECA of my data prior to it going into a review tool. At the top you'll see our search bar. Now, many people will come in and say: I need to identify all documents within my business that hit on specific key terms. Before searching across the data, I may need to report to internal compliance, and this is where our search term reports come in. I've already created one, in true Blue Peter fashion, called internal investigation.
I can see how many documents I've run this over, as well as how many documents are inclusive in my search. These keywords have been defined by counsel or by my legal team, and these are the terms I need to search over and collect data for prior to my team doing a review. I can export this in a reportable format and make sure I'm not over-collecting data. Once we're happy with the search terms, the search can be run. In my search bar, I've recreated that search term report using my search builder. I can add additional fields and additional filters in here: maybe they're date-restricted, maybe I'm only after specific people's data or specific participants. By searching across this data, I can see the different places the data is coming from. I have SharePoint, Outlook, and Gmail, all within one search, and all within one search syntax as well. This is incredibly powerful when it comes to saving time as well as identifying relevant material. If I jump into one of the documents, I can pre-review to make sure these keywords are actually hitting on responsive documents. I can see within the extracted text where my keywords are hitting, and I can also do an elementary review and inspect my metadata. I can tag this for export. Once I've had a look at my data and filtered down to any further requirements, I can simply save or tag these documents, save this query, or copy it into a folder. I can also export these documents ready for further review and analytics. Now, this is where we have the beauty of having both targeted and broad-spectrum collections within the Reveal suite. By clicking on export, I have a vast number of different things I can do. I can export it directly to be loaded into a review tool, and for chat data I can generate RSMF, again ready to go into a review tool.
But one of the cool things here is that I can export directly to Logikcull. I've set up my connector with Logikcull, I'm now identifying the project, which is the one we were in earlier, and I'm going to direct it to a folder. I can also apply notifications. Just before this webinar started, I kicked off this export, and you can see that as the export progresses, we give you metrics to make sure you know exactly what's happening, as well as an export status. This is now going directly into Logikcull. So within a couple of hours of an internal investigation request, I've identified my data, my date range, and any keywords that need to be collected; the population has been set, and I've pushed it into a review tool incredibly quickly. Daniel, back over to yourself.

Yes, thank you, Andrew. So just to recap what we saw: you started out in Logikcull doing a really targeted collection for Steve Pritchett, if I recall correctly. (That's correct.) So focusing on a single person. Then you showed Onna, where initially you showed that you can also do targeted collections there, but over larger datasets and time periods. But what I really liked was the auto-sync option. Basically, with auto-sync, from the data you have in your organization you build a ready-to-use, eDiscovery-ready archive.

Absolutely. We're seeing this more and more commonly. I keep coming back to the scenario involving these two tools, because these requests are becoming an increasing challenge for businesses. Many people will come in with a DSAR request: I need to know where my name appears in all of the company's data. This is an incredibly big ask for a corporation.
To do that in a classic eDiscovery tool, you would need to bring in every single custodian and identify where, specifically, the one or two documents within their emails and IMs containing my name would appear. In Onna, that archive is already created: I can run that keyword search and then upload the results into Logikcull to prepare the data for production, as we've seen before, with automatic PII detection ready for review and export.

Yeah, and that saves you a tremendous amount of time in collection, right? You have auto-synced your archives; the only thing you need to do is go into Onna and perform your search with the queries and terms you want, kind of a batch search option. And when you're satisfied with the results, you just export to Logikcull, and your review team, or your HR department in the case of DSARs, can start reviewing your documents. That is a tremendous amount of time you save by letting auto-sync do an upfront, broad collection. Of course, it is also important to remind people that Onna can be set up in a very secure way, making sure that not everybody has access to all your data, just to prevent any questions on that: you can really limit the number of users and people who can access the system. Let me go back to the presentation. Just a few things to go over after the demo. Starting broad: we discussed that just now. Then refine, and don't over-collect into your review system, because then you need to review too many documents; make sure that you can filter upfront. And of course, leverage the automation in collection: if all your data is synced automatically, you are ready to go, with no configuration left to set up. It's already done. Maybe also a good tip.
We did not really discuss this, but there are also always some data sources that are more important than others. From my personal experience, I know that many clients focus on people's mailboxes and leave the SharePoint side alone, because most of the documents in SharePoint have been emailed back and forth anyway, so there's probably no need to go there. So I see that many of our clients still prioritize email. I know that in Slack fewer documents are shared, for example, but there is more discussion in there, of course. From a priority point of view, I see that people use email and Slack a lot. I don't know if that's your experience as well, Andrew?

It's definitely increasing. We're seeing email and Slack, which also brings up something I failed to mention earlier: the troubling matter of modern attachments, and how we can collect them. Onna is incredibly good at modern attachments. It has extensive connectivity, as mentioned on the slide, to collect those modern attachments directly from Google Workspace, Google Drive, or Microsoft SharePoint, depending on where you're linking them from. So that's another massive bonus for Onna.

Yeah. Can you explain that? Maybe not everybody is familiar with the term modern attachments. I know that if I want to share a document from my OneDrive and send it to you, Andrew, it will send you a link instead of the actual document, though I can choose what I want to send; normally I send you a link. That's what you mean by modern attachments, right? The links that you share to your documents, so the documents stay in their original location and you simply send out links; that's basically what we call modern attachments in the industry.

Absolutely. From your perspective as an email user, you'll see no difference. You'll see Daniel email you an email that has an attachment.
Now, that attachment doesn't actually exist; it's just a link to a path on SharePoint. Onna can follow that path, collect the single file, and attach it to the email. So from a review point of view, you will see an email and an attachment, not an email and a link.

Maybe we should also do a separate webinar on this one later on, Andrew, and show this off as well. Yeah. And I think, specifically for Logikcull: if you have to do a quick review of documents and you only have a small amount of documents to collect, or the collection is really targeted for your case at that moment, I think Logikcull is a great solution for that. There's no need then to go into Onna; Logikcull also has a comprehensive set of collection options for you to use, but indeed more focused on targeted collections. So I think this ends our presentation; it's time for the Q&A. As mentioned before, on the right-hand side of your screen you see a Q&A section where you can ask your questions. Let's start with the questions and see what comes in. Okay, so yes, we have a few questions, Andrew. The first question is: what cloud data sources can I currently collect from? I think you showed a few in Onna.

Absolutely, I showed a few within the demonstration, but we connect to a huge variety of data sources. We connect to all of Microsoft 365 within Onna, and to the different Google applications: mail, chats, as well as Workspace. We offer connections into the Atlassian products, so Jira, Confluence, etc., and Zoom. We also have a web-based crawler, so you can select different web pages and get a snapshot of what they are at that point in time. So we have a wide variety of cloud connectors that we can collect data from.

And if a data source is not available on the list, how would that work? Great question.
There are two options here. First of all, Onna can collect from a folder: you can upload data into that folder, it will then be built into your index, and you can search across it the same way you would search across other data sources. The other option is our full platform API development suite: you can utilize our API library to develop your own custom connections into your Onna repository. So we can give you the tools to build out connections, and you can collect from sources that aren't on the list.

Yeah, and maybe it's also good to add that we support a few different import formats, not just plain files. You can also use the default industry formats with load files. Load files are basically a kind of database file, with comma-separated values or Excel sheets that contain the metadata of your documents, and we support the default industry load files as well. So if you have an export with metadata and files, we can easily import those too. And if people wonder about chat collection from mobile devices: typically, the tools that capture data from mobile devices, like one of our partners, ModeOne, produce exports that can seamlessly be imported into our tools. I don't see any other questions popping up. Thanks, Andrew, for your time and your demo again. We'll hopefully see you in the next webinars that we will be organizing. Thanks, everybody. Thank you for joining.