Video: A Cost-Effective Approach to Managing DSARs | Duration: 2708s | Summary: A Cost-Effective Approach to Managing DSARs | Chapters: Introducing DSAR Management (11.04s), DSAR Challenges Explored (104.425s), DSAR Process Management (191.145s), DSAR Process Overview (331.57s), Data Request Process (499.29s), Data Processing Pipeline (829.51s), Search and Filter (1079.46s), Filtering and Redaction (1389.545s), Finalizing DSAR Production (1831.955s), Q&A and Conclusion (2316.69s)
Transcript for "A Cost-Effective Approach to Managing DSARs":
Hello everyone, and welcome to this webinar about a cost-effective approach to managing DSARs, data subject access requests. My name is Daniel Skuring. I'm part of the product management team at Reveal and also responsible for Logical. I'm here today together with Andrew. Andrew, can you do a brief introduction?
Sure. Thanks, Daniel. Hi, everyone. Andrew Panzer, senior sales engineer at Reveal. I'll be talking you through the platform, where we can help with DSAR matters, and how we can help streamline this and make it as cost-effective as possible.
Yes. And for this, we will be using one of our products, Logical, for the demonstration. Logical is our secure cloud-based eDiscovery and records request platform, designed to make data tasks like DSARs, and also public records requests, fast, defensible, and easier to manage. It's a very simple solution, with automated indexing and powerful search capabilities that allow legal and HR teams to quickly locate, review, redact (a very important part of the DSAR process), and produce the information they need without relying on IT. There's a lot of information about DSARs available, and also a lot of numbers in the market. If we look at the importance of DSARs, we see from the numbers that large companies in the UK may receive up to 28 requests per month, which I think is quite a lot. It can definitely be a big drain on the resources of, most of the time, the HR and legal teams. And those DSARs will not always go as smoothly as people want. The people receiving the data also have a lot of complaints, because not all the data is there or the data is not delivered in the way they expected. So we also see a big increase in complaints to the Information Commissioner's Office in the UK. A few years ago, I think, they received almost 15,000 complaints about the DSAR process.
So that's definitely quite a lot. And, Andrew, I know that you have worked in the past on handling and managing DSARs, right? Can you tell us a little bit more about that?
Yeah, absolutely. Thanks, Daniel. So before my life at Reveal, I was working as part of a consultancy, managing DSARs for a large financial corporation, which gave me a huge amount of understanding of what helps the process, what really hinders it, how efficiencies can be found, and also how to manage the panic, especially if this is one of your first requests, or if they're starting to come up more regularly and becoming a real financial or time burden on your more senior members of staff. How do you get HR back to doing HR things and legal teams focusing on genuine legal issues, rather than being sucked into what is essentially a huge drain on a company's time? That applies especially to mid-sized companies moving to enterprise level, which may not be set up for regular requests, especially internal requests from employees. Now, another key point: just because this is a request for personal data doesn't mean it won't turn into something more in the future. So planning from the get-go, from the date of the request, is hugely important. Does the data subject potentially want to take further legal action? Maybe that's an employee, an ex-employee, or a disgruntled employee with a dispute over pay or termination or something like that. Does legal need to be involved much earlier on? And as Daniel mentioned, Logical is a very secure eDiscovery solution. So not only can it handle DSARs, it can also handle employment litigation and employment tribunals, the kind of matters that something starting as a DSAR can very easily become. So, absolutely, experience is where I'm coming from, and I'm happy to share my key tips.
Yes.
And one thing I want to clarify, because it's not always clear to people: today we will talk about DSARs that basically result from disputes with employees or former employees. I know you can also send a DSAR to Facebook or Google or any other large IT vendor or social platform to get an overview of the data they hold about you. Those are largely automated processes, because that data is all stored in structured databases. Which pages did I visit? What posts did I make? What information do you have about me? It's all stored in database records. But the use case we are discussing today is more about unstructured data: what was said in certain emails, in chat messages, and in other sources of communication. And I think that's a challenge compared to handling information that is available in structured data sets like standard databases. Also, we have already been doing DSARs for, I think, eight years or so, and they're definitely not going away. So I think it's really important for organizations to be ready for this. I also see differences among our clients. Some have frequent small DSARs, with just a few documents each. But I've also been involved in cases where the DSARs were really large and involved tens of thousands of documents, which is quite a lot to manage, especially with all the deadlines that come with DSARs. Andrew, I would like to see a little bit of what we can do with Logical. Andrew has prepared a demonstration in a few parts: identifying and uploading the information required for the DSAR, the searching and filtering that we do, the review and redaction, and finally how you get the data out of the Logical system to send it to the requester.
Absolutely.
So we're gonna start off with what a request could look like. Here we're gonna follow the story of Mr. Pritchett, who used to work at Globex, a large corporation, and who is requesting his own personal data from his management team. In the second paragraph we can also see the kind of documents he's after: communications. And this request has been targeted around his promotion and performance. So immediately, I could have alarm bells ringing that this could lead to future issues. We're also going to identify certain members of his management team, Mr. Burris, Ms. Acres, Ms. Capula, and Mr. Montague, who could also be involved, and that's the data we will have to investigate. And, again, the request includes more provisions, i.e., where he would like data from: Slack internal messaging, emails, any performance reviews, documents, et cetera, and internal notes. So, again, it's a broad spectrum. And if we were to give this to an IT team, it could take a huge amount of time to come back. Now, one of the things Daniel mentioned at the start is the time constraint: thirty days, according to the ICO, to respond. This webinar will not go into the legal elements of what is or is not an appropriate request. Those are things that need to be discussed internally or with your own counsel. Is this request proportionate? Is it appropriate? Do we have the right amount of time? And, therefore, might any other legal action be needed, or discussions with the ICO about an extension? Now, again, we're not gonna be talking too much about that process.
We'll be talking about the actual response, and how we can help get what Mr. Pritchett has requested back within the time frame, but also in a sanitized way, so that business, trade, or privileged material, or anyone else's PII, is not unduly disclosed and there are no data leaks in that sense.
Cool. So what we're gonna do is use the Logical tool to go through a project. You'll see that where I am right now is within a specific project for this Globex matter. Each matter or each potential request should be handled as its own project, to make sure the data is all contained within the four walls of your Logical platform. As Daniel mentioned, we're gonna go through uploads. We are now in the uploads module. This is all about getting data into the tool and how easy that is. We're gonna conduct searching, and then finally, we're gonna get data out. As you can see from the tool itself, I've already done some of the initial searching. So we can see I have Ariana's. I have Mr. Burris's emails already provided. I have some of the sample notes and internal files that we may have, and some of the Slack data already collected via Logical's connections.
Yeah. So each section you see on the screen is basically an upload of data to the system, right? You don't have to collect everything in one go. You can upload just the parts you find you need and add those individually to the Logical system.
Yeah, absolutely. And the beauty of this is that I can very easily manage it as part of the investigation team. I don't necessarily have to go to my IT team. If I have already set up some cloud connectors, I can collect directly from the data source. This helps with the more challenging datasets, for example Slack messages or Teams messages, where I can proportionately pull exactly the people I need, like I'm going to do now.
So, for example, we haven't already collected Ms. Acres' Slack data. So I can connect to Slack. I can specify my person, or participant, within the Slack area. I can select what part of their Slack universe I want, whether that's only direct messages, public channels, or private channels, and I can even exclude attachments if I wanted to. I'm gonna continue with everything. I just give it a name and a start and end date range, and then away I go.
What you've done now is just create a simple selection of somebody's Slack data, right? That you are importing into the Logical system.
Absolutely. And what we're doing now is a really cool process. We are extracting the data from that system, and we are then going to process that data. So for emails, files, PDFs, Excels, Word documents, those kinds of things, we'll extract all the text and the metadata associated with those documents. For emails, we'll run threading, to identify the last email in each thread. So, for example, if Daniel emails me, I reply, Daniel replies back, and I forward this on and include my boss, that thread can become rather large. But rather than reviewing each of those individual messages with the other ones below, I might only want to review the last email, because that will still include all the other communications below it, making my review faster and more efficient. We'll also run things such as deduplication, so you don't have to review the same email or family of documents more than once. You can review efficiently, quickly, and without any duplicated effort. And finally, we'll index all those extracted items, so you can run reliable search terms against your documents and identify the relevant ones quickly and efficiently.
And we support this process for various formats, right?
So if you look at chat, because chat is becoming more and more important, we support Slack and Teams data, but also manual imports from certain systems. Besides, of course, the traditional, yeah, look at me saying traditional email. But Microsoft 365, Outlook, Gmail, and all those sources can be imported directly from the source systems, right?
Yeah, absolutely. And with Logical's simple drag and drop, if I didn't want to connect my Logical tool to my data sources, or I was working on behalf of a client, all I'd have to do is drag and drop that exported data into Logical, and Logical will continue with the same processing that we have seen for the Slack data or for other people's data. Daniel, any other points on the upload area or getting data in?
No, I don't think so. The complexity here is always: where is the data located? And, of course, people can communicate in various ways, but by covering at least the chat data and the email data, we already cover a lot of communication channels with Logical. Of course, live conversations we don't have; those are not recorded. But everything that's written down, or where there's digital evidence somewhere, we can store in Logical.
Absolutely. Cool. So once all the data has been collected, I now need to actually conduct the meat and veg of my work. Now, when I jump into the system, we can see on the far left-hand side, in the uploads area, that all of those uploads are available and easy to filter on. So if I wanted to focus solely on one part of my data, I can do that based on the upload. Now, in terms of orientation within the search module, at the very top I have my search bar. This search bar gives me access to simple keywords, or, as you might be able to see in the text there, I can even write a search syntax.
Now, sometimes that's not massively easy. So Logical has a very simple search builder where I can use all the metadata that's been extracted to filter down to my specific documents. So, for example, if I'm after a specific email subject, or email from, to, cc, anything like that, I can type that in here and create different conditions. Now, typically, for a DSAR, we see that the main searching and culling is all done via the bulk keyword search. This is where I can identify specific keywords. For this one, we have obviously gone for Steve Pritchett, who is the data subject, so that is now something I'm going to be focusing on. So, for example, I can do Steve and Pritchett. And I can also do continued ones as well. This helpful search syntax pop-up is gonna help you if you have any queries or want to make your searches, not more expansive, but more restrictive. This can include proximity searches, or wildcards if you didn't know the correct spelling. For example, if it was Steve or Steven, using a wildcard would be incredibly useful. Now, I also want to identify where Mr. Pritchett is mentioned alongside promotion. So I'm also gonna add two other searches: one that says promotion, and another that says pay rise. This is gonna help me really target down and see where my documents are hitting on these specific keywords. And what I can do here is run a test search. This is gonna tell me how many documents I need to review for this area. Now, with the request being quite broad, my personal data and performance-related items, there's promotion, pay rise, and performance may be another keyword I want to add. We can do a lot of trial and error to find the dataset that actually needs reviewing. And we can go back and forth on this.
Yeah.
I think it's very interesting to see here that by just adding one additional keyword, promotion, you already go down from 1,200 documents to 20 documents. And that's key to really diving into the truly relevant documents instead of the noise that we have also imported into our system.
Absolutely. And the key thing about doing this at this stage is that every single document has been indexed. That includes attachments to emails and all the files that have been brought in. If these searches were run pre-collection, sometimes not every single field or file is indexed for these terms. Here, you know you're searching over everything, and therefore your business risk is much, much smaller.
Yeah. And you focus here on finding information that is relevant to the request, so about promotion and pay rise. But I also know that a lot of our clients have created special queries to filter out data that is irrelevant by default. If you collect somebody's mailbox, there are definitely lots of emails in there that are not relevant, like newsletters, or even personal data. There are still people who use their work email for personal communication, emailing their spouse and so on. You can also quickly filter that out of the system. We call that culling: we set it apart so it will not interfere with the other investigative searches that we are doing. So searching is not just for finding relevant data, but also very useful for excluding data that is not relevant to the case.
Yeah, that's a really, really good point. And that's part of Logical's name. In the cull area, I can remove or cull that data from my searchable set, focusing only on the relevant or potentially relevant data. So I've come to this specific keyword. I could search everything if I wanted to, but I want to focus in on those 20 documents.
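As a rough illustration of the keyword narrowing and culling described above, here is a minimal Python sketch. It is not Logical's implementation: the `Doc` class, the `search` and `cull` functions, the plain substring matching, and the sample texts are all hypothetical and exist only to show why adding a term shrinks the review set and why culled items drop out of later searches.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for an indexed document."""
    doc_id: int
    text: str
    culled: bool = False


def search(docs, terms):
    """Return non-culled docs containing ALL terms (case-insensitive substring match)."""
    wanted = [t.lower() for t in terms]
    return [d for d in docs
            if not d.culled and all(t in d.text.lower() for t in wanted)]


def cull(docs, terms):
    """Mark docs containing ANY term as culled; return how many were newly culled."""
    wanted = [t.lower() for t in terms]
    count = 0
    for d in docs:
        if not d.culled and any(t in d.text.lower() for t in wanted):
            d.culled = True
            count += 1
    return count


# Invented sample data, loosely following the demo scenario.
docs = [
    Doc(1, "Steve Pritchett is being considered for promotion this quarter."),
    Doc(2, "Lunch order for Friday, please ask Steve Pritchett."),
    Doc(3, "Weekly newsletter: Steve Pritchett wins the raffle. Unsubscribe here."),
    Doc(4, "Promotion on printer paper, 20% off."),
]

broad = search(docs, ["Steve", "Pritchett"])                 # name only
narrow = search(docs, ["Steve", "Pritchett", "promotion"])   # name AND topic
culled_count = cull(docs, ["newsletter", "unsubscribe"])     # set noise apart
after_cull = search(docs, ["Steve", "Pritchett"])            # noise no longer returned
```

In this toy data the name-only search returns three documents, adding "promotion" leaves one, and culling the newsletter removes it from every later search without deleting it from `docs`.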
Now, if I really wanted to focus in a lot more, I can use our filter carousel and our filter area to search specifically over some of those areas. I can see I now have a mixture of different emails and different Slack conversations that I can really focus in on. Now if I wanted to...
Sorry.
Oh, go ahead.
Yeah, just one remark, because I think this is one of the very important filter items that we have in place. Especially with DSARs, there's often a request for communication between two people, or maybe three. And here in Logical, you can just click the email person, email from and email to, and you have zoomed in on all the communication between these two people. And the same goes for chats. So I think that's one of our really strong points, that we can very quickly zoom in on the communication between two people.
Yeah, absolutely. So, for example, I can see I've got one email from Ms. Acres here, and I can filter down to that one email. If I come into that specific document, I can see who this email is coming from and who it's going to. Now, if I read this email, I can see that it actually doesn't relate very much to Mr. Pritchett, despite his PII being here along with the term promotion. So whilst I can search for keywords and make them really specific, sometimes we are going to find irrelevant data, and this is equally important, because I still need to tag this document as not responsive. That way I know, a, this document has been looked at and reviewed, and b, it does not need to be disclosed to the data subject, so I don't need to waste any more time reviewing it. However, if this document were relevant, I would need to disclose it. And we can see that there is a lot of other PII.
There's a huge number of names and email addresses all throughout this document, and these squiggly lines are where we're identifying that PII. So if I hover over them, I can see, for example, that Bridget Andrews' name is there, and her email address. And this is one of the beauties of Logical: not only are we gonna make your documents searchable, we're gonna flag where potential PII lives within the document. And within three clicks from this PII area, I can redact all of those mentions of PII, making this document sanitized and potentially ready to disclose. I can continue to redact for legal privilege or trade secrets, et cetera. And we have clients who have said this reduces their redaction review by about 80%. They're going from about five days down to one, which is a huge amount of time within that thirty-day response window.
Yeah. And I think this is really one of the key features within Logical for records requests. We see this not only for DSARs, but also for freedom of information requests, where redaction is also a key element of the process to protect personally identifiable information. And there are actually quite a lot of items that we support for automatic identification. So not only names and email addresses, but also personal identification numbers, phone numbers, credit card numbers: quite a long list of data types that are supported for automatic detection and then auto-redaction.
Absolutely, Daniel. And, yeah, it's a huge time saving for our clients. Now, we've just seen that this document is, unfortunately, not responsive, or fortunately, depending on your frame of mind. So I'm going to take away that filter on the email address. And now I want to start looking into the Slack conversations.
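To make the "flag, then redact" idea from the PII discussion above concrete, here is a small Python sketch under stated assumptions. It covers only two pattern-based PII types (email addresses and phone numbers) using simplified regular expressions; detecting names, as the demo does, would need something like named-entity recognition and is out of scope here. The patterns, function names, and sample sentence (including the email address and phone number) are invented for illustration and say nothing about how Logical detects PII internally.

```python
import re

# Simplified, illustrative patterns; real-world detection needs far more care.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def find_pii(text):
    """Return (start, end, kind) for every pattern hit, sorted by position."""
    hits = []
    for kind, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), match.end(), kind))
    return sorted(hits)


def redact(text, hits):
    """Replace each hit with a label, working right to left so offsets stay valid."""
    redacted = text
    for start, end, kind in sorted(hits, reverse=True):
        redacted = redacted[:start] + f"[REDACTED {kind.upper()}]" + redacted[end:]
    return redacted


# Invented sample sentence; the address and number are placeholders.
sample = "Contact Bridget at bridget.andrews@globex.example or +44 20 7946 0958."
hits = find_pii(sample)          # the "squiggly line" step: flag candidates
redacted = redact(sample, hits)  # the bulk-redaction step
```

Keeping detection (`find_pii`) separate from redaction (`redact`) mirrors the review step in the demo: a reviewer can inspect the flagged spans before anything is blacked out.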
Now, we know that Ms. Capula and Mr. Montague were two of the people mentioned in the request itself. So, again, using Logical's easy filters under the participant, I can see those people and identify their Slacks. Now, within the Slack import, we break the chain up into twenty-four-hour chunks so you can documentize it. And you can see that within the file name here. So I can see the exact date of the messages, what kind of message it is, whether it's a direct message or a channel, and who the people involved are. Now, if I jump into the document itself, you will see how it has come in. We can see individual messages. I can see reactions. I can see the timestamp of when each message was sent, and I can read it with context. Now, this document is clearly about Mr. Pritchett and his promotion effort, as well as other things throughout. It is clearly responsive to this request. But, again, I can use my PII redaction. I can redact anything or any messages that aren't relevant to this request, just by using either the draw-on redactions or the PII redaction module we just saw. Daniel, any points from yourself on this part?
No, I think you covered it quite well. I think it's important to show what we discussed earlier: really zooming in on the communication between two people can make a huge difference in finding the relevant data fast. And then, especially, the auto-redaction option of redacting the detected PII automatically is really a huge time saver.
Absolutely. So once I've identified all my documents, I've reviewed them, and I've tagged them, I now need to prepare them for production. And it's very easy to identify those documents just by using our tag filter.
So the ones that were marked as responsive all come back under the responsive area, and I can save this search, ready for it to be produced. So this save is gonna help... oh, Daniel, go ahead.
Yeah. You were talking about production, and I know it's kind of a legal term. So just for understanding: when we talk about production within eDiscovery, we basically mean that we are going to produce the documents in a certain format and make them ready for export, so that you can hand over the data in the format you want and it's easy to read for the recipient.
Absolutely. So saving that search means I can move into my downloads area. And here we have a very simple wizard. I'm gonna flick through it. Again, this is my search, ready to review and also ready to produce. I can templatize this. I can include family members. I can include a load file, which is not typical within a DSAR because I just want to produce the documents. That is more for litigation or regulatory responses, which are, again, handled incredibly well within Logical. I want to produce image versions of the documents where necessary or appropriate, and I'm gonna say I want those in PDF. And I also want to include the redactions.
Yeah, sorry to interrupt, but it may be good to understand that we normalize all the documents into PDF. So whether it was a chat, an email, a Word document, or a PowerPoint, all documents will be normalized to PDF to make sure that you have one single format to distribute to the recipient.
Yeah, absolutely. And a really key point, actually, that I forgot to mention is this warning. This warning is flagging to me that there are potentially privileged documents within this set. Now, this means, for me: am I aware of that? Do I know that they've been appropriately redacted? Should they be there?
We're just gonna flag this for you from the get-go, so you don't get any scares when you're producing those documents. We want to be proactive at that step. And that also follows through to just before I'm ready to go. Like on Amazon or any other web shopping site, we're gonna show you what's in your basket, or what's in your production. My download summary gives me a breakdown of the tags I put on those documents, the number with redactions, how many have been tagged, how many have comments, and so on, so that you are aware of what is actually coming out of the system and going to the data subject. Now I can do one of two things. I can press create download, and this will create my download: a zip file ready to share with the data subject by whatever means you want. Now, sometimes that can be a problem, because the data subject may just be on a standard Gmail, Hotmail, or AOL account, if anyone's still using those, all of which have limits on how much data can arrive, or may have problems with links, spam filters, and things like that. Alternatively, you could use an FTP site or a secure SFTP site, which again involves administering passwords and the data subject forgetting the password. That's why we've come up with our secure share. Our secure share lets you email directly to someone, and here we have Steve's email address ready to go. I can just add a message, and they will receive an email with that message and a link into Logical. This link will enable them to download the package directly onto their own computer. So no more do you have to faff around with delivery systems. I can easily let them do it. And most importantly, you're gonna receive an email when they've accessed the file for the first time, meaning you now have an auditable trail to say: we completed our DSAR.
The DSAR was downloaded by the data subject within the time frame set by the ICO. You can store this with your compliance and privacy team, and you know that it is completely auditable. And now you can move to either archiving or closing down this case, pending any future action.
Yeah. And this is, of course, really simple to do, right? Like you mentioned, instead of emailing or putting data on an FTP site, you just send a link to the requester, and they can download the data whenever and wherever they want.
Absolutely. And that's it. That's the beauty of Logical. DSARs no longer have to be horrible. They can become part of your business process, be very easy to handle internally, and hopefully not make your life as painful as they currently do.
Yeah, thank you very much for the demo, Andrew. I think it was a very clear demo. So we started off with the letter, the DSAR. Then we created a project in Logical. We uploaded the data that was requested. We did some filtering to get to the data we really needed, because we collected the data a little bit broadly. We reviewed the data, very important. We also redacted the data. And once we were done with that, we sent out a link to the requester so that they can download their DSAR response. Sounds really simple to me.
Hopefully, it is. Yep.
So I know that we have quite a few clients that are leveraging Logical for this use case, besides other use cases like internal investigations, public records requests, and eDiscovery-related cases. But I think DSARs are typically a use case that is being adopted a lot by HR departments and legal teams within corporations, so that they don't have to outsource these kinds of services, which is also saving them a lot of time and money. So I would like to thank everybody for joining today, and I hope everybody enjoys the rest of their day. And thanks again, Andrew, for your demo.
Thank you, Daniel.
Pleasure to talk to you guys today.
Almost forgot that we have a few questions, Andrew. So maybe we can go over these.
Yes.
Let's start with the first one. I've got a question about the Slack and Teams connectors. The question is: when you're pulling data for a specific custodian, how does Logical handle group or channel messages where multiple people are involved? And do you end up collecting third-party data you didn't intend to?
Yeah. It's a really interesting question about how I can collect from a specific user. What we're gonna do is pull what that user can essentially see. If they're part of general channels or other channels, we will pull those channels in. Now, you also have the ability to limit or control what part of their Slack or Teams universe you bring in. You can choose to not have external channels. You can choose to not have private channels. You can choose to not have third-party channels. You can focus only on the specific DMs within the Slack universe. So it's all down to you, but we're giving you the controls to either collect or not collect.
Okay, thanks. I think that's very clear. Another question that I see is: does Logical deduplicate globally across all the custodians, or is it per custodian? How do we handle that?
Yeah. So you have the ability to do either. On the project setup, you can select which deduplication view you want to see, whether that's a global deduplication or a custodial deduplication, and you can also switch between them during the case.
So it's actually dynamic; you can set that as a user. You don't have to preset it when you do the collection or up front in your project settings. It's something you can change whenever you want. So during your review you can choose global deduplication or switch to custodian, as all duplicates are in the system as well.
Yep.
So we don't remove data from the dataset because there are duplicates.
Correct.
It's just limiting your view and your searchable review content.
Okay. Another question is about when a DSAR has the potential to become employment litigation down the road. At what point do you recommend separating the DSAR matter from the litigation matter in Logical, or do you keep them in the same project? I think the latter, right?
Yeah. I think a lot of this is down to how you want to deal with your matter. Logical gives you the options to do either. If I know it's going to involve, or potentially go into, an employment tribunal or any other legal action afterwards, I can be very mindful of this and invite my legal team early. We also have copy-to functions. So if I wanted my legal team to look at the disclosed material only, I can copy a subset, or the whole project, to a brand-new project if the legal team want their own workspace. So, again, we're giving you the flexibility to perform work how you want to do it, as opposed to prescribing a certain way. But my personal preference is that you all work within the same project that's been collected from your own data.
Yeah. Which will also save time processing data, and you only have one instance of the data indeed. And then one other question, the last one that I see: does Logical support automated PII detection for third-party data, like other employees' names or contact details, that need to be redacted before production? I think we partially covered that, right? So we can work with lists, but...
And, Daniel, as you're fully aware, being the person who's promoting and leading this, we are always adding new PII types as well that can be automatically detected and therefore redacted.
Yeah. And just to add: we, of course, do the automated name detection. So we detect the names and we can redact those.
And those are not specifically targeted at the subject of the DSAR, but can basically be any name. Besides that, we also include detection and redaction of contact details like phone numbers and email addresses, but also standard addresses, street names, city names, and also Social Security numbers and health numbers. That type of information is all part of the automated detection and redaction process. So, yeah, that's all in there. All right, I think that was the last question. So thanks again, everybody, for staying with us today, and I hope you enjoy the rest of your day. Thanks, Andrew, for all the answers that you gave and for your presentation.
Cheers. Bye.
Bye bye.
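The deduplication answer in the Q&A, where duplicates stay in the system and the global or custodian setting only changes what the reviewer sees, can be sketched in Python as follows. This is an assumption-laden illustration rather than Logical's actual mechanism: the `Item` class, the use of a SHA-256 hash of the content as the duplicate key, and the `dedup_view` function are all hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical collected item, owned by one custodian."""
    item_id: int
    custodian: str
    content: str


def content_hash(item):
    """Assumed duplicate key: a hash of the item's content."""
    return hashlib.sha256(item.content.encode("utf-8")).hexdigest()


def dedup_view(items, mode="global"):
    """Return a filtered VIEW of items; the underlying list is never modified.

    mode="global":    keep the first copy of each content across all custodians.
    mode="custodian": keep the first copy of each content per custodian.
    """
    if mode not in ("global", "custodian"):
        raise ValueError(f"unknown mode: {mode}")
    seen = set()
    view = []
    for item in items:
        digest = content_hash(item)
        key = digest if mode == "global" else (item.custodian, digest)
        if key not in seen:
            seen.add(key)
            view.append(item)
    return view


# Invented sample data using custodian names from the demo scenario.
items = [
    Item(1, "Burris", "Re: Pritchett promotion"),
    Item(2, "Acres", "Re: Pritchett promotion"),
    Item(3, "Acres", "Re: Pritchett promotion"),
    Item(4, "Acres", "Pay rise budget"),
]

global_view = dedup_view(items, "global")        # one copy across everyone
custodian_view = dedup_view(items, "custodian")  # one copy per custodian
```

Because `dedup_view` only builds a new list, switching modes mid-review is just a matter of calling it again with a different `mode`; nothing is removed from `items`.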