Watch this session to learn how Expert.ai and Reveal Group directly address four of the biggest challenges to adopting NLP.
Transcript:
Luca Scagliarini: Hello. Happy New Year. It’s the first NLP Stream episode of the new year. Today we’re going to talk about a topic that is, I think, highly relevant for many people. The potential of expanding the scope of application of RPA, adding the ability to understand complex documents through natural language processing, is very promising in terms of increasing the return on investment of automation. Today we have Josh Noble and Doug Merrill of the Reveal Group. They’re going to talk in a very pragmatic way about the key challenges and what needs to be considered to put these things in production and generate business value. Since their knowledge of this topic is much greater than mine, I’ll let them take over now and cover it for the next 25, 30 minutes. Thank you.
Josh Noble: Sounds good. Appreciate it. Well, so I guess to start, we’ll pass the buck a little on to… All the knowledge here is actually residing over with Doug. I’m just going to play the interim role here, but we’ll get things teed up and we’ll dialogue through it. I appreciate everybody tuning in today. Any livestream or webinar around NLP, document understanding, anything that gets into the cognitive space, has to start with the mandatory, obligatory stat: 80% of the data in the world is unstructured. And this is true. You’ll see it across quite a few industries out there. There was a UPMC study we cited in the video we put out through Reveal Group’s social media feeds a few weeks ago, which found that 80% of patient data in healthcare in the US is unstructured. So we see that trend a lot out there.
But I’d also say that, surprisingly, despite all of that unstructured data, the NLP market hasn’t taken hold historically as much as I would think it should have. I mean, we see NLP in our daily lives, with Siri and various other applications out there, but I’m putting this more in the business context of using NLP widely across our organizations. I was trying to get a bit of a pulse on where the NLP market stands compared to everything else, and I was reminded why I can never trust analyst market reports, because I was seeing estimates for the NLP market, I think for 2021, somewhere in the range of $4.2 billion to $8 billion. That’s a pretty big gap in the assessments out there. But one way or another, I would think it’d be larger than what we’re actually seeing in reality.
Of course, NLP is all over our social media feeds right now. Everybody and their mother is obsessed with GPT-3. So that’s got things on people’s minds, and I love the idea of GPT-3 stirring these ideas up. But interestingly enough, this livestream is probably the only thing in the NLP space this week that’s not actually about GPT-3 and the latest social media obsessions. In fact, this stream is more about why NLP tools like GPT-3, IBM Watson, AWS Comprehend, Google Cloud Natural Language, Azure LUIS, the list goes on, while well-used and abundant, are not as widespread within business organizations as we would expect.
The why on that, if you want the CliffsNotes cheat sheet, is already up on the screen for you, but we’re going to go into it in a bit more detail. If you’re thinking about using natural language processing or natural language understanding for a business case, or maybe you’re a VC fund getting pitched the latest buzzword soup from the latest AI startup out there, then you’ll probably see a lot of pitches for approaches to NLP that, historically, haven’t really taken off and solved the world’s problems once we get down into deeper business problems.
And the starting point is expectation setting. Historically, there are two camps that I’ve observed in the NLP space, and it comes down to the initial sales pitch. One is that NLP is only something you can get into and solve if you have data scientists. The more nouveau one is that NLP is quick and easy; we’ve seen how that’s played out in the RPA market. So the sales pitch nowadays tends to be something like, “Our next-generation AI hyper-unicorn runs on a SaaS cloud platform, operates auto-generated AI models, and doesn’t need any manual training.” But again, that doesn’t seem to play out too well. And in fact it goes a little further: “It’s included with your cloud subscription service, and you don’t need to have people manually connect any wires.”
I would say that the approach we’re actually seeing succeed with NLP in business operations flips that entire model on its head. It changes how we look at architecture. It changes the role of subject-matter experts. It changes whether we need thousands and thousands of documents to train a model, or whether we can do it with dozens or hundreds. It’s a very different approach that blends a bit of the nouveau AI angle with a more historic, manually-connect-the-wires kind of approach. I’m getting a message here. So let’s get into that in a little more detail. And I’m hoping those messages coming through mean you were actually still able to hear me through all of that, but we’ll pass it over to Doug to dig into the details a little bit more.
Doug Merrill: All righty. Thank you, Josh. So this brings us to-
Josh Noble: Cool. So everybody could hear me.
Doug Merrill: All right. Hopefully, you can all hear me.
Josh Noble: Yeah, you’re good.
Doug Merrill: So, yeah, that brings us now to what should really be considered when it comes to NLP tool selection and bringing that into an RPA program, or just in general. And first-
Josh Noble: Doug, real quick before we get into it, I’m currently seeing the slideshow mode. If you’re able to pull that up in full screen on there. That’s what I was getting messaged. But, Sue, sorry, I wasn’t the one running the slideware on there.
Doug Merrill: Okay, let me update that.
Josh Noble: You got one monitor in there, Doug, or the 15-
Doug Merrill: Yeah.
Josh Noble: … that I got going on in front of me?
Doug Merrill: All right. Let me just share the entire screen here. Is that better?
Josh Noble: Can’t see anything quite yet.
Luca Scagliarini: No. I’m-
Josh Noble: I may be-
Luca Scagliarini: It’s gone on. Yeah.
Josh Noble: Yes. That works. That works.
Doug Merrill: All right. Super. Yeah. So that brings us to our number one consideration by far when it comes to NLP tool selection: infrastructure requirements. A lot of times this is the biggest hurdle when it comes to onboarding an NLP tool. So the first question that’s always asked is, “What are the deployment options available for this NLP tool?” We often see SaaS solutions and even private cloud, but the on-premises version of an NLP tool is hard to come by. And even when it is available on-premises, it sometimes still requires GPUs, and that’s often a hard thing for companies to provision.
So it’s definitely good to have a conversation with your own IT department about what would be preferred when onboarding an NLP tool, so you can provide that requirement to the NLP tool vendor, because we see quite a mix in the NLP tool space around what’s available, with private cloud or SaaS solutions being the most common, while the most common need is often the on-premises solution. So there’s sometimes that gap to traverse, but we are seeing more and more tools that are available on-premises and don’t require GPUs.
The next consideration is around modeling preference. What I mean by that is understanding what the primary value driver is in your organization: do you want a tool that’s easy to use but a little more locked down, subject to whatever the vendor provides you with? Or do you want more flexibility and, ultimately, a more explainable model? When I say flexibility, that often means being able to adjust the confidence levels associated with each label and entity being extracted, which usually amounts to being able to trade off precision and recall. Some tools do not allow that, but others do. So it’s making sure you’re able to do that if that’s ultimately what you need.
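For a concrete picture of the confidence tuning Doug describes, here is a minimal sketch in Python. The extraction results and threshold values are invented for illustration, not taken from any particular vendor’s API; the point is only that raising the threshold trades recall for precision.

```python
# Hypothetical extraction results: (entity text, label, model confidence).
# In a real tool these would come back from the vendor's NLP API.
extractions = [
    ("ACME Corp",    "ORGANIZATION", 0.97),
    ("Jon Smth",     "PERSON",       0.62),  # likely an OCR typo
    ("P.O. 44-1902", "SALES_ORDER",  0.81),
    ("Springfield",  "LOCATION",     0.55),
]

def accept(extractions, threshold):
    """Keep only extractions the model is at least `threshold` confident in."""
    return [(text, label) for text, label, conf in extractions if conf >= threshold]

# A high threshold favors precision: fewer, surer answers.
print(accept(extractions, 0.90))  # [('ACME Corp', 'ORGANIZATION')]

# A low threshold favors recall: catch everything, accept more mistakes.
print(accept(extractions, 0.50))  # all four extractions
```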
The other part of this is the explainability aspect. Machine learning, as we know, is a bit of a black box. Not only does that mean we can’t tweak many of the parameters behind the scenes, but we also can’t understand the output as well: we’ll see the model produce a label or entity, but we don’t actually know how it made that decision. In certain use cases that explainability isn’t essential, but there are other use cases where, if you get a result, you need to know how it came to be, usually for regulatory reasons. So that’s definitely something to consider when you select a tool: do you prefer flexibility or ease of use?
Josh Noble: Doug, maybe I’ll pause you here, because you and the team wrote a great white paper, I think around last Thanksgiving-ish timeframe at this point? Almost a year ago. It was around “accuracy doesn’t matter.” We see all sorts of NLP vendors going out and saying, “We are the most accurate thing out there.” Again, great white paper, a fun read if you’re interested, but it’s that balance of precision and recall that you hit on there, the balance that lets you move between the two. For folks who are not way deep down indoctrinated in the NLP side of things, could you maybe explain the difference between precision and recall? I can hop in if you want.
Doug Merrill: Yeah, sure. So precision. Just having a super accurate model isn’t necessarily what you need. Sometimes you want to make sure that when you say, “A label should be applied here,” it’s 100% correct, and there’s no chance it’s being misapplied. That’s what we mean by precision: making sure that when we say something is X, it’s always X. And that’s really important for something like extracting account information, or identifying a sales order, where saying something’s a sales order triggers a process down the road. You don’t want to accidentally start a process that should never have been started because of a label that was assigned. So that’s a situation where you would prioritize precision.
Josh Noble: So for precision, one of the examples I like is anywhere a false alarm is more costly than an overlooked case. For example, you’re issuing loans. You do not want to issue loans to somebody who clearly can’t repay, and there’s a lower cost to the organization of missing a potential customer. So we want to be very, very sure before we issue a loan that the person has a pretty high probability of paying things back. In that case, we’d want precision to be very high. We want to be that type of accurate, I guess, if you want to put it that way. And then recall is the opposite approach, if you want to go through that.
Doug Merrill: Yeah. For recall, I think the best example is the spam folder in your inbox. You want to capture the whole universe of spam, as opposed to having something that’s always 100% correct when it identifies something as spam. You want to catch everything possible, because there’s a low chance, or at least a low risk, in misidentifying something as spam.
Josh Noble: So for example… well, I’ll take a different angle on it. There’s a really low cost to retaining, say, duplicate legal documents, and a really high cost to deleting the only legal record you have on file. Or in the case of spam, we’d rather not send as much to spam if there’s a risk that, say, an audit request comes through. We want to make sure that definitely doesn’t get missed and go to spam. So we keep a broader swath of options in the inbox versus pushing them out.
Doug Merrill: Yeah.
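To put numbers on the examples above, here is a minimal sketch of how the two metrics are computed, with made-up confusion counts for a spam filter.

```python
# Made-up confusion counts for a hypothetical spam filter.
true_positives  = 90   # spam correctly sent to the spam folder
false_positives = 10   # real mail wrongly sent to spam (e.g. an audit request)
false_negatives = 30   # spam that slipped into the inbox

precision = true_positives / (true_positives + false_positives)  # 90/100 = 0.90
recall    = true_positives / (true_positives + false_negatives)  # 90/120 = 0.75

# A loan issuer would tune for precision (false alarms are costly);
# a spam filter, per Doug's example, tunes for recall (catch all the spam).
print(f"precision={precision:.2f}, recall={recall:.2f}")
```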
Josh Noble: Cool. All right. Sorry. Didn’t want to interrupt you too much on that piece.
Doug Merrill: Oh, no. No worries. Yeah.
Josh Noble: Cool.
Doug Merrill: So the next consideration for NLP tool selection is the level of unstructuredness, if you will, of the unstructured data. I’ve got a bit of a continuum here for how we think about how complex unstructured data is in an organization, and that will often determine whether machine learning by itself is capable of handling that unstructured data, or whether other tools might be required in addition to machine learning. When we think about emails and free-text fields, those are generally not very complicated in terms of their content, nor are they very long-form. And if there aren’t too many labels or entities being extracted, then you don’t need much training data, and machine learning is going to be a viable option there.
However, as a use case increases in complexity, as more subject-matter expertise and more contextual understanding are required, that’s where machine learning starts to break down, because you’re simply not going to have enough training data to accommodate all of that complexity, where there are so many different ways to say the same thing. You’re going to need something that also has either an underlying knowledge model or a rules-based approach, something with more of a semantic understanding of what words even mean relative to one another, to extract those labels and entities with a high degree of precision and recall.
Josh Noble: So for an AI model, when we think of things that are broad topics, a macro survey of the political climate out there, maybe broad email classification use cases, those would be more on the AI side of things, because they’d be pretty common across everybody out there. Whereas if I’m using a lot of abbreviations within my organization, or there are five different ways we talk about the same medical term, and all of that needs to be mapped out in a way that’s really nuanced to how you produce widgets or do work, then we want more subject-matter experts in the mix mapping those things out, versus just trusting that the black-box AI model is going to know how our organization works and spit out the right result.
Doug Merrill: Yeah. And I would also add, when we think about, “Oh, if I were to use a hybrid NL approach, would that be sufficient for all these other use cases?”, the answer is yes, that would give you a one-size-fits-all platform to handle all of your use cases. Whereas machine learning just doesn’t work the other way: you reach a point where it can’t handle any more use cases beyond the very simple free-text fields and emails. So if you have those more complicated use cases and you want to stick to a single tool, you also have to consider something that isn’t just machine-learning reliant.
Josh Noble: A good example of that we’ve seen is where you might have ingested a document through an IDP solution. So we’ve digitized it, and now we need to actually understand the content within that document. The first approach may be, “Hey, we want to understand the names that are within this document, phone numbers, addresses, those kinds of broad topics.” So we could take an AI-based approach to get that first scrub of content out of the document. Then we have a whole lot of different documents, and we want to understand if there are conflicting reports, conflicting information across them. Maybe fraud’s being reported: this person said they broke their leg, but they were actually on a flight at that time on the other side of the country. When we get into that type of deal, we can start to apply the mix, going from the AI-based approach for the broad stuff into subject-matter expertise connecting the wires on those nuances, and do it all in the same platform.
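As a rough illustration of the hybrid idea, here is a minimal, entirely hypothetical sketch of layering organization-specific rules written by subject-matter experts on top of a generic statistical classifier. The `ml_classify` stub stands in for whatever ML model or vendor API you would actually call; the rules and labels are invented.

```python
import re

# Subject-matter-expert rules: organization-specific abbreviations and phrasings
# that a generic model trained on public text would never have seen.
SME_RULES = [
    (re.compile(r"\bPO[- ]?\d{2}-\d{4}\b"), "SALES_ORDER"),
    (re.compile(r"\b(fx|frac)\s+tibia\b", re.I), "LEG_FRACTURE"),  # clinical shorthand
]

def ml_classify(text):
    """Stand-in for a generic ML/NLP model that handles broad, common categories."""
    if "@" in text or "phone" in text.lower():
        return ("CONTACT_INFO", 0.85)
    return ("OTHER", 0.40)

def hybrid_classify(text):
    # Rules fire first: precise, explainable, written by people who know the domain.
    for pattern, label in SME_RULES:
        if pattern.search(text):
            return (label, 1.0)
    # Otherwise fall back to the statistical model for the broad, generic cases.
    return ml_classify(text)

print(hybrid_classify("Pt presents w/ fx tibia after fall"))  # ('LEG_FRACTURE', 1.0)
print(hybrid_classify("Call me: phone 555-0100"))             # ('CONTACT_INFO', 0.85)
```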
Doug Merrill: Yeah. Okay. And that brings us to our final consideration here, which is the one that the CFOs and the heads of automation really care about: the target payback and ROI for the NLP tool being picked out. By far, in our experience working with various clients, we talk about the payback period in terms of, “Okay, when do I actually break even, so that everything after this is just gravy?” And we won’t green-light a project if it doesn’t have that six-to-12-month payback window. If it’s too long, things can change, and you want to be able to realize that return quickly; but because all these tools have setup work, if you see anything shorter than six months, it’s a little bit suspicious.
So, yeah, it’s about being realistic in your expectations around payback, and also understanding your max, your budget for an NLP tool, because that’s hard to really think about unless you calculate it out. We wrote a white paper last year, quite a few pages, that explores how you actually calculate that benefit, ROI, and payback. And now we have a calculator which, as shown on this slide, takes a couple of different inputs, calculates all of that benefit, and ultimately figures out what your budget is for NLP tool spend. That will also help you go into conversations with potential NLP tool vendors with a little more of an idea of how much you can spend, because there are so many tools out there, like the GPT-3s and the Watsons, which are very expensive but, depending on what your organization needs, maybe aren’t as essential.
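For flavor, here is a back-of-the-envelope version of that kind of payback calculation. All of the input figures below are invented placeholders, not numbers from Reveal Group’s white paper or calculator.

```python
# Invented placeholder inputs -- substitute your own figures.
docs_per_year       = 120_000   # documents processed annually
minutes_per_doc     = 4         # manual handling time per document
loaded_hourly_cost  = 40.0      # fully loaded cost of the people doing it
automation_rate     = 0.70      # share of documents the NLP solution handles

annual_license_cost = 100_000.0  # NLP tool subscription
one_time_setup_cost = 80_000.0   # implementation, training, infrastructure

annual_gross_benefit = (docs_per_year * (minutes_per_doc / 60)
                        * loaded_hourly_cost * automation_rate)   # $224,000
annual_net_benefit   = annual_gross_benefit - annual_license_cost  # $124,000

# ~7.7 months here: inside the six-to-12-month window Doug describes.
payback_months = one_time_setup_cost / (annual_net_benefit / 12)
print(f"gross benefit: ${annual_gross_benefit:,.0f}/yr")
print(f"net benefit:   ${annual_net_benefit:,.0f}/yr")
print(f"payback:       {payback_months:.1f} months")
```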
Josh Noble: I think probably a critical input that’s not on the visual right here, it gets into it in the longer deep-dive analysis that I’m sure we can share around somehow, is the hardware costs. I mean, it was part number one that you mentioned as far as down-selecting to the right NLP tools. If it’s only SaaS-based, we’re locked into one sort of model. If it’s only on-premise and it’s highly GPU-intensive and we have to stack up tens of thousands of dollars’ worth of GPUs to be able to get through it, that’s going to wildly change your cost calculation.
Having an approach like the one Expert.ai takes, which we’re on right now, where you can have it on-premise, you can have it in a private cloud, you can have it as SaaS, you can put it in containers and just run the model, that flexibility is big for organizations to be able to be dynamic in how they go about it. Plus, not being GPU-intensive is probably one of the biggest ones out there; I don’t see a whole lot of NLP capabilities that are pretty low on the GPU totem pole. Not to say you don’t need a graphics card, you still need to be able to render things on a screen, but you don’t need the big boys. Cool.
Doug Merrill: Yeah. All right. That wraps it up for us in terms of what those considerations are: the infrastructure requirements, your modeling preference, the unstructured data inputs, and that target payback and ROI. So we’ll certainly open the floor to questions here, but I think this gives you enough to think about in terms of how you would approach looking at NLP tools going forward.
Josh Noble: Doug, while we go through those questions, I know the slide’s very similar, but could you toggle back to slide two? We’ll leave that up on the screen. That’s the one we weren’t showing when we were going through some things at the beginning; again, a summary of some of these topics as well. Cool. Yeah, there we go.
Luca Scagliarini: Yeah, I think one of the questions that’s naturally important to understand is the status on the customer side: how are customers, enterprises, or even midsize organizations aligning with these four points? I mean, do you find that there is an understanding? Do you find that people, I don’t know, are afraid and tend to be very conservative? What is the experience from the field? Because that’s what’s really interesting to understand.
Doug Merrill: Yeah, I can start answering that one. There’s definitely, as you said, a bit more conservatism around what the solution is and how it should exist in their organization. Specifically with the infrastructure burden, they want something that’s on-premises, because that unstructured data often contains very sensitive, confidential information. Private cloud is starting to become more and more of a conversation, just because there are some limitations on the on-premises side, but if an on-premises option is available and it doesn’t require GPUs, that’s definitely a huge preference for organizations.
With model flexibility, there’s more variation in that conversation. Some people want something that doesn’t have tons of bells and whistles, just mainly machine learning. Other organizations want something more explainable. And then with the complexity of the underlying data, that’s completely driven by their use cases, so again, a little more flexibility there. Sometimes there isn’t a ton of complexity in the underlying data, and machine learning is more than capable of handling it; but other times, the more complex the use case, and clients have already tested this out in many ways, the more they see that machine learning doesn’t work so well.
Josh Noble: I think we saw a trend, if I go back three or four years, where a lot of companies were trying to get into the NLP space taking very much that AI approach, and they’d been hitting their heads against the wall trying to solve nuanced problems within their organization, with nuanced language, and just thought, “Well, the standard way of going about this is grab a ton of documents, grab a ton of samples, throw that through the AI model, and that’s just how things are done.”
Over the past year we’ve seen a lot more organizations raise their hands and say, “We’ve been down that pathway and tried to solve it that way. What’s a different way to go about it?” And it’s been very exciting to get in there and solve problems they’ve looked at for the last three or four years and haven’t been able to tackle. So that one’s definitely been fun. Obviously, I think over the next year or so we’re going to see a lot more organizations thinking more and more about NLP, simply due to the social buzz around GPT-3. Now, again, you then have to get into the details of… because GPT-3 basically hits every one of these as far as-
Luca Scagliarini: Exactly, and that’s the elephant in the room, right? Now everybody talks about ChatGPT. And I saw that you published a video around how to use it with Expert.ai, but the two technologies are somehow put in the same bucket, if you want. What are the differences, and why can they work together? I think that would probably be useful. There is a lot of education to do on that, so if we can start today by saying something about it, it would be useful.
Josh Noble: So interestingly enough, they are polar opposites of each other. We’re talking about a cloud-based-only model and, in terms of flexibility, one that’s completely pre-trained. You can’t even keep it up to date yourself. You can teach it to be a little better, “we prefer these types of outputs versus those,” but you can’t train it on new abbreviations, for example. So it’s wildly different compared to the flexibility you have with something like Expert.ai. Where we’ve been using it is actually not ChatGPT; it’s been GPT-3, and we started doing this about a year ago. One of the biggest problems we had in spinning up proofs of technology, POCs and that whole area, was getting data sets to begin with.
When you’re first starting out, going to a business and saying, “Okay, dump all this sensitive data on us, and then we’ll prove that the base of this technology works,” is a pretty sticky situation. So what we do is generate a massive dataset using GPT-3 and use that for an initial model to show, “Hey, you’ve told us what this topic is about. We’ve generated a randomized sample of communications around it. Let’s show you, using that as a target set for Expert.ai, what this can potentially do.” Then as you move into, “Cool, we’ve got sign-off on that, we want to move forward and really see what it’s going to do with our data,” we drop the GPT-3 test set and get into their actual data set. But it’s a great way to remove the initial barriers and get started.
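A rough sketch of that bootstrapping trick, assuming the OpenAI Python client that was current in the GPT-3 era (the 0.x `openai.Completion` API, since superseded). The labels and prompt are invented for illustration, and this is one way to do it, not necessarily how Reveal Group does.

```python
import openai  # pip install openai (the 0.x client current when GPT-3 shipped)

openai.api_key = "YOUR_API_KEY"  # placeholder

LABELS = ["claim intake", "policy cancellation", "billing dispute"]  # invented

def synthesize_examples(label, n=20):
    """Generate n synthetic customer emails for one label to seed a POC dataset."""
    prompt = (
        f"Write {n} short, realistic customer-service emails about a "
        f"'{label}', one per line, without numbering."
    )
    resp = openai.Completion.create(
        model="text-davinci-003",  # a GPT-3-era completion model
        prompt=prompt,
        max_tokens=1500,
        temperature=0.9,           # high temperature for varied samples
    )
    lines = resp.choices[0].text.strip().splitlines()
    return [(line.strip(), label) for line in lines if line.strip()]

# Build a small labeled set with no client data involved, then swap in the
# client's real documents once the proof of concept is signed off.
dataset = [row for label in LABELS for row in synthesize_examples(label)]
```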
Luca Scagliarini: Yeah. At the core, GPT-3 is very useful for generation, the generation of content, in this case essentially synthetic data, while when you actually operate, you need to work on existing data, understand it, and really go in depth to understand relationships and so forth. So the two things can really be used together, for different tasks: one in the generation part, one in the analysis.
Now, one last question. We talked about hybrid, where hybrid is, at the end of the day, a combination of different techniques. The two extremes, if you want, are the rule-based versus the machine-learning-based. Are there, in your experience, situations in which one technique works better than the other, something that could be a guideline for practical implementation?
Josh Noble: Doug, I hit on that from one angle. Why don’t you wrap us up and bring it home by covering that for Luca?
Doug Merrill: Yeah. I would say the hybrid NL approach is best when your use case is very complex or a lot of domain expertise is required to understand what exactly the communication is. Whereas machine learning is very good for a very basic use case. I often think of emails and free-text fields, where the universe of possibilities is actually finite, as opposed to the medical realm or the legal realm, where the universe is much, much larger. So understanding the complexity of the use case will inform which model is best.
Luca Scagliarini: Yeah. I’ll add one point here, which is around the problem of data scarcity, right? Often, when you actually move into an organization, you find out that even if they’re dealing with a lot of data, when you go down and try to see how much of this data is actually available, and how much work needs to be done to make it available, you realize that often you don’t have enough data. And in other circumstances, the work required to prepare the data is so significant, and requires so much involvement from subject-matter experts, that it makes it very difficult to consider machine learning, independently of its performance. That’s really an element to be considered, let’s say.
Doug Merrill: Yeah, that’s an excellent point. I see it where, once you’ve built that model, let’s say you primarily did it rules-based, and you’ve deployed it to production and need to keep it up to date, that’s where machine learning is actually really powerful, continuing to incrementally improve it.
Luca Scagliarini: Yeah, yeah. Okay. I think that with this, actually, we’ve gone a few minutes past our end time. Any last words you would like to say to our audience?
Josh Noble: Appreciate you having us on, Luca. Again, we mentioned a couple of different assets as we were going through here, the ROI calculation white paper, the “accuracy doesn’t matter” piece. Maybe, Suzanne and team, we could bundle that up and get it out at the top of Reveal’s LinkedIn feed, so it’s easy to find it all in one place and share it around. But, yeah, we appreciate you hosting us here and look forward to doing more work together in 2023.
Luca Scagliarini: All right. Thank you. Thank you very much. We’ll continue this series of NLP Stream events at a regular cadence, and we will publish the schedule for the next one. Thank you very much for attending. Thank you, Doug, thank you, Josh, and bye-bye.
Doug Merrill: All right. Thank you, Luca. Bye.