A Data Scientist’s take on Process Improvement

Two terms that rarely stand together: data analytics and process improvement. I started my career in project management and process improvement (Lean), and in my time as a management consultant and data scientist, I have repeatedly found myself in the same peculiar situation. The standard scenario is that our client is trying to solve problems on a company level, but their departments are all working in silos. The need for cross-department collaboration is evident, yet sometimes it feels as if we spend more time trying to collaborate than solving the issue at hand. This is probably nothing new to the reader: breaking the silos and getting cross-department collaboration to function can prove to be quite the challenge. Today I want to shine a light on the wonderful world of Lean and process improvement, and show how the analytical skills of a data scientist can be a powerful complementary skillset.

“Process improvement involves the business practice of identifying, analyzing and improving existing business processes to optimize performance, meet best practice standards or simply improve quality and the user experience for customers and end-users.” – CIO, What is Process Improvement? [1]

How our clients usually work with process improvement

To keep this short, I will simplify how companies work with process improvement. Often you find a core group of employees who are tasked with analyzing the existing processes of a company, both business processes and internal ones, and then proposing improvements that reduce waste, save time, cut costs, increase revenue or increase customer and employee satisfaction. As the pace of digitalization increased, the lean teams started to see the need for new or upgraded IT tools, which required the companies’ IT departments to be involved. This is where we tend to see difficulties in collaboration between departments. On one side, we have the lean team who sees the need for new IT tools, and on the other side, we have an IT department that is measured on its cost savings and operational efficiency. Investments in new tools, for example Robotic Process Automation (RPA), really put the spotlight on the problem. In every RPA project I have seen, it has been a struggle to convince IT departments and business stakeholders that the best way to improve a process is to buy a completely new tool (increasing IT spend) and then “hope” that the long-term cost savings of the improved process will be greater than the increased IT cost. Here the reader might think, “Well, why don’t you just create a business case?” to which I reply, “Sure, but do you actually have the data to put together a feasible business case?” Here I suggest bringing in a data scientist who can support you by asking the right questions.

The role of a data scientist in process improvement

Do you have any data on the process? What does the data actually tell you? Is it reasonable to believe that the process improvement will have long-term effects, or can we improve the process in another way? Everyone is talking about becoming data-driven, and I would argue that looking at your own processes is a great starting point. I know that some of you think of artificial intelligence when I say data scientist, but most of the time we use our base skills of statistics, data gathering and data visualization. In situations where you do not have any data from your process, and instead rely on how the process “feels”, a data scientist can help you gather and structure the data you need to do a proper analysis. In situations where you know there is plenty of data, but you have a hard time compiling it into something useful, a data scientist can help structure and present the data for you. For this to work, you need a functioning collaboration between the business users who have the domain knowledge, the IT team who knows where the systems that generate the data reside, and the data scientist who is there to help you structure and make sense of it all. In the best case, you end up with a more rigorous analysis of your process, and a business case that is backed by real data and statistics. To cement my point even further, I will present a client case I worked on a couple of years ago.

Example: Rerouting of invoices

My client had a staff of approximately 300 employees who provided services to thousands of customers. Given the vast number of different services provided, the client’s organization was divided into several departments, and each department handled its own invoicing (incoming and outgoing). Because the company had not made clear to its vendors which department was responsible for which service, invoices sent to the company often ended up in the wrong department. This happened on a daily basis: the accounting assistants of a department would look at their incoming invoices and recognize that some belonged to another department and needed to be rerouted. The process could have ended here with the assistant simply forwarding the invoice to the correct department. However, company rules specified that erroneous invoices should be sent to and dealt with by the central economy department. This meant that for every incorrectly sent invoice, a second person had to look at the invoice, double-check that it should indeed be rerouted, and then reroute it to a third person in the correct department.

This overhead for the central economy assistants quickly became a problem, and the company saw a need to employ additional staff and to refocus existing economy assistants on rerouting invoices. This is where the company’s process improvement team stepped in and identified an opportunity for RPA. The company was going to use a software robot that would follow all the rules of invoice rerouting and reroute the invoices automatically. Here my team stepped in and told the client that a software robot taking over the work of the central economy assistants would free up their time, but it would not solve the root problem of incorrectly sent invoices. Additionally, the robot would only free up time for the central economy assistants, while the other departments would still have to forward invoices to the central economy department every time they discovered an incorrectly sent one.

I asked if they had data on how many invoices were handled this way every month, but they did not know any exact figures, only that it was enough to keep two assistants occupied a couple of hours per day. I continued to ask about the invoices: who was sending them, which departments received them, and where were they later rerouted? There was no data on this, so we decided that, to solve the problem long-term, we would work together with the IT team to implement the RPA robot and have it collect data about the invoices in a local database. We gathered information such as who sent the invoice, where it was sent, where it was rerouted to, and on what date it was handled. Two weeks after launch, we had already gathered a decent amount of data on the invoices, and we could quickly identify the culprits in the process.
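To make the data collection more concrete, here is a minimal sketch of what such a log could look like, assuming a small SQLite database. The table name, column names and the sample row (including the date) are hypothetical and only mirror the four pieces of information listed above; this is not the schema used in the actual project.

```python
import sqlite3


def create_log(conn):
    """Create a table with one row per rerouted invoice (hypothetical schema)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS rerouted_invoices (
            id          INTEGER PRIMARY KEY,
            vendor      TEXT NOT NULL,  -- who sent the invoice
            received_by TEXT NOT NULL,  -- department it was sent to
            rerouted_to TEXT NOT NULL,  -- department it was rerouted to
            handled_on  TEXT NOT NULL   -- ISO date the invoice was handled
        )
        """
    )


def log_reroute(conn, vendor, received_by, rerouted_to, handled_on):
    """Store one rerouting event; the robot would call this per invoice."""
    conn.execute(
        "INSERT INTO rerouted_invoices "
        "(vendor, received_by, rerouted_to, handled_on) VALUES (?, ?, ?, ?)",
        (vendor, received_by, rerouted_to, handled_on),
    )
    conn.commit()


# Illustration with an in-memory database and a made-up invoice.
conn = sqlite3.connect(":memory:")
create_log(conn)
log_reroute(conn, "Vendor A", "Plumbing & Ventilation", "Electricity", "2020-01-15")
rows = conn.execute(
    "SELECT vendor, received_by, rerouted_to, handled_on FROM rerouted_invoices"
).fetchall()
```

Keeping one row per rerouting event, rather than a running total, is what makes the later questions (which vendor, which department, when) answerable from the same table.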

Figure 1: Number of incorrectly sent invoices per vendor

As we can see in Figure 1, certain vendors showed up more frequently than others, pointing towards a systematic error on the vendor side. With the newly acquired data, we could assist the economy departments with more targeted measures. Our client could now contact specific vendors, have a fact-based discussion about why their invoices were sent incorrectly, and work together with them to reduce the numbers.
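The kind of count behind a chart like Figure 1 can be sketched in a few lines of Python. The records and vendor names below are made up for illustration and are not the client's data; the field name `vendor` is likewise an assumption.

```python
from collections import Counter


def invoices_per_vendor(records):
    """Return (vendor, count) pairs, most frequent vendor first."""
    counts = Counter(record["vendor"] for record in records)
    return counts.most_common()


# Hypothetical rerouting records, one dict per incorrectly sent invoice.
sample = [
    {"vendor": "Vendor A"},
    {"vendor": "Vendor B"},
    {"vendor": "Vendor A"},
    {"vendor": "Vendor C"},
    {"vendor": "Vendor A"},
    {"vendor": "Vendor B"},
]

ranking = invoices_per_vendor(sample)
for vendor, count in ranking:
    print(f"{vendor}: {count}")
```

Sorting by frequency puts the vendors worth calling first at the top of the list.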

Figure 2: A network graph showing invoice routing between departments

Our data could also be used to identify other issues in the invoicing process. By plotting a network graph of the affected departments, as seen in Figure 2, we discovered an almost cluster-like dependency between departments. What this image told us was that the invoices were not ending up “randomly” at the departments; instead, certain departments were more closely connected than others. One example was that the plumbing & ventilation department often saw invoices related to electricity services, while the healthcare department never received one from there. In the graph, the node “E” represented the central administrative department, which had connections to almost all departments. The hypothesis was that whenever a vendor was in doubt, they sent the invoice to the central department. After we finished the project, the client had the tools to analyze rerouted invoices, and could continue working on reducing their numbers.
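The data behind a network graph like Figure 2 boils down to counting how often invoices move between each pair of departments. Below is a minimal sketch using only the Python standard library; the records are invented for illustration, and the field names `received_by` and `rerouted_to` are assumptions rather than the project's actual column names.

```python
from collections import Counter, defaultdict


def routing_edges(records):
    """Count invoices per (received_by, rerouted_to) pair: the graph's weighted edges."""
    return Counter((r["received_by"], r["rerouted_to"]) for r in records)


def neighbours(edges):
    """Map each department to the set of departments it exchanges invoices with."""
    adjacency = defaultdict(set)
    for source, target in edges:
        adjacency[source].add(target)
        adjacency[target].add(source)
    return dict(adjacency)


# Hypothetical rerouting records, not the client's data.
sample = [
    {"received_by": "Plumbing & Ventilation", "rerouted_to": "Electricity"},
    {"received_by": "Plumbing & Ventilation", "rerouted_to": "Electricity"},
    {"received_by": "Central Administration", "rerouted_to": "Healthcare"},
    {"received_by": "Central Administration", "rerouted_to": "Electricity"},
]

edges = routing_edges(sample)
adjacency = neighbours(edges)
```

The edge counts can be passed on to any graph plotting tool as edge weights, and the adjacency sets give a quick way to check which departments are never connected at all.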

This project was a successful example where we managed to bring IT, process improvement and the business team together to address the issue of incorrectly sent invoices. If you would like to know more about how data analytics and process improvement go hand in hand, reach out to me and my team at Sopra Steria.



[1] https://www.cio.com/article/3433946/what-is-process-improvement-a-business-methodology-for-efficiency-and-productivity.html
