Rise of the Machines: Why Knowing Your Data is Vital

Machine learning is becoming increasingly mainstream. In fact, the number of enterprise machine learning pilots and deployments is forecast to double this year. Well-specified machine learning models can add considerable value and help companies make faster and more informed decisions – from profiling customer behaviour in order to offer customers the most suitable products, to market analysis and risk calculation. To really reap the benefits of machine learning, though, you need to know your data.

Konrad Semsch, Senior Risk Analytics Manager at Spotcap, discusses the benefits of data science and why it is so important to strike the right balance between man and machine.

Why is data science so important and how do you maximise the benefits?

At Spotcap, data science is a central part of our business: it feeds into our product, underwriting and IT teams, and into business development. We have built an internal R language library to optimise data science tasks and share knowledge across the company. Essentially, this tool set contains a package of functions which can be used to prepare and analyse any new data set. This optimises the results of our analysis and reduces the time to complete the task.

R is a programming language supported by the R Foundation for Statistical Computing. It is used by statisticians and data miners for developing statistical software and data analysis.

How does your algorithm continue to improve?

We regularly enhance our technology and retrain our models, which are based on algorithms such as XGBoost, Elastic Net and Random Forest, to respond to the latest data and market changes. In addition, we continuously increase the quality and quantity of the data that feeds into our models. Making efficient and robust assessments is crucial, so we keep on improving our risk attribution and scoring models. Although we are working toward more automated decisions, input from our skilled underwriters and data scientists will always be imperative to achieve the best results.

Do machines make better analysts than humans?

Machine learning algorithms can process vast amounts of data much more efficiently than humans. The introduction of high-performing algorithms has led to great advancements in machine learning. However, nothing has greater impact on the accuracy of your models than feature engineering and the quality of your data.

Algorithms need to be constantly refined, data has to be verified – all this requires human input. The better your understanding of your data, the more accurate and insightful your results. Even the most powerful machine learning algorithm will fail if applied to data with measurement error.

Feature engineering is the process of transforming raw data into features that better represent the underlying problem to the predictive models, resulting in improved model accuracy on unseen data.
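To make the definition above concrete, here is a minimal sketch in Python (chosen purely for illustration; Spotcap's own library is written in R). The sample transactions and the three feature names are hypothetical and are not taken from Spotcap's models. The sketch only shows the general idea of turning raw rows into features a model could use.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical raw data: (booking date, signed amount) pairs for one account.
transactions = [
    (date(2024, 1, 5), 1200.0),
    (date(2024, 1, 20), -800.0),
    (date(2024, 2, 3), 1500.0),
    (date(2024, 2, 25), -1700.0),
    (date(2024, 3, 10), 900.0),
]


def engineer_features(rows):
    """Aggregate raw transactions into a few illustrative account-level features."""
    # Net cash flow per calendar month.
    monthly_net = defaultdict(float)
    for booked_on, amount in rows:
        monthly_net[(booked_on.year, booked_on.month)] += amount
    nets = list(monthly_net.values())

    inflows = sum(amount for _, amount in rows if amount > 0)
    outflows = sum(-amount for _, amount in rows if amount < 0)

    return {
        # Average net cash flow across the observed months.
        "avg_monthly_net_flow": mean(nets),
        # Fraction of months in which more money left than arrived.
        "share_negative_months": sum(1 for n in nets if n < 0) / len(nets),
        # Total inflows relative to total outflows.
        "inflow_outflow_ratio": inflows / outflows if outflows else float("inf"),
    }


features = engineer_features(transactions)
```

The raw rows say little on their own, whereas the aggregated values describe behaviour over time, which is the kind of signal a predictive model can learn from.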

How do you strike the right balance between the man and the machine?

It’s all about automating the right parts of your analysis and remembering that human interaction is important at every stage of the model life cycle. At Spotcap, we combine advanced data analysis, machine learning and human analytical skills. Because we’re dealing with real people and real businesses, which are complex by nature, there will always be a role for the human mind to play. Our underwriters and data scientists continuously add new knowledge and risk drivers to our models to get even more precise results. Human expertise combined with advanced technology enables us to make accurate yet flexible credit decisions within one day.

What’s in it for the customer?

We make finance more accessible by focusing on the real-time performance of a business using up-to-the-minute data. In contrast, most traditional players have relied on historical credit data. Any red flag in this data, perhaps something in an outdated credit file or an old tax return, usually means that a credit application is turned down, effectively locking businesses out of the finance market. Those able to access finance often have to wait several weeks from application to approval before receiving the funds. We give customers a credit decision and access to funding within 24 hours.

This is possible thanks to our proprietary credit algorithm, which mainly uses numerical bank account data, among other inputs, and blends in different text mining techniques to learn from transaction descriptions. We think bank account transactional data is one of the strongest sources of predictive information for short-term lending and risk mitigation. Our algorithm enables us to make accurate yet swift credit decisions, effectively solving the two historical problems of access and speed and providing a better experience to our business customers.
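As a rough illustration of how numerical account data and transaction descriptions can be blended, the Python sketch below builds a simple bag-of-words representation and places it next to the transaction amount. This is not Spotcap's proprietary algorithm. The sample descriptions, the tokenisation rule and the function names are all assumptions made for the example, and bag-of-words is just one basic text mining technique among many.

```python
import re
from collections import Counter

# Hypothetical (description, signed amount) pairs from a bank statement.
transactions = [
    ("CARD PAYMENT Office Supplies Ltd", -120.50),
    ("Direct debit TAX OFFICE VAT Q1", -2300.00),
    ("Invoice 1042 payment received", 5400.00),
    ("Direct debit loan repayment", -650.00),
]


def tokenize(text):
    """Lower-case the text and keep alphabetic tokens of two or more letters."""
    return re.findall(r"[a-z]{2,}", text.lower())


def build_vocabulary(descriptions):
    """Collect every distinct token seen in the descriptions, in sorted order."""
    return sorted({token for text in descriptions for token in tokenize(text)})


def to_feature_row(description, amount, vocabulary):
    """Return [amount, count of vocabulary[0], count of vocabulary[1], ...]."""
    counts = Counter(tokenize(description))
    return [amount] + [counts[token] for token in vocabulary]


vocabulary = build_vocabulary(text for text, _ in transactions)
rows = [to_feature_row(text, amount, vocabulary) for text, amount in transactions]
```

Each row now holds the numeric amount followed by one count per vocabulary word, so a model can weigh a term such as "tax" or "loan" alongside the size of the transaction.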

Data quality is crucial, as any analytics is only as good as the data that feeds it. To maximise the benefits of technological progress, human input remains essential.

Interested in joining our team?

We are currently looking for a data scientist, based in Berlin. If you have what it takes, we want to hear from you. Send us your CV or get in touch to discuss your questions: careers@spotcap.com.