Leveraging Technology for Compliance

Abhilash Gedela
6 min read · Apr 24, 2020

As a tech guy, almost a year at a reg-tech startup (DXCompliance) made me feel like writing about the technology involved, or that can be involved, in this sector. This is my first blog on Medium; I hope to write more going along.


Innovation and technology are enablers for powerful and effective startups. With this as the driving force, we always strive to be innovative in our technology and its implementation. There are many technologies that can be leveraged for AML compliance. With a master's in Data Analytics, I know the power of data-driven decisions. Initially, when we started with rule-based systems, I wondered what they were capable of, but rule-based monitoring is what your client believes in, and that is the reality. Below are a few of the technologies we use so that our clients always stay ahead in compliance and remain compliant.

  1. Graph Analytics
  2. Elastic Stack
  3. Cloud Services
  4. Artificial Intelligence

Graph Analytics:

As of today, most transaction monitoring systems are rule based: they fetch data from a relational database and run it against a set of rules. Relational databases have limitations when it comes to handling large volumes of data and extracting the real insights that support the decisions of a rule-based system. We have a fairly unique approach to handling false positives using graph analytics: we mine the hidden patterns around a customer and their relationships with others. Transaction behaviour patterns are also observed with some powerful graph algorithms. An MLRO should be fully confident, with supporting reasons, when making the decision to file a SAR. These extracted graph features can also be used to create rules and come in handy when training your machine learning models.
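As a rough sketch of what such graph features can look like, the snippet below builds a tiny directed transaction graph with networkx and extracts per-account features (degrees, PageRank, cycle membership). The edge list, account names, and feature choices are illustrative, not our production pipeline.

```python
import networkx as nx

# Toy edge list of (sender, receiver, amount); the schema is illustrative.
transactions = [
    ("acct_A", "acct_B", 500.0),
    ("acct_B", "acct_C", 480.0),
    ("acct_C", "acct_A", 450.0),  # closes a cycle: a classic layering signal
    ("acct_D", "acct_B", 120.0),
]

G = nx.DiGraph()
for sender, receiver, amount in transactions:
    G.add_edge(sender, receiver, amount=amount)

pagerank = nx.pagerank(G)

# Per-account graph features that can back a rule or feed an ML model.
features = {
    node: {
        "in_degree": G.in_degree(node),
        "out_degree": G.out_degree(node),
        "pagerank": round(pagerank[node], 3),
    }
    for node in G.nodes
}

# Cycles in the transaction graph can surface round-tripping funds.
cycles = list(nx.simple_cycles(G))
print(features["acct_B"], cycles)
```

A feature row like this, one per account, is exactly the kind of context that can later be handed to a rule or a classifier.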

Elastic Stack:


We had a use case for global search functionality in our application. Though it seems simple, there is a lot of querying to be done behind the scenes, and as the complexity of those queries increases, execution time grows and hinders the user experience. That is when we realised the importance of integrating Elasticsearch into our application. It is a NoSQL search engine that provides APIs for full-text search and returns JSON responses. It also offers advanced queries for detailed analysis, supporting many types of searches: structured, unstructured, geo, metric, and more. Beyond quick search, the tool offers complex analytics and many advanced features, with horizontal scalability, reliability, and multi-tenant capability, and near-real-time indexing that keeps searches fast. Along with search, it offers a wide range of analytical abilities over the Elasticsearch data.
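To give a flavour of what a global-search request can look like, here is a sketch of a query body in the Elasticsearch query DSL. The index and field names ("customers", "name", "address", "notes") and the boosting/fuzziness choices are illustrative assumptions, not our actual mapping.

```python
import json

# A multi_match query searches one term across several fields at once.
query = {
    "query": {
        "multi_match": {
            "query": "john doe dublin",
            "fields": ["name^3", "address", "notes"],  # "^3" boosts name hits
            "fuzziness": "AUTO",  # tolerate small typos in the search term
        }
    },
    "size": 10,
}

# With the official Python client this body would be sent roughly as:
#   es = Elasticsearch("http://localhost:9200")
#   resp = es.search(index="customers", body=query)
print(json.dumps(query, indent=2))
```

The point of pushing this into Elasticsearch rather than SQL `LIKE` queries is that relevance ranking, typo tolerance, and field boosting come for free.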

Kibana is the data visualisation tool that comes with the Elastic Stack: a powerful, real-time front-end dashboard with histograms, line graphs, pie charts, sunbursts, and more. You can also use the Vega grammar to design your own visualisations, and the charts are easily configurable. Curated time series UIs let you perform advanced time series analysis on your Elasticsearch data, and queries, transformations, and visualisations can be described with powerful, easy-to-learn expressions. Its unsupervised machine learning features can detect anomalies hiding in your Elasticsearch data and explore the properties that significantly influence them. All of this enhances the reporting abilities of the compliance team.

Cloud Services (Amazon Kinesis):

For a startup like ours, keeping maintenance costs as low as possible is a concern. This is where cloud services from different vendors come in: everything from maintaining code repositories to sophisticated solutions like data streaming and machine learning is on offer, and bringing up a service on the cloud is a matter of seconds. Thanks to AWS Activate for supporting startups like us with their credits. There is a special reason for mentioning cloud services here, and you may wonder how they can be used for compliance.

Compliance should happen in real time, and decision making should be aligned with the real-time data in your systems. When I was brainstorming around this scenario, I came across Apache Kafka, which helps maintain real-time data pipelines. Banks and payment service providers mostly use transactional databases, and for OLAP use cases this data needs to be synchronised with analytical databases. This is where we used Amazon Kinesis, which is easier to maintain than Apache Kafka. In our case, the WAL logs of PostgreSQL are pushed to a Kinesis stream using wal2json, then on to S3, from which other applications such as analytical databases can consume them. You can then be confident that your decision making is aligned with real-time data, and you can also do online learning while training machine learning models.
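A minimal sketch of one hop of that pipeline: turning a wal2json change record into a Kinesis `PutRecord` argument. The payload shape follows wal2json's default output; the stream name, table, and partition-key choice are illustrative assumptions.

```python
import json

# A sample change event in wal2json's default output format.
wal_payload = json.dumps({
    "change": [{
        "kind": "insert",
        "schema": "public",
        "table": "transactions",
        "columnnames": ["id", "account", "amount"],
        "columnvalues": [42, "acct_A", 500.0],
    }]
})

def to_kinesis_record(payload: str) -> dict:
    """Turn a wal2json payload into keyword args for Kinesis put_record."""
    change = json.loads(payload)["change"][0]
    row = dict(zip(change["columnnames"], change["columnvalues"]))
    return {
        "StreamName": "wal-transactions",      # illustrative stream name
        "Data": json.dumps(row).encode("utf-8"),
        "PartitionKey": str(row["account"]),   # keeps one account's events ordered
    }

record = to_kinesis_record(wal_payload)
# In production this would be shipped with:
#   boto3.client("kinesis").put_record(**record)
print(record["PartitionKey"])
```

Partitioning by account means all of one account's changes land on the same shard in order, which matters when downstream consumers rebuild per-account state.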

This is one use case where cloud services come in handy. There is a lot more to extract from these services that brings down the cost of maintaining compliance teams; in particular, the cost and time spent maintaining complicated systems come down drastically. The freed-up resources can be used for decision making and for mitigating risk within the institution.

Artificial Intelligence:


There is a buzz around the use of artificial intelligence in transaction monitoring. One thing to keep in mind is that artificial intelligence does not simply mean machine learning; rather, machine learning is a subset of artificial intelligence. Recently I have read many articles, some supporting and others against the use of machine learning in reg-tech. But striking the right chord at the right time is the key to using any tool in reg-tech. There are many ways to implement artificial intelligence in transaction monitoring.

After brainstorming on this topic, we came up with a unique approach to feature engineering on the dataset, on which we trained our machine learning models. As mentioned earlier, we use graph analytics, from which we extract graph features for each node (i.e. account). This sets the context for the machine learning models. The data is updated in real time, and we use online training so the models respond to every change in the incoming data and learn from it. There are powerful graph algorithms, such as Personalised PageRank, which can be used to identify nodes that may be involved in fraud, and path-finding algorithms that determine the patterns each node is involved in.
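To make Personalised PageRank concrete, here is a minimal power-iteration sketch over a tiny transaction graph, seeded on a suspicious account. Plain Python for clarity; the graph and seed are made up, and at scale a graph library or graph database would do this instead.

```python
def personalized_pagerank(edges, seed, damping=0.85, iters=50):
    """Power iteration where teleports return to the seed node only."""
    nodes = {n for e in edges for n in e}
    out = {n: [t for s, t in edges if s == n] for n in nodes}
    rank = {n: (1.0 if n == seed else 0.0) for n in nodes}
    for _ in range(iters):
        nxt = {n: (1 - damping) * (1.0 if n == seed else 0.0) for n in nodes}
        for n in nodes:
            if out[n]:
                share = damping * rank[n] / len(out[n])
                for t in out[n]:
                    nxt[t] += share
            else:
                nxt[seed] += damping * rank[n]  # dangling mass returns to seed
        rank = nxt
    return rank

# Illustrative money flow: A -> B -> C -> A, with D paying into B.
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("D", "B")]
scores = personalized_pagerank(edges, seed="A")
# Accounts close to the seed in the flow of funds score highest.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Because teleports go back to the seed instead of being spread uniformly, the score measures proximity to the suspicious account in the flow of funds, rather than global importance.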

Setting the context for machine learning in this way is hugely important. Once that is done, there are multiple powerful algorithms, such as Gradient Boosting and AdaBoost, that can be used to build the classifiers. These algorithms deserve a special mention for the accuracy and precision they can achieve, and the data needed to train them is less than for many others; they are based on the wisdom of the crowd. The resulting machine learning model should be stacked alongside other systems, such as the rule-based one, which helps reduce the false positive rate significantly and also provides feedback to the rule set already defined.
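A hedged sketch of that last step: a gradient-boosted classifier trained on graph-style features. The data here is synthetic and the feature names (degree, PageRank, cycle count) only mirror the graph features described above; this is not our production model.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400
# Synthetic feature matrix: [in_degree, out_degree, pagerank, cycle_count]
X = rng.normal(size=(n, 4))
# Pretend accounts with high PageRank AND cycle involvement are suspicious.
y = ((X[:, 2] > 0.5) & (X[:, 3] > 0.0)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = GradientBoostingClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

In a stacked setup, the classifier's probability output would be attached to each rule-based alert, letting the MLRO triage high-confidence false positives first.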

Conclusion:

There is enough technology around to enhance the power of your teams and mitigate risk. This is a small attempt to address some of the technologies we find powerful and genuinely useful. There is more to be done in reg-tech, and I will come back with other interesting use cases, along with their implementation, next time.
