Customs data analysis

10th September 2021 By: Riaan de Lange

The World Customs Organisation’s (WCO’s) 2018 ‘Draft Guidance on Data Analytics’ describes data analytics as the process of analysing datasets to discover or uncover patterns, associations and anomalies from sets of structured or unstructured data and to draw practical conclusions. The document recommends the adoption of data analytics strategies within customs administrations to improve the use of available data and information in order to expedite their decision-making.

This brings us to the ‘BAnd of CUstoms Data Analysts’ (Bacuda), launched in September 2019. Bacuda is a collaborative research project between the WCO secretariat, its members and data scientists. Its objective is to develop data analytics methodologies, including algorithms in open-source programming languages, namely R or Python. The algorithms are written in open-source languages, and are shared and explained on the WCO website, so that all customs administrations can deploy them with their own data.

As part of Bacuda, the WCO has developed a neural network model called Dual-Attentive-Tree-aware-Embedded (DATE) to help customs administrations better detect transactions at high risk of fraud. DATE can be downloaded from the WCO website.

To develop the algorithms, Bacuda analysts use customs data at the most disaggregated level, that is, the transaction level. Such data is collected from customs administrations wishing to support the project and is then anonymised to ensure confidentiality. Moreover, experts who access the anonymised data have to sign a confidentiality statement, and the preliminary results of any research are first released to the data owners for their approval before publication.

The WCO attributes the potential success of the project to access to a huge amount of data at the transaction level. Bacuda experts also work with open-source data, which is not limited to macroeconomic or geographical and spatial data sourced from international organisations; it also includes public-domain satellite images published by space and military agencies. In addition, Bacuda experts make use of platforms that track the movement of means of conveyance, such as planes, as well as criminal activities or specific events. Thanks to these datasets, it is possible to gain a fairly good understanding of border-related activities and supply chains.

Through the use of text-mining and Web-scraping tools, unstructured data can be extracted from Web pages or social networking sites and then analysed. For example, prices listed on online shopping platforms can be cross-referenced against the declared value of an item to assess whether that value is plausible for customs valuation purposes.
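To make the price cross-referencing idea concrete, here is a minimal Python sketch. It is not Bacuda's code: the function name, the 50% tolerance and all the figures are hypothetical, and the scraped prices are assumed to have already been collected and matched to product descriptions.

```python
from statistics import median


def flag_undervalued(declarations, reference_prices, tolerance=0.5):
    """Return declarations whose declared unit value is below
    tolerance * median online price for the same product.

    The tolerance is a hypothetical cut-off chosen for illustration.
    """
    flagged = []
    for decl in declarations:
        prices = reference_prices.get(decl["product"])
        if not prices:
            continue  # no reference price, so nothing to compare against
        benchmark = median(prices)
        if decl["unit_value"] < tolerance * benchmark:
            flagged.append({**decl, "benchmark": benchmark})
    return flagged


# Made-up figures standing in for scraped online prices (same currency).
reference_prices = {
    "wireless earbuds": [24.0, 30.0, 27.5],
    "phone case": [5.0, 6.5, 4.0],
}

# Made-up declarations with declared unit values.
declarations = [
    {"id": "D1", "product": "wireless earbuds", "unit_value": 8.0},
    {"id": "D2", "product": "wireless earbuds", "unit_value": 25.0},
    {"id": "D3", "product": "phone case", "unit_value": 4.5},
    {"id": "D4", "product": "smart watch", "unit_value": 10.0},
]

for hit in flag_undervalued(declarations, reference_prices):
    print(hit["id"], hit["unit_value"], hit["benchmark"])
```

With these figures only declaration D1 is flagged, since its declared value of 8.0 is less than half the median online price of 27.5. A flag of this kind would be a prompt for closer inspection rather than proof of undervaluation, because retail prices include margins and costs that a declared customs value does not.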

To develop and test the algorithms that they have designed, Bacuda analysts have at their disposal two powerful computers linked to a cloud server, thanks to the generous support of the Korea Customs Service.

The Bacuda team has already developed basic methods and algorithms for: mirror data analysis with R and Shiny, forecasting customs revenue, revenue gap analysis, Web scraping of price data, and customs fraud detection by machine learning with Random Forest and Python.
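Mirror data analysis, the first item on that list, compares what one country records as imports from a partner with what the partner records as exports to it. The Bacuda version is written in R and Shiny; the Python sketch below only illustrates the underlying comparison. The function name, product codes and values are hypothetical.

```python
def mirror_gaps(reported_imports, partner_exports):
    """Compare import values reported by one country with the export
    values its partner reports for the same product codes.

    Only codes present on both sides are compared. A positive gap means
    the partner reports more exports than the country reports imports.
    """
    gaps = {}
    for code in sorted(set(reported_imports) & set(partner_exports)):
        imports = reported_imports[code]
        exports = partner_exports[code]
        gap = exports - imports
        gaps[code] = {
            "gap": gap,
            "gap_ratio": gap / exports if exports else None,
        }
    return gaps


# Made-up trade values by product code (same currency and period).
reported_imports = {"8517": 700, "6109": 480}
partner_exports = {"8517": 1000, "6109": 500}

for code, result in mirror_gaps(reported_imports, partner_exports).items():
    print(code, result["gap"], result["gap_ratio"])
```

In this made-up example, code 8517 shows a gap of 30% of the partner's reported exports, which might merit a closer look, while the 4% gap on 6109 is small. Real mirror data would also need adjusting for known sources of discrepancy, such as imports usually being valued with freight and insurance included while exports are not.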

Access to this information is restricted, and interested readers who do not have a user account for the WCO website are invited to submit an access request to the WCO if they are customs officials, or to contact the WCO Research Unit if they have another public or private function.

In conclusion, the WCO stresses that the project team is ready to solve all kinds of puzzles for customs administrations, not only issues related to enforcement per se. They could, for example, help to develop chatbots to advise importers on how to classify their goods, or how to calculate duties and taxes that apply to their trade operations.

Another area where Bacuda experts could be of use is in measuring customs performance by text-mining comments appearing on social media networks, instead of using traditional surveys.

For more insights on the topic, please read Danilo Desiderio’s article, ‘Data analysis techniques for enhancing the performance of customs’.