Common mistakes to avoid while using big data in risk management

Summary: Managing risk is a challenging enterprise, and errors can lead to catastrophic consequences.

Managing risk is a challenging enterprise, and errors can lead to catastrophic consequences. Today, big data analytics using tools like Hadoop or Splunk is seeing an uptick among corporations looking to mitigate risk. There is optimism that reviewing big data can yield insights that help manage risk more effectively and thus prevent disasters such as the 2008 financial crisis. For example, many banks now perform real-time analytics on customer data such as credit history, transaction history and employment history to determine more accurately which customer segments represent a high or low risk for a mortgage or loan.

In the same way, numerous product manufacturers use big data analytics to determine their customers' likes and dislikes, enabling them to create products that meet their customers' specific tastes. Doctors are using big data to identify high-risk patients who require more immediate care. The energy industry is using big data to spot problems in the production process early, before they develop into something unmanageable. And the list goes on across many other industries.

Nevertheless, while big data offers tremendous potential to manage risk across many industries and sectors, it's important to avoid common mistakes when handling that data. These mistakes can produce inaccurate results that increase risk instead of reducing it.

Using incomplete or irrelevant data

Data scientists must ensure the data they are using is a relevant and complete representation of what they want to analyze (such as customer behavior, or oil pressures). Using incomplete or skewed data sets can lead to erroneous conclusions that will undermine risk management.
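One way to make the completeness check concrete is to measure how often each field is missing before any analysis runs. The following is a minimal sketch, not a definitive implementation: the record layout, the field names (`income`, `credit_score`) and the 25% tolerance are all assumptions chosen for illustration.

```python
from typing import Any, Dict, Iterable, List


def missing_rate_by_field(
    records: List[Dict[str, Any]], fields: Iterable[str]
) -> Dict[str, float]:
    """Return, for each field, the fraction of records where it is absent or None."""
    if not records:
        raise ValueError("cannot measure completeness of an empty data set")
    total = len(records)
    return {
        field: sum(1 for record in records if record.get(field) is None) / total
        for field in fields
    }


def incomplete_fields(rates: Dict[str, float], max_missing: float) -> List[str]:
    """List the fields whose missing rate exceeds the chosen tolerance."""
    return sorted(field for field, rate in rates.items() if rate > max_missing)


# Hypothetical loan-application records, for illustration only.
applications = [
    {"income": 50000, "credit_score": 700},
    {"income": None, "credit_score": 650},
    {"credit_score": 600},
    {"income": 42000, "credit_score": 710},
]

rates = missing_rate_by_field(applications, ["income", "credit_score"])
print(incomplete_fields(rates, max_missing=0.25))
```

A check like this only catches missing values. Whether the data is a relevant representation of the thing being analyzed still has to be judged by someone who knows the domain.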

Using data that's not up-to-date

Historical data is important for generating insights to manage risk. However, it is recommended to also incorporate the most up-to-date data available, preferably in real time, for the most accurate insights. With the world continually in flux, what was true yesterday may not be true today.
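A simple guard against stale inputs is to separate records by age before they feed an analysis. This is a minimal sketch under stated assumptions: each record carries a timezone-aware `observed_at` timestamp (a hypothetical field name), and the two-day cutoff is arbitrary.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def split_by_freshness(
    records: List[Record], now: datetime, max_age: timedelta
) -> Tuple[List[Record], List[Record]]:
    """Split records into (fresh, stale) based on the age of 'observed_at'."""
    fresh: List[Record] = []
    stale: List[Record] = []
    for record in records:
        age = now - record["observed_at"]
        (fresh if age <= max_age else stale).append(record)
    return fresh, stale


# Hypothetical sensor readings, for illustration only.
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
readings = [
    {"id": "a", "observed_at": datetime(2024, 1, 9, tzinfo=timezone.utc)},
    {"id": "b", "observed_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
]

fresh, stale = split_by_freshness(readings, now, max_age=timedelta(days=2))
print(len(fresh), "fresh,", len(stale), "stale")
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the check repeatable. Stale records need not be discarded; they can still serve as historical context as long as they are not mistaken for the current state.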

Not taking into account all the key variables

A frequent mistake when performing big data analytics is not including all the pertinent variables in the calculations. Data scientists must ensure that all relevant variables (e.g. customer income, credit history and employment history for evaluating mortgage suitability) are captured, since even one missing variable can dramatically alter the accuracy of the result. Deciding what the pertinent variables are is not always straightforward; it often requires deep thought as well as trial-and-error iteration.
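Once the pertinent variables have been agreed on, a small check can confirm that a data set actually contains them before any calculation runs. This is a minimal sketch; the column names below are hypothetical stand-ins for the mortgage example above.

```python
from typing import Iterable, List


def missing_variables(required: Iterable[str], available: Iterable[str]) -> List[str]:
    """Return the required variables that the data set does not provide."""
    return sorted(set(required) - set(available))


# Hypothetical variable names for a mortgage-suitability analysis.
REQUIRED = ["customer_income", "credit_history", "employment_history"]
dataset_columns = ["customer_id", "customer_income", "credit_history"]

absent = missing_variables(REQUIRED, dataset_columns)
if absent:
    print("Analysis is missing variables:", ", ".join(absent))
```

The check cannot decide which variables belong on the required list. That remains the judgment and iteration described above.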

Selection bias

Perhaps the most serious mistake of all is cherry-picking the data set to produce results that are skewed by the analyst's bias. Data scientists must be very careful not to let their subjective views affect which data sets they select for evaluation. This point seems highly relevant in today's era of 'fake news', where people listen to news they want to be true, even if it's not. The same principle applies to big data analytics.
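One rough signal of a skewed selection is a sample whose make-up differs sharply from the population it was drawn from. The sketch below compares segment shares between the two; the `risk_band` field and the example counts are assumptions for illustration, and a large gap is a prompt for review, not proof of bias.

```python
from collections import Counter
from typing import Any, Dict, Hashable, List

Record = Dict[str, Any]


def segment_shares(records: List[Record], key: str) -> Dict[Hashable, float]:
    """Return each segment's share of the records, keyed by the value of 'key'."""
    counts = Counter(record[key] for record in records)
    total = sum(counts.values())
    return {segment: n / total for segment, n in counts.items()}


def share_gaps(
    population: List[Record], sample: List[Record], key: str
) -> Dict[Hashable, float]:
    """Return sample share minus population share for every segment seen in either."""
    pop = segment_shares(population, key)
    smp = segment_shares(sample, key)
    return {s: smp.get(s, 0.0) - pop.get(s, 0.0) for s in set(pop) | set(smp)}


# Hypothetical customer records: 6 low-risk and 4 high-risk in the population,
# but the analyst's sample holds 1 low-risk and 3 high-risk.
population = [{"risk_band": "low"}] * 6 + [{"risk_band": "high"}] * 4
sample = [{"risk_band": "low"}] * 1 + [{"risk_band": "high"}] * 3

for segment, gap in sorted(share_gaps(population, sample, "risk_band").items()):
    print(f"{segment}: {gap:+.2f}")
```

A positive gap means the segment is over-represented in the sample, and a negative gap means it is under-represented.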
