Common mistakes to avoid while using big data in risk management

Introduction: Managing risk is a challenging enterprise, and errors are often made which can lead to catastrophic consequences.


Managing risk is a challenging enterprise, and errors are often made which can lead to catastrophic consequences. Today, big data analytics using tools like Hadoop or Splunk has seen an uptick amongst corporations looking to mitigate risk. There is optimism that analyzing big data can yield insights that help manage risk more effectively and thus prevent disasters such as the 2008 financial crisis. For example, many banks now perform real-time analytics on customer data such as credit history, transaction history and employment history to more accurately determine which customers represent a high or low risk when granting a mortgage or loan.
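
As a rough illustration of this kind of scoring, here is a minimal sketch that trains a toy classifier on a handful of hypothetical applicant features. The field names, synthetic figures and choice of model are assumptions made for the example and do not reflect any particular bank's methodology.

```python
# Hypothetical sketch: scoring loan applicants with a simple classifier.
# Feature names and the synthetic data are invented for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Each row: [credit_score, monthly_txn_volume, years_employed]
X = np.array([
    [720, 4500, 8],
    [580, 1200, 1],
    [690, 3800, 5],
    [540,  900, 0],
    [760, 5200, 12],
    [610, 2100, 2],
])
y = np.array([0, 1, 0, 1, 0, 1])  # 1 = defaulted, 0 = repaid

model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X, y)

# The predicted default probability for a new applicant informs the
# high-risk / low-risk decision.
applicant = np.array([[650, 3000, 3]])
print(f"Estimated default probability: {model.predict_proba(applicant)[0, 1]:.2f}")
```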

In the same way, numerous product manufacturers are utilizing big data analytics in order to determine their customers' likes and dislikes, enabling them to create products that meet their customers' specific tastes. Doctors are using big data to determine high risk patients who require more immediate care. The energy industry is using big data to spot problems in the production process early on before they develop into something unmanageable. And the list goes on across a plethora of different industries.

Nevertheless, while big data offers tremendous potential to manage risk across many industries and sectors, it's important to avoid common mistakes when handling said data. These mistakes could produce inaccurate results that increase risk instead of reducing it.

Using incomplete or irrelevant data

Data scientists must ensure the data they are using is a relevant and complete representation of what they want to analyze (such as customer behavior or oil pressures). Using incomplete or skewed data sets can lead to erroneous conclusions that will undermine risk management.
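
As a minimal sketch of what such a check might look like, the snippet below flags columns with a high share of missing values and customer segments that never appear in the data at all. The column names, segment labels and 20% threshold are placeholders, not recommendations.

```python
# A simple completeness check before analysis: flag columns with too many
# missing values and confirm every expected customer segment is present.
# Column names, segment labels and the 20% threshold are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "segment":       ["retail", "retail", "smb", None, "retail"],
    "credit_score":  [720, None, 690, 540, None],
    "annual_income": [65000, 48000, 120000, 30000, 52000],
})

expected_segments = {"retail", "smb", "corporate"}

# 1. Missing-value rate per column
missing_rate = df.isna().mean()
print(missing_rate[missing_rate > 0.2])  # columns with more than 20% missing

# 2. Segments that never appear in the data at all
observed = set(df["segment"].dropna().unique())
print("Unrepresented segments:", expected_segments - observed)
```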

Using data that's not up-to-date

Historical data is important for generating insights to manage risk. However, it is best to also incorporate the most up-to-date data available, preferably in real time, for the most accurate insights. With the world continually in flux, what was true yesterday may not be true today.
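
A simple guard against stale inputs is to check how old the newest record in a batch is before feeding it into risk calculations. The sketch below assumes a hypothetical transaction table with an event_time column and uses an arbitrary 24-hour freshness cutoff.

```python
# Basic freshness guard: warn if the newest record in a batch is older
# than an acceptable threshold. Timestamps and the 24-hour cutoff are
# example values only.
import pandas as pd

df = pd.DataFrame({
    "txn_id": [101, 102, 103],
    "event_time": pd.to_datetime([
        "2024-05-01 09:00:00+00:00",
        "2024-05-02 14:30:00+00:00",
        "2024-05-03 08:15:00+00:00",
    ]),
})

max_age = pd.Timedelta(hours=24)
age = pd.Timestamp.now(tz="UTC") - df["event_time"].max()

if age > max_age:
    print(f"Warning: newest record is {age} old; results may be stale.")
else:
    print("Data is fresh enough for analysis.")
```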

Not taking into account all the key variables

A frequent mistake when performing big data analytics is not including all the pertinent variables in the calculations. Data scientists must ensure that all relevant variables (e.g. customer income, credit history and employment history for evaluating mortgage suitability) are captured, since even one missing variable can dramatically alter the accuracy of the result. Deciding which variables are pertinent is not always straightforward, often requiring careful analysis and even trial-and-error iteration.
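
One rough way to gauge whether a candidate variable matters is to compare model performance with and without it. The sketch below does this on synthetic data with invented variables (income, credit score, years employed); it illustrates the idea rather than prescribing a production workflow.

```python
# Compare cross-validated accuracy with and without a candidate variable.
# The data, variable names and target definition are synthetic placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 200
income = rng.normal(50_000, 15_000, n)
credit = rng.normal(650, 60, n)
years_employed = rng.integers(0, 20, n)

# Synthetic target loosely driven by all three variables
y = ((income < 40_000) | (credit < 600) | (years_employed < 2)).astype(int)

X_full = np.column_stack([income, credit, years_employed])
X_reduced = np.column_stack([income, credit])  # employment history omitted

for name, X in [("all variables", X_full), ("employment omitted", X_reduced)]:
    model = make_pipeline(StandardScaler(), LogisticRegression())
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean accuracy {score:.3f}")
```

A noticeable drop in accuracy when a variable is removed is a strong hint that it belongs in the model; a negligible drop is worth investigating further rather than taken as proof the variable is irrelevant.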

Selection bias

Perhaps the most serious mistake of all is cherry-picking the data set to produce results that are skewed by the analyst's bias. Data scientists must be careful not to let their subjective views affect which data sets they select for evaluation. This point is highly relevant in today's era of 'fake news', where people gravitate toward news they want to be true, even if it is not. The same principle applies to big data analytics.
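
A basic safeguard is to compare the composition of the analysis sample against known population proportions before drawing any conclusions. The sketch below uses made-up segment names and shares purely to illustrate the check.

```python
# Quick sanity check for selection bias: compare the sample's composition
# against known population proportions. Segment names and shares are
# illustrative, not real figures.
import pandas as pd

population_share = pd.Series({"urban": 0.55, "suburban": 0.30, "rural": 0.15})

sample = pd.Series(["urban"] * 80 + ["suburban"] * 15 + ["rural"] * 5)
sample_share = sample.value_counts(normalize=True)

comparison = pd.DataFrame({
    "population": population_share,
    "sample": sample_share,
}).fillna(0.0)
comparison["gap"] = comparison["sample"] - comparison["population"]
print(comparison.round(2))  # large gaps suggest a cherry-picked sample
```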
