Data validity, integrity and consistency: Useful tools to safeguard revenue

Measurement data have always been an integral part of industrial operations, with automation, particularly via sensors, now taking a bigger role. Systems, plants, oil and gas fields, and the like are set up to collect a large number of measurements, such as pressure, temperature, flow rate, composition, and more. Accurate data are essential for maintaining optimal and safe operating conditions, and in most cases significant monetary consequences are involved as well. The risks of inaccurate measurements are often (grossly) underestimated; as a result, insufficient countermeasures are taken, which leads to a false sense of security. This paper sheds light on the root causes of most measurement-related errors and on some countermeasures that can be applied to mitigate problems, including the potential loss of revenue.

Background and definitions

We will use the terms data validity, data integrity, and data consistency as follows:
Data validity: data stay within expected limits
Data integrity: data behave as expected, based on the properties of the system and of the measurement itself
Data consistency: data are consistent with other system information, both past and present

We will illustrate these terms further with an example from aviation.

Sensor failure: Turkish Airlines crash of 2009

power. In reality, the plane was still some 250 m above the ground, as the other three sensors had correctly detected it.

The loss of engine power slowed the plane to below its stall speed and, subsequently, the plane dropped like a brick! When the three pilots finally found out what was happening and put the power back on the engines, it was too late. Nine people lost their lives, a plane was destroyed and, consequently, Turkish Airlines' reputation was at stake. Data validity, integrity, and consistency countermeasures could easily have prevented this accident, especially if the information had been available in near real time so that data-consistency checks could provide early alerts.

Comparing the four height measurements would have revealed that they were generally in agreement up to a certain moment, when one suddenly gave a completely different reading. This, too, should have set off alarms, certainly because it was the (only) sensor used by the autopilot. The software, obviously not designed to check data validity, data integrity, or data consistency, let alone all three, ordered the autopilot to finish the landing, which did happen, but not in the intended way.

So, based on the available information and by simply checking the data validity, consistency, and integrity, the only viable conclusion should have been that the height measurement was in error. Its information should therefore have been ignored and control handed back from the autopilot to the human pilots. This serious case of negligence resulted in a major disaster!
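The cross-check described above can be sketched in a few lines of Python. This is only an illustration, not the logic of any real avionics system: the sensor names, the readings, and the 15 m tolerance are hypothetical values chosen for the example. Each reading is compared against the median of all redundant readings, and any sensor that deviates too far is flagged.

```python
from statistics import median


def find_outliers(readings, tolerance):
    """Return the names of sensors whose reading deviates from the
    median of all readings by more than `tolerance`."""
    mid = median(readings.values())
    return sorted(
        name for name, value in readings.items()
        if abs(value - mid) > tolerance
    )


# Hypothetical height readings in metres; one sensor disagrees.
heights_m = {"alt_1": -2.0, "alt_2": 251.0, "alt_3": 249.0, "alt_4": 250.0}
suspect = find_outliers(heights_m, tolerance=15.0)
print(suspect)
```

The median is used rather than the mean because a single wildly wrong reading pulls the mean towards itself, whereas the median stays with the majority.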

Sensor errors: underlying causes

Although the complete breakdown of a sensor is usually easy to detect, other, sneakier effects can also occur, such as a gradual decrease in sensor performance. This is illustrated in the following examples:

Sensors are calibrated before they leave the factory. Usually, the manufacturer specifies a maximum time span after which the sensor should be recalibrated, or at least have its performance checked, because the calibration curve tends to creep: it changes gradually over time. In contrast to the Turkish Airlines example, the reading does not change in a few tenths of a second but over months or even longer, while the reading itself stays within the expected range. The result, though, is a systematic error, which can have grave financial consequences, as we will see below.

Another example is a change in the calibration curve or the zero-setting of a sensor, caused by a (short) overload condition. A well-known example is a bent orifice plate, possibly caused by a liquid slug hitting the orifice during start-up or by an operating error. The orifice will still work (meaning it will generate a differential pressure), but its discharge coefficient is likely to be different, so the relation between differential pressure and flow rate will show a systematic error. Without data consistency verification, this will go unnoticed until the orifice has been inspected and/or calibrated. Many sensors/transmitters provide diagnostic data alongside the actual measurement, but such data are rarely used in a wider perspective than the sensor itself. Yet they often carry valuable information that could serve diagnostic purposes for the system as a whole, a use to which they are rarely, if ever, put.

Similarly, with other sensors, even when the time span specified by the manufacturer has been exceeded, it is mostly assumed that they still work OK as long as they give numbers. But a gradual change in the calibration will result in systematic errors and, thus, severe risks to revenue. What can be done about this is discussed further below.

Oil & Gas industry: the need for accurate data

Concerning data measurement accuracy, the oil and gas industry tends to cling to some long-held misconceptions:

A common misunderstanding is that errors will average out. This is, however, only the case with random errors; systematic errors build up, as they always point in the same direction.

Another misunderstanding concerns uncertainty: because it cannot be seen, it is easily ignored. No bookkeeper, for example, will record the uncertainty of the (measured) revenues in the books, but that does not mean it is not there!

A third misunderstanding is that correct metering is expensive and, thus, such efforts can be reduced. But, people forget the value of the information behind the metering results.
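The first misconception above, that errors average out, can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: the random error is modelled as zero-mean Gaussian noise, the systematic error as a constant bias per reading, and all numbers are hypothetical.

```python
import random


def accumulated_errors(n, sigma, bias, seed=0):
    """Sum the errors of n readings.

    Returns (random_part, systematic_part): the accumulated zero-mean
    noise with standard deviation `sigma`, and the accumulated constant
    bias `bias` per reading.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    random_part = sum(rng.gauss(0.0, sigma) for _ in range(n))
    systematic_part = n * bias
    return random_part, systematic_part


# Hypothetical case: 10,000 readings, noise of 1 unit, bias of 0.5 unit.
random_part, systematic_part = accumulated_errors(10_000, sigma=1.0, bias=0.5)
print(f"accumulated random error:     {random_part:9.1f}")
print(f"accumulated systematic error: {systematic_part:9.1f}")
```

The accumulated random error grows roughly with the square root of the number of readings, while the accumulated bias grows in proportion to it, so over a long production period even a small bias dominates.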

The complexity of ownership is another growing concern with oil and gas plays. In the old days, one single company owned a reservoir, produced the oil and gas, and sold it (and usually the export meters were accurate). So, there was no need (people thought) for accurate measurements upstream. Nowadays, the situation is far more complicated. Even in the simplest case, when everything is owned by one company, the assumption that no accurate measurements upstream are necessary is incorrect. This can be best understood by taking a more holistic view of the system.

The oil or gas reservoir itself is a complex system below the surface. To start the production of hydrocarbons, wells need to be drilled at several locations across the field. These locations are carefully chosen based on seismic data. But the composition is usually not the same at different locations in the reservoir: it can vary from gas (in the gas cap) to light oil (close to the gas cap) and watery heavy oil near the aquifer. The operator always faces the challenge of achieving the highest ultimate recovery (the total fraction of hydrocarbons recovered from the reservoir before it is abandoned) at the lowest cost. This requires knowledge not only of the initial situation in the reservoir but also of its production history. If this is not (accurately) available, less than optimal production may result, reducing the ultimate recovery and/or increasing the costs and energy usage of secondary and tertiary recovery techniques. As the lifetime of a reservoir is typically 25 years, a holistic approach needs to be followed from design to abandonment, and the quality of the data needs constant attention during these years.

The current situation upstream, however, is usually even more complex when it comes to ownership. A single company rarely owns a reservoir; it is mostly owned by several entities (either through a joint venture or because the field extends over several different concession blocks). And in remote areas or subsea developments, not every field has its own flow line to the processing plant. So, the production of different reservoirs, with different compositions, owned by different companies, is commingled, and in the mixture thermodynamic equilibria will shift (e.g., gas will come out of solution). At the processing facility, each company's share of the different export streams (with different monetary values) needs to be determined and the revenues split. And this is all based on the input data from the measurements of the evacuation system. In essence, these data do not merely measure oil or gas; they measure money.

Again, it is often assumed that the measurement uncertainties will average out, but because the majority of the errors are systematic, this is not the case; they add up! So, the split of the revenues will be systematically biased. Note that this means some companies may receive too large a share of the revenues while others receive too little: it is a zero-sum game. The problem is further exacerbated by the fact that some hydrocarbons are more valuable than others; not only the produced volumes come into play, but the composition of the different fluids is also crucial.
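The zero-sum effect can be made concrete with a small worked example. The sketch below assumes a simple pro-rata allocation, in which the exported quantity is split in proportion to each owner's metered production; real allocation agreements are more involved (composition, shrinkage, several export streams), and the owner names and quantities are hypothetical.

```python
def allocate_pro_rata(export_total, metered):
    """Split export_total in proportion to each owner's metered production."""
    metered_sum = sum(metered.values())
    if metered_sum <= 0:
        raise ValueError("total metered production must be positive")
    return {
        owner: export_total * quantity / metered_sum
        for owner, quantity in metered.items()
    }


# Hypothetical case: both owners truly produce 100 units,
# but owner A's meter reads 2% high.
true_production = {"A": 100.0, "B": 100.0}
export_total = sum(true_production.values())
metered = {"A": true_production["A"] * 1.02, "B": true_production["B"]}

allocation = allocate_pro_rata(export_total, metered)
shift = {owner: allocation[owner] - true_production[owner] for owner in allocation}
print(shift)  # A gains what B loses
```

Because the export total is fixed, whatever owner A gains from the biased meter is taken directly from owner B.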

Potential financial consequences

Another underestimated issue is the effect of measurement uncertainty on a company's revenue. It is often argued that it's only 1%, so that's not too bad. The problem is that revenue is not only what is collected at the export meter; it also depends on what it costs to get the product there. The net revenue is the difference between the gross revenue and the costs of production and transportation, so a given uncertainty affects the net revenue much more strongly than the gross, particularly for marginal fields. Thus, accurate data will provide a fairer share of the revenues and help a company act as a trusted business partner.
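The leverage of a "small" metering error on net revenue is simple arithmetic, sketched here under the assumption that costs are known exactly and only the gross revenue carries the measurement error. The figures are hypothetical.

```python
def net_revenue_error_fraction(gross, costs, gross_error_fraction):
    """Relative error in net revenue caused by a relative error in gross
    revenue, assuming the costs themselves are known exactly."""
    net = gross - costs
    if net <= 0:
        raise ValueError("net revenue must be positive for this comparison")
    return gross * gross_error_fraction / net


# Hypothetical marginal field: gross revenue 100, costs 90, metering error 1%.
marginal = net_revenue_error_fraction(100.0, 90.0, 0.01)
# Hypothetical comfortable field: gross revenue 100, costs 50, same error.
comfortable = net_revenue_error_fraction(100.0, 50.0, 0.01)
print(f"marginal field:    {marginal:.0%} of net revenue")
print(f"comfortable field: {comfortable:.0%} of net revenue")
```

With costs at 90% of gross revenue, a 1% metering error corresponds to about 10% of the net revenue; with costs at 50%, the same error corresponds to about 2%.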


A good approach to ensure accurate data is to apply the countermeasures of data validity, data integrity, and data consistency. Let’s take a look at each of these in action.

Data validity: First, the data should be checked to determine whether the results lie within the expected range of values. Values such as negative pressures, temperatures that are too high or too low, and flows going from low to high pressure should be detected as soon as possible, and the cause analyzed and fixed. Again, it is often assumed that a sensor is OK as long as it gives numbers. This is questionable, as the calibration of sensors tends to drift and (undetected) damage can introduce systematic errors, even if the sensors still give numbers. Once the calibration of a sensor has expired, measures should be taken to verify its operation and the quality of its data. Data integrity checks can also help, and can even detect such deviations before the calibration has expired.
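A validity check of this kind can be as simple as a table of limits per measurement. The sketch below is illustrative only: the tag names and limit values are hypothetical and would have to come from the design data of the actual system.

```python
# Hypothetical limits per measurement: (lowest, highest) expected value.
LIMITS = {
    "pressure_bar": (0.0, 200.0),
    "temperature_c": (-20.0, 150.0),
}


def validity_violations(sample, limits):
    """Return the names of measurements that are missing or outside
    their expected range."""
    violations = []
    for name, (low, high) in limits.items():
        value = sample.get(name)
        # A missing value, or a NaN (which fails every comparison),
        # is treated as a violation as well.
        if value is None or not (low <= value <= high):
            violations.append(name)
    return violations


sample = {"pressure_bar": -3.5, "temperature_c": 65.0}
print(validity_violations(sample, LIMITS))
```

Note that such a check only catches readings outside the limits; a drifting sensor that still reports plausible numbers passes it, which is why the integrity and consistency checks are needed as well.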

Data integrity: Changes, both rapid and gradual, can indicate deterioration of the quality or the calibration of a sensor. Such changes should be explained by changes in the system, like a change in flow rate, well pressure, etc., or by other causes like obstructions/deposits in the flow line. But if these changes cannot be easily explained, they are likely caused by calibration drift or a zero-setting error of the sensor. The root cause should be investigated further, taking into account information from other parts of the system to verify data consistency.
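Both kinds of change can be quantified from a time series of readings. The following is a minimal sketch, assuming equally spaced samples taken under steady operating conditions; the data are synthetic, and any alarm thresholds applied to the results would be system-specific.

```python
def largest_step(values):
    """Largest absolute change between consecutive readings (rapid change)."""
    if len(values) < 2:
        raise ValueError("need at least two readings")
    return max(abs(b - a) for a, b in zip(values, values[1:]))


def drift_per_sample(values):
    """Least-squares slope of the readings against the sample index
    (gradual change)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two readings")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


# Synthetic readings: a slow upward creep of 0.1 unit per sample.
creeping = [10.0 + 0.1 * i for i in range(50)]
print(largest_step(creeping), drift_per_sample(creeping))
```

In the creeping series no single step is large, so a check on step size alone would miss it; only the fitted slope reveals the drift.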

Data consistency: All available information from the system should be consistent, period! Flows should go from high to lower pressure, temperatures should decrease when heat is leaking away, flow rates should relate to pressure differentials, and so forth. Inconsistencies point to sensor error (including the impulse lines of pressure sensors) and should be investigated. Historic data can be helpful to find where the problems are located.

It is impossible in this brief paper to illustrate all possible options, as these are system-dependent. But it should be clear that a holistic view of the system (both in place and time) enables more accurate monitoring of the sensors and the system, by using all available information and the relationships between the readings at different positions within the system.
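One of the consistency rules mentioned above, that flow should go from high to lower pressure, can be sketched as a check on a single pipe segment. The function and parameter names are hypothetical, and a real system would combine many such relationships across its sensors.

```python
def flow_pressure_issues(upstream_bar, downstream_bar, flow_rate):
    """Check that the direction of flow agrees with the pressure drop.

    A positive flow_rate means flow from the upstream to the downstream
    sensor. Returns a list of human-readable inconsistencies.
    """
    issues = []
    pressure_drop = upstream_bar - downstream_bar
    if flow_rate > 0 and pressure_drop <= 0:
        issues.append("forward flow without a positive pressure drop")
    if flow_rate < 0 and pressure_drop >= 0:
        issues.append("reverse flow without a negative pressure drop")
    return issues


# Hypothetical readings: forward flow reported, yet downstream pressure is higher.
print(flow_pressure_issues(upstream_bar=48.0, downstream_bar=52.0, flow_rate=120.0))
```

An inconsistency found this way does not say which of the three sensors is wrong; that is where historic data and the readings from neighbouring parts of the system come in.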


Accurate data are crucial for the optimum and safe operation of systems, to minimize operational costs, and to obtain an honest share of revenues in complex evacuation scenarios. To obtain and maintain accurate data, a holistic approach is a major step forward: one that incorporates data validity, integrity, and consistency, where possible in combination with diagnostic data from individual sensors over the entire system. To realize such a holistic analysis, detailed information on the system needs to be available. The analysis itself can be built, for example, by combining routines from a software library with a data historian. Hint B.V. (Wapenveld, Netherlands) can help you to set up such a system for your specific application.
