Does Your Data Have Culture?

Comma or period decimal numbers can wreak havoc. Find out how we eliminated this problem…

Floating point data supports computations with fractional numbers, letting users represent real-world values such as temperatures, pressures, flows, or currencies. To handle this data correctly, it is important to understand the difference between the comma and period decimal characters, because different cultures use one or the other.

Different cultures use either a comma or a period to separate the whole and fractional parts of a number. For example, some cultures write 3,14 with a comma as the decimal character, while others write 3.14 with a period. Worse, comma decimal cultures commonly use the period as a thousands separator, and period decimal cultures use the comma, so both characters routinely appear in the same number with opposite meanings, potentially causing chaos and serious computational errors.

Using the wrong decimal character can have a significant impact. Adding 3.14 to 4,13 may result in an error. A silent misreading is more dangerous: a period decimal system that treats the comma as a thousands separator reads the comma decimal 1,459393 not as 1.459393 but as the grossly wrong value 1459393.0.
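
Both failure modes can be reproduced in a few lines. The sketch below is illustrative only: the helper names parse_as_period_decimal and parse_as_comma_decimal are hypothetical, and it relies on the fact that Python's built-in float() accepts only a period as the decimal character.

```python
def parse_as_period_decimal(text: str) -> float:
    """Treat ',' as a thousands separator and '.' as the decimal character."""
    return float(text.replace(",", ""))


def parse_as_comma_decimal(text: str) -> float:
    """Treat '.' as a thousands separator and ',' as the decimal character."""
    return float(text.replace(".", "").replace(",", "."))


raw = "1,459393"  # written by a comma decimal system

misread = parse_as_period_decimal(raw)  # comma silently dropped
correct = parse_as_comma_decimal(raw)   # comma honored as the decimal character

# Feeding a comma decimal string straight to float() fails loudly instead.
try:
    float("4,13")
    raised = False
except ValueError:
    raised = True

print(misread, correct, raised)
```

The loud failure is the better of the two outcomes: the ValueError stops the computation, while the misread value flows on into later calculations unnoticed.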

In addition, some programming languages may always use a period as the decimal character, while others may default to a comma based on the regional settings of the user’s computer. This means that even if you work with data that uses one decimal character, your software may use a different character by default, leading to errors in your computations.

To complicate matters, there is no single standard, as ISO (the International Organization for Standardization) says both are OK.

As our Intellect data access, cleansing, transformation, modeling, prediction, and optimization architecture expands globally, encountering cultural differences in data-generating and storage systems is common. Our software may run on US Windows with a period decimal while receiving data from a SCADA system with comma decimal numbers. One text file might be comma decimal and semicolon delimited, while another may be period decimal and comma delimited. Our customers are increasingly asking for our web interfaces to be multilingual, showing period decimal numbers at times and comma decimal numbers at others.
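
The two text file layouts mentioned above can be read into identical numbers once the delimiter and decimal character are stated explicitly. This is a minimal sketch using Python's standard csv module, not a description of the Intellect implementation; read_numeric_rows and the sample file contents are made up for illustration, and the fields are assumed to contain no thousands separators.

```python
import csv
import io


def read_numeric_rows(text: str, delimiter: str, decimal: str) -> list[list[float]]:
    """Parse delimited text into floats, given the file's decimal character.

    Assumes every field is numeric and contains no thousands separators.
    """
    rows = []
    for record in csv.reader(io.StringIO(text), delimiter=delimiter):
        rows.append([float(field.replace(decimal, ".")) for field in record])
    return rows


# Hypothetical file contents: same values, two cultural conventions.
comma_decimal_file = "3,14;4,13\n1,5;2,25\n"    # comma decimal, semicolon delimited
period_decimal_file = "3.14,4.13\n1.5,2.25\n"   # period decimal, comma delimited

rows_a = read_numeric_rows(comma_decimal_file, delimiter=";", decimal=",")
rows_b = read_numeric_rows(period_decimal_file, delimiter=",", decimal=".")

print(rows_a == rows_b)
```

Note that the comma means two different things in these files, which is why neither the delimiter nor the decimal character can be guessed safely from the content alone.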

We eliminate these issues by establishing standards. Because ISO is “Either / Or” on this issue, and because we are a USA company that runs on USA Windows more often than not, we have established an internal period decimal standard. All data coming in and going out is identified, typically at the data source level, as either comma or period decimal, and we convert the data accordingly, transparently to the user.
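
One way to picture this approach is to tag each data source with its decimal character and convert at the boundary, so that everything inside the system is an ordinary period decimal float. The sketch below only illustrates that idea; the DataSource class, its method names, and the source names are hypothetical and do not come from our product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    """A data source tagged with the decimal character it uses (',' or '.')."""
    name: str
    decimal: str

    def to_internal(self, text: str) -> float:
        """Convert incoming text to the internal period decimal float."""
        thousands = "." if self.decimal == "," else ","
        cleaned = text.strip().replace(thousands, "").replace(self.decimal, ".")
        return float(cleaned)

    def from_internal(self, value: float, digits: int = 2) -> str:
        """Format an internal float using this source's decimal character."""
        return f"{value:.{digits}f}".replace(".", self.decimal)


# Hypothetical sources: a comma decimal SCADA feed and a period decimal file.
scada = DataSource(name="scada_feed", decimal=",")
us_file = DataSource(name="us_text_file", decimal=".")

# Mixed-culture inputs become safe to combine once both are internal floats.
total = scada.to_internal("4,13") + us_file.to_internal("3.14")

print(scada.from_internal(total))    # formatted for the comma decimal side
print(us_file.from_internal(total))  # formatted for the period decimal side
```

Keeping the culture on the source rather than on each value means the conversion happens once on the way in and once on the way out, and no calculation in between ever has to ask which character was used.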

Overall, the decimal character used in floating point data may seem minor, but it can impact your data processing and computations. By understanding the differences between comma and period characters and establishing standards, you can ensure that your calculations are accurate and avoid costly errors.
