Friday, March 30, 2012

Not a Number Values in the Database

Our application has a table, which is populated via ADO.Net from C# with data originating from a C++ COM call. Today I encountered an entry that is C++ code for an undefined value: -1.#IND stored in the database. However, I could only discover what was stored in the table by Casting the value to a varchar -- simply selecting returned an error.

Is this expected behavior or a bug? It does not seem correct that SQL Server should store a value that cannot be displayed. In essence, either the value should not be allowed in the table because it violated the domain or SQL Server ought to have a way to display it with a Select *.

As fas as our application is concerned, we will be masking these values -- initially by ignoring them in the queries and eventually the loading program will convert to null.

What is the defined datatype for the field?|||

The column in question is a float column.

I would argue that the datatype should not be that important. The server should not allow a value to be stored which cannot be displayed.

|||

IND could be (COULD BE) referencing the unicode index control character. It looks like you had some sort of odd data transformation from the COM-C++-C#-SQL conversion process, especially considering it's a floating point variable which can lose precision through casting, it didn't surprise me to see that that was the data type in question.

Have you considered testing that single row by running it through your conversion process again by itself? Is the original float data for the column and row something... unusual?

|||

Hmm, very interesting. Try casting it to a numeric of some sort (particularly really large precision after the decimal)... That might give the clue. What datatype are you using in your C# application? And what prints out when you select from the column in Management Studio?

I can't imagine there is any value that is illegal for a float, but stranger things have happened.

|||

The specific data in the database was purged, due to some maintenance, so I'll have to see if it returns.

However, I made a simple example:

static void Main(string[] args)

{

double d = Double.NegativeInfinity;

using( SqlConnection conn = new SqlConnection( "server=(local);initial catalog=Northwind;integrated security=true" ) )

{

SqlCommand cmd = new SqlCommand( "Insert Foo(zero, one) Values( Datepart(mi, getdate() ),@.val )", conn );

cmd.Parameters.Add( "@.val", d );

conn.Open();

int rows = cmd.ExecuteNonQuery();

Console.WriteLine( rows );

}

Console.WriteLine( d );

}

with a target table:

Create Table Foo(

zero int not null,

one float null

)

Using any of Double.NegativeInfinity, Double.PositiveInfinity, or Double.NaN gives an odd result. The code succeeds and reports that a record has been added. However, the command Select * from Foo will not return records with these values. The values cannot be displayed by Casting to a numeric type, but output is produced by Casting to a varchar.

Using Select Count(*) from Foo will report the correct number of records.

|||

If you worked with MDX queries (analysis server), it is very common issue.

When you store the INDEFINITE/INFINTE value into your variable, compiler uses the special values to identify the INDEFINITE/INFINTE number/value,

1. 1.#IND (positive indefinite) or -1.#IND (negative indefinite)

2. 1.#INF (positive infinite)or -1.#INF (negative infinite)

Both are used to identify the compiler the variables hold the INDEFINITE/INFINTE, usally when you try to divide by zero it will be throw an error. But here you are forcefully stroring the infinite value. It is the compiler specific data. It can’t trusted across the compilers. So when you execute the code on your .NET framework will work fine. When try to pass this value into SQL Server it doesn’t understand, there is no specification available to identify the indefinite value in database.

Logically you can have nothing in your column (NULL/ZER0), but you are not allowed to store indefinite value in your column. Indefinite values are not comparable, these are imaginary values. But database store the masseurable values.

Solution:

Before passing the value to your database, validate the data, if it is INDEFINITE/INFINTE reset the value to NULL or other identifiable value for your future logics...

|||

I know you are a moderator, but as a general rule the thread creator should close threads. Being a moderator does not automatically make your information infallible.

As a matter of fact, on this issue you did not really add any additional information to the thread other than that you have experienced the issue as well.

I would now contend that this is, in fact, a bug in the SQL Server implementation. Either these values -- NaN, PositiveInfinity, and NegativeInfinity -- should be excluded from the domain of the Float datatype and rejected on Insert; or they are included in the domain and should be handled in a logical and consistent manner -- certainly they should have a display representation and they should have defined behavior in functions.

The current implementation is just a landmine waiting to go off at inopportune times.

|||

I marked the thread as 'answered'. As a 'general rule', we all take responsibitly to close a thread when there doesn't seem to be anymore 'interest' and/or it appears to be adequately 'answered'. And anyone can choose to 're-open' that thread by 'unmarking' the marked 'answer'. It's just a 'housekeeping' matter to help focus limited resources (the time and energy of volunteers) to the active and unanswered threads.

It appeared that Mani provided an adequate explanition for the behavior, and it seemed to me to be doubtful if there could be a 'solution' posted for your question. It appears that there to be a 'anomoly' involved in this situation -concerning a C++ interaction with the datastore, but I haven't been able to locate a declared 'bug' about the issue. (Doens't mean it's not there, just that I haven't devoted enough time to find it if it exists.) Mani's response satisfied the need to provide information to subsequent readers of the thread.

My mistake was in not encouraging you to post a 'potential' bug. I'll correct that now.

Please go to: http://connect.microsoft.com/sqlserver and post your observations and concerns. With everyone's keen interest and help to provide feedback and critique, the SQL Server will only continue to get better.

If you want this thread to stay 'open' to see if additional comments are forthcoming, that is perfectly fine.

As a postscript, let me add that there is often an issue of 'the' answer. We work on the concept of helping the questioner find a way to solve his/her problem. At some point, one has to say, ok, maybe there is a 'better' answer, but what is here now works, so lets move on.

|||

I have found more information on the issue:

First to clarify, the issue is not restricted to C++ interaction. The code I show above is C# and can be used to demonstrate the behavior when NaN or Infinity is stored in the database. Furthermore, I was surprised to find out that the data can be read out with a query from a C# program. My code and example were run against Sql Server 2000.

Looking on Microsoft Connections, I found an interesting item. It appears that MS has chosen to no longer support NaN and Infinity in the database in SQL Server 2005. In fact, a request had been made to restore the 2000 functionality. There is, at least, one organization which is using SQL Server to store engineering data and who has the need to occassionally store Infinities in the database. Evidently they are not performing any ad hoc queries or calculation within the database.

I plan to post a recommendation that MS embrace full support for the IEEE floating point numbers rather than simply disallowing them. It seems like the best, long term, behavior for the database and would be keeping in line with industry trends. Ideally, some form of patch would be made available for SQL 2000 to more gracefully handle NaN or Infinity when it is entered in the database.

|||Yes, this is a behavior change from SQL Server 2000. We now validate float values and Unicode data when you pass them via RPC calls or Bulk Copy APIs. So for float even though values like NaN and Infinity is allowed in the IEEE 754 spec (floating point implementation) we do not allow it now. Upon upgrade to SQL Server 2005 from older versions you need to modify the data in your tables to use them within TSQL. TSQL does not support any way to query these type of data or manipulate it explicitly - older SQL Server versions used to just allow them to be stored but you can only manipulate it on the client-side.|||

Is there a strong reason to not implement full support for the IEEE 754 spec? It specifies the rules for all operations involving the extra values. Some external representation of the values would be needed, but that has been solved for C# already.

It is a little surprising that it is not easier to support them than to prevent them. I assume that much of the Server software is written in a high level language that supports IEEE floats.

With competing products already supporting IEEE floats, the change seem inevitable.

Thanks for your input.

|||

I agree with your assessment. It is really unfortunate that we do not support it yet. Did you vote on the connect bug at https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=239674? I would also encourage anyone reading this thread to vote on this bug. The issue today is that several built-ins in SQL Server (those that work with floating point data) and other facilities cannot handle these values. I will follow-up with the owner of this bug and see if we can do anything about it in SQL Server 2008.

sql

No comments:

Post a Comment