Monday, March 26, 2012

Normalization Problem

Could someone have a look at this normalization problem for me? Specifally,
what rule is being broken by the first normalization attempt? I know it's
wrong, but why?
/*
DROP TABLE #farm_spreadsheet
DROP TABLE #farm_animals
DROP TABLE #farm_animals2
DROP TABLE #farms
*/
-- Original Farm Spreadsheet
CREATE TABLE #farm_spreadsheet ( farm_id INT, farm_name VARCHAR(50),
farm_animals VARCHAR( 500 ) )
SET NOCOUNT ON
INSERT INTO #farm_spreadsheet VALUES ( 1, 'Farmer Giles'' Farm', '2 Hens, 1
Goat' )
INSERT INTO #farm_spreadsheet VALUES ( 2, 'Jimmy''s Farm', '1 sick Goat' )
-- My lazy colleague wants to 'normalize' to this, but what rules is he
breaking?
CREATE TABLE #farms ( farm_id INT PRIMARY KEY, farm_name VARCHAR(50) NOT
NULL )
CREATE TABLE #farm_animals ( farm_id INT, farm_animal VARCHAR(50) NOT NULL,
animal_count INT )
INSERT INTO #farms
SELECT farm_id, farm_name
FROM #farm_spreadsheet
INSERT INTO #farm_animals VALUES ( 1, 'Hen', 2 )
INSERT INTO #farm_animals VALUES ( 1, 'Goat', 1 )
INSERT INTO #farm_animals VALUES ( 2, 'Goat', 1 )
SET NOCOUNT OFF
-- Try and reproduce original spreadsheet to prove lossless decomposition
SELECT f.farm_id, f.farm_name, fa.farm_animal, fa.animal_count
FROM #farms f
INNER JOIN #farm_animals fa ON f.farm_id = fa.farm_id
ORDER BY f.farm_id, fa.farm_animal
GO
-- I want to normalise to this, but is it any better?
CREATE TABLE #farm_animals2 ( farm_id INT, farm_animal VARCHAR(50) NOT NULL,
animal_status VARCHAR(10) )
SET NOCOUNT ON
INSERT INTO #farm_animals2 VALUES ( 1, 'Hen', 'OK' )
INSERT INTO #farm_animals2 VALUES ( 1, 'Hen', 'OK' )
INSERT INTO #farm_animals2 VALUES ( 1, 'Goat', 'OK' )
INSERT INTO #farm_animals2 VALUES ( 2, 'Goat', 'Sick' )
SET NOCOUNT OFF
-- Try and reproduce original spreadsheet to prove lossless decomposition
SELECT f.farm_id, f.farm_name, fa.farm_animal, COUNT(*) AS animal_count
FROM #farms f
INNER JOIN #farm_animals2 fa ON f.farm_id = fa.farm_id
GROUP BY f.farm_id, f.farm_name, fa.farm_animal
ORDER BY f.farm_id, fa.farm_animalHi Damien,
In simple terms, first normal form is "No repeating group of elements".
So, to my knowledge, the first approach using #farms and #farm_animals
agrees with the first normal form.
and the second approach with #farm_animals2 has repeating rows. And if you
ask me its in the -1 normal form :) No offence.
Hope this helps.|||Your "lazy" colleague is not breaking any normalization rules with the
#farm_animal table.
#farm_animals2, on the other hand, has no way to tell one row from
another. If each individual animal is going to be identified by an
individual row, you need some way to uniquely identify each of them.
Some sort of animal_id column, for example.
The most bothersome issue between the two alternatives is that they
have different information. It seems rather premature to be talking
about tables when you have not even decided on the facts, such as
animal_status, to be stored.
Roy Harvey
Beacon Falls, CT
On Wed, 26 Apr 2006 02:57:01 -0700, Damien
<Damien@.discussions.microsoft.com> wrote:

>Could someone have a look at this normalization problem for me? Specifally
,
>what rule is being broken by the first normalization attempt? I know it's
>wrong, but why?
>
>/*
>DROP TABLE #farm_spreadsheet
>DROP TABLE #farm_animals
>DROP TABLE #farm_animals2
>DROP TABLE #farms
>*/
>-- Original Farm Spreadsheet
>CREATE TABLE #farm_spreadsheet ( farm_id INT, farm_name VARCHAR(50),
>farm_animals VARCHAR( 500 ) )
>SET NOCOUNT ON
>INSERT INTO #farm_spreadsheet VALUES ( 1, 'Farmer Giles'' Farm', '2 Hens, 1
>Goat' )
>INSERT INTO #farm_spreadsheet VALUES ( 2, 'Jimmy''s Farm', '1 sick Goat' )
>
>-- My lazy colleague wants to 'normalize' to this, but what rules is he
>breaking?
>CREATE TABLE #farms ( farm_id INT PRIMARY KEY, farm_name VARCHAR(50) NOT
>NULL )
>CREATE TABLE #farm_animals ( farm_id INT, farm_animal VARCHAR(50) NOT NULL,
>animal_count INT )
>INSERT INTO #farms
>SELECT farm_id, farm_name
>FROM #farm_spreadsheet
>INSERT INTO #farm_animals VALUES ( 1, 'Hen', 2 )
>INSERT INTO #farm_animals VALUES ( 1, 'Goat', 1 )
>INSERT INTO #farm_animals VALUES ( 2, 'Goat', 1 )
>SET NOCOUNT OFF
>-- Try and reproduce original spreadsheet to prove lossless decomposition
>SELECT f.farm_id, f.farm_name, fa.farm_animal, fa.animal_count
>FROM #farms f
> INNER JOIN #farm_animals fa ON f.farm_id = fa.farm_id
>ORDER BY f.farm_id, fa.farm_animal
>GO
>
>-- I want to normalise to this, but is it any better?
>CREATE TABLE #farm_animals2 ( farm_id INT, farm_animal VARCHAR(50) NOT NULL
,
>animal_status VARCHAR(10) )
>SET NOCOUNT ON
>INSERT INTO #farm_animals2 VALUES ( 1, 'Hen', 'OK' )
>INSERT INTO #farm_animals2 VALUES ( 1, 'Hen', 'OK' )
>INSERT INTO #farm_animals2 VALUES ( 1, 'Goat', 'OK' )
>INSERT INTO #farm_animals2 VALUES ( 2, 'Goat', 'Sick' )
>
>SET NOCOUNT OFF
>
>-- Try and reproduce original spreadsheet to prove lossless decomposition
>SELECT f.farm_id, f.farm_name, fa.farm_animal, COUNT(*) AS animal_count
>FROM #farms f
> INNER JOIN #farm_animals2 fa ON f.farm_id = fa.farm_id
>GROUP BY f.farm_id, f.farm_name, fa.farm_animal
>ORDER BY f.farm_id, fa.farm_animal|||How about :
CREATE TABLE dbo.farms (
farm_id int PRIMARY KEY ,
farm_name varchar (50)
)
CREATE TABLE dbo.farm_animals3 (
farm_id int ,
Animal_ID int
PRIMARY KEY (farm_id, Animal_ID) ,
animal_type varchar (50) ,
animal_status varchar (10)
CONSTRAINT FK_farm_animals3_farms FOREIGN KEY
(farm_id) REFERENCES dbo.farms (farm_id)
)
Roy makes a good point that until you have decided exactly which data you
are storing, you can't really begin working on the normalization.
"Damien" <Damien@.discussions.microsoft.com> wrote in message
news:DBC8BEC8-E9A7-4C0D-BF4A-4596433B6212@.microsoft.com...
> Could someone have a look at this normalization problem for me?
Specifally,
> what rule is being broken by the first normalization attempt? I know it's
> wrong, but why?
>
> /*
> DROP TABLE #farm_spreadsheet
> DROP TABLE #farm_animals
> DROP TABLE #farm_animals2
> DROP TABLE #farms
> */
> -- Original Farm Spreadsheet
> CREATE TABLE #farm_spreadsheet ( farm_id INT, farm_name VARCHAR(50),
> farm_animals VARCHAR( 500 ) )
> SET NOCOUNT ON
> INSERT INTO #farm_spreadsheet VALUES ( 1, 'Farmer Giles'' Farm', '2 Hens,
1
> Goat' )
> INSERT INTO #farm_spreadsheet VALUES ( 2, 'Jimmy''s Farm', '1 sick Goat' )
>
> -- My lazy colleague wants to 'normalize' to this, but what rules is he
> breaking?
> CREATE TABLE #farms ( farm_id INT PRIMARY KEY, farm_name VARCHAR(50) NOT
> NULL )
> CREATE TABLE #farm_animals ( farm_id INT, farm_animal VARCHAR(50) NOT
NULL,
> animal_count INT )
> INSERT INTO #farms
> SELECT farm_id, farm_name
> FROM #farm_spreadsheet
> INSERT INTO #farm_animals VALUES ( 1, 'Hen', 2 )
> INSERT INTO #farm_animals VALUES ( 1, 'Goat', 1 )
> INSERT INTO #farm_animals VALUES ( 2, 'Goat', 1 )
> SET NOCOUNT OFF
> -- Try and reproduce original spreadsheet to prove lossless decomposition
> SELECT f.farm_id, f.farm_name, fa.farm_animal, fa.animal_count
> FROM #farms f
> INNER JOIN #farm_animals fa ON f.farm_id = fa.farm_id
> ORDER BY f.farm_id, fa.farm_animal
> GO
>
> -- I want to normalise to this, but is it any better?
> CREATE TABLE #farm_animals2 ( farm_id INT, farm_animal VARCHAR(50) NOT
NULL,
> animal_status VARCHAR(10) )
> SET NOCOUNT ON
> INSERT INTO #farm_animals2 VALUES ( 1, 'Hen', 'OK' )
> INSERT INTO #farm_animals2 VALUES ( 1, 'Hen', 'OK' )
> INSERT INTO #farm_animals2 VALUES ( 1, 'Goat', 'OK' )
> INSERT INTO #farm_animals2 VALUES ( 2, 'Goat', 'Sick' )
>
> SET NOCOUNT OFF
>
> -- Try and reproduce original spreadsheet to prove lossless decomposition
> SELECT f.farm_id, f.farm_name, fa.farm_animal, COUNT(*) AS animal_count
> FROM #farms f
> INNER JOIN #farm_animals2 fa ON f.farm_id = fa.farm_id
> GROUP BY f.farm_id, f.farm_name, fa.farm_animal
> ORDER BY f.farm_id, fa.farm_animal|||>> The most bothersome issue between the two alternatives is that they have
Very true. Stated simply, with limited or no knowledge of the underlying
business model, attempting to normalize some schema would be mostly
meaningless.
Anith|||
In your example you have created a column that is a multivalued fact about
the key and thus violates 2nd normal form.

No comments:

Post a Comment