Monday, March 26, 2012

Normalization question

I was asked by a client how I would normalize some data that they had. An
example follows:
XIBMH002602382,XIBMH005,D:\005\00002382.TIF,
XIBMH002602383,XIBMH005,D:\005\00002383.TIF,
XIBMH002602384,XIBMH005,D:\005\00002384.TIF,
XIBMH002602385,XIBMH005,D:\005\00002385.TIF,
XIBMH002700001,XIBMH005,D:\005\00002386.TIF,Y
XIBMH002700002,XIBMH005,D:\005\00002387.TIF,
XIBMH002700003,XIBMH005,D:\005\00002388.TIF,
XIBMH002700004,XIBMH005,D:\005\00002389.TIF,
XIBMH002700005,XIBMH005,D:\005\00002390.TIF,
XIBMH002700006,XIBMH005,D:\005\00002391.TIF,
XIBMH002700007,XIBMH005,D:\005\00002392.TIF,
Where:
a. the first piece of data (e.g. XIBMH002602382) is the control
number identifying a scanned image.
b. the second piece of data (e.g. XIBMH005) is the volume name
for the CD the images are stored on.
c. the third piece of data (e.g. D:\005\00002382.TIF) is the
fully qualified filename of the image file.
d. the fourth piece of data (which is Y if it is present) says
whether this image is the first page of a multiple-page document.
Would you put the Volume name in another table with a keyed field and maybe
the Drive and Folder into another, or should the data be kept into one
table?
Just asking for your thoughts on what you would do with this data
ThanksI assume that the table is modeling a filename. The only concern I see is
if the directory name is functionally dependent on the volume name , it
would be better to have two tables:
file
===
control_number (pk)
--
volume (foreign key to volume table)
filename
first_page_of_multipage
volume
=====
volume (pk)
--
root directory
What isn't clear is the multipage thing:

> XIBMH002700001,XIBMH005,D:\005\00002386.TIF,Y
> XIBMH002700002,XIBMH005,D:\005\00002387.TIF,
>
Dos this mean there are multiple pages stored in the tif? If so, a better
way to do this would be to have an attribute of number_of_pages, but the yes
or no thing will work. If the pages are different files, then I would think
storing these filenames in the table would be better.
----
Louis Davidson - drsql@.hotmail.com
SQL Server MVP
Compass Technology Management - www.compass.net
Pro SQL Server 2000 Database Design -
http://www.apress.com/book/bookDisplay.html?bID=266
Note: Please reply to the newsgroups only unless you are interested in
consulting services. All other replies may be ignored :)
"Gary Paris" <garyparis@.yada.com> wrote in message
news:%23qmK7zlCFHA.4072@.TK2MSFTNGP10.phx.gbl...
>I was asked by a client how I would normalize some data that they had. An
>example follows:
> XIBMH002602382,XIBMH005,D:\005\00002382.TIF,
> XIBMH002602383,XIBMH005,D:\005\00002383.TIF,
> XIBMH002602384,XIBMH005,D:\005\00002384.TIF,
> XIBMH002602385,XIBMH005,D:\005\00002385.TIF,
> XIBMH002700001,XIBMH005,D:\005\00002386.TIF,Y
> XIBMH002700002,XIBMH005,D:\005\00002387.TIF,
> XIBMH002700003,XIBMH005,D:\005\00002388.TIF,
> XIBMH002700004,XIBMH005,D:\005\00002389.TIF,
> XIBMH002700005,XIBMH005,D:\005\00002390.TIF,
> XIBMH002700006,XIBMH005,D:\005\00002391.TIF,
> XIBMH002700007,XIBMH005,D:\005\00002392.TIF,
>
>
> Where:
> a. the first piece of data (e.g. XIBMH002602382) is the
> control number identifying a scanned image.
> b. the second piece of data (e.g. XIBMH005) is the volume name
> for the CD the images are stored on.
> c. the third piece of data (e.g. D:\005\00002382.TIF) is the
> fully qualified filename of the image file.
> d. the fourth piece of data (which is Y if it is present) says
> whether this image is the first page of a multiple-page document.
>
>
> Would you put the Volume name in another table with a keyed field and
> maybe the Drive and Folder into another, or should the data be kept into
> one table?
>
> Just asking for your thoughts on what you would do with this data
>
> Thanks
>
>|||>> I was asked by a client how I would normalize some data that they had.
If this data is in a table with three columns, I would suggest you ask your
client why he considers this data not normalized. You should have the clue
right then.
It all depends on how the business model distingushes one set of attributes
as a single fact from another. Do you have a need to update/insert/delete a
volume name without affecting the drive and folder information or vice
versa? Based on the consideration that a volume name determines the
directory, Louis' suggestion is a reasonable one.
Anithsql

No comments:

Post a Comment