Wednesday, March 28, 2012

normalizing using sql commands

Hi

Please can someone point me in the direction, i built a very badly designed database consisting of only one huge table when i first started databases, since learning about normalization i have designed and set up a new database which consists of many more tables instead of just the one. My question is where do i start in transfering the data from the old single tabled database to my new multi-tabled database?

I have MS SQL server 2005 managment studio if that helps, but want to transfer around 200,000 rows of data into the new database. Both new and old databases are on the same server.

thanks in advancewith out some idea of the target and source schemas this is a broad question.

I would add a column to each of your tables for the purpose of the transfer that holds the primary key of your source table. This will allow you to establish your relationships while populating your tables. Remember to populate your parent tables before you populate your child tables so as not to violate your foriegn keys.

have fun.|||You will also find that the number of records is pretty well irrelevent - it is the number of tables in the newly normalised db that will more accurately determine your required effort.

If this is a one off then you need to work with copies of your databases and write one script at a time to normalise the data and populate the tables. As Sean says, start with the Parents. You just have to write a lot of scripts starting with "SELECT DISTINCT..." - no real shortcuts. Once you have written (and tested) all the required scripts you need to take your db off line, run them and then put your newly normalised db (you have changed the sproc interface\ application too yes?) on line.

To echo Sean - have fun :)|||hi guys

thanks for the advice! i'm really still at the starting blocks, how do i actually transfer the data?

I'm sure it will be a mixture of SELECT and INSERT statements, but can someone give me a simple example of selecting a tables data and inserting it into another using only sql?

cheers|||INSERT INTO OneTable (Col1, Col2)
SELECT ColA, ColB
FROM AnotherTable

Check Books Online and familiarise yourself with DML SQL.|||thanks pootle - i'm a noobie to booksonline, don't think i've used it before - i've just done a google search and downloaded the following:

link (http://www.microsoft.com/downloads/details.aspx?FamilyId=BE6A2C5D-00DF-4220-B133-29C1E0B6585F&displaylang=en)

Is this what you mean?|||AKA BoL BTW.

That's the chappy although that is the 2005 version. If you are using 2K then you can access it via Enterprise Manager:
Action-> Help
or Query Analyser:
Help-> Transact SQL Help

Read, absorb, learn - it is excellent.

A more gentle introduction to SQL:
http://www.w3schools.com/sql/default.asp|||Sorry, back again with another question!

I can easily move data around now using your simple example and BOL, how ever its a little more complicated as need to take newly created Foreign keys into account when transfering from old database (single table) to new database (multiple-tables). The following is the SQL i use to insert into my new database, i basically need to somehow get this to do the same when transfering data from my old database.

eg:
BEGIN TRANSACTION

DECLARE @.new_customer_id int

INSERT INTO CUSTOMER(name, email)
VALUES('Joe Bloggs', 'joe@.bloggs.com')

SET @.new_customer_id = @.@.IDENTITY

INSERT INTO VENUE(customer_id, business_name, address1)
VALUES(@.new_customer_id, 'bloggs corp', 'london')

COMMIT TRANSACTION

I'm guessing that i may need to use a WHILE loop (or is there an easier option?), but either i don't know how to use BOL properly or i can't seem to find any good examples on it!

thanks (again! ;) )|||Well - you could join your data using natural keys and propogate the surrogate (Identity) values as sets rather than looping. You defo want to avoid doing this in while loops - sets of data are the way to go.

For example - imagining that your surname\ forename combinations are unique (YOU NEED TO TEST THIS FIRST! - if they are not unique you need to either find a combination of fields that can serve as a natural key or eliminate the duplicates and handle those later on a case by case basis) and so will do as a natural key for the normalisation exercise...

Insert ALL your customers into the customers table in one query.

Link your newly populate customers table to the customers child data via surname and forename and include the identity value in your queries. You can now insert these to the child entities in one fell swoop rather than looping and inserting absolutes (like you are now).|||hi

Not sure if i confused things in the last post, i meant Primary not Foreign key.

Right, read your post about 10 times, think i've got it, or a similiar alternative!

Basically, i have a customers table and a venue table. Each venue has one customer, but one customer can have many venues, hence normalisation required me to split these into two seperate fileds.

I gave each customer a primary key, and also each venue a primary key.

My objective is to have the the customer id (auto incremented primary key from customers table) inserted/matched into the venues table so i will be able to tell which customer relates to which venue.

Link your newly populate customers table to the customers child data via surname and forename and include the identity value in your queries.

ok, so can copy over all the customer details to the customer table, and the venue details into the venue table, but I can insert the the venue_id field in both the customers and the venue table so they will have that in common.

This is my 'SQL think out loud i know its wrong' way of thinking...

INSERT INTO VENUE(customer_id)
SELECT customer_id FROM customer WHERE venue.venue_id = customer.venue_id

i'm thinking i need some type of join in there maybe?|||You want to check those SQL links again :) Your current route requires a UPDATE not an INSERT.

In case I didn't make much sense earlier I meant something along these lines:

INSERT INTO NewCustomers(ForeName, Surname, OtherCols)
SELECT DISTINCT ForeName, Surname, OtherCols FROM BigTable

INSERT INTO NewVenue(Venue_Name, Cusomer_ID, OtherCols)
SELECT DISTINCT BigTable.Venue_Name, NewCustomers.Customer_ID, BigTable.OtherCols
FROM BigTable INNER JOIN NewCustomers ON
NewCustomers.Surname = BigTable.Surname
AND NewCustomers .Forename = BigTable .Forename
To repeat - I am only flagging up surname and forename as possible join columns. You know your data - you may well have something much more appropriate.

HTH|||thanks flump, big hi-5 from the other side of the pennines! :D

No comments:

Post a Comment