• SQL Server 2000 problem

    #414069

    I am trying to import data from a CSV file to a SQL server. The data imports fine, except that when I run the import again, the data is copied into the table again instead of the existing data in the table being updated. The data in the CSV file will be updated every day and it may or may not have new or updated data in it. I can't figure out how to make the data on the server the updated, most recent version of the CSV file.

    • #917771

      I take it you’re using a DTS package in conjunction with a SQL Job to perform the transfer. (If you’re not, you might consider doing it this way…)
      The default action for this import is to append to the destination table. You’ll need to add a task to the DTS package to delete the destination data before the import.

      To delete the data from the SQL table before importing the CSV file, simply add an “Execute SQL Task” to the DTS designer. Enter the desired SQL statement (e.g. “DELETE FROM <destination table>”, substituting your table name). Make sure you have the appropriate SQL Connection selected, then click OK.
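
      For example, a minimal statement for that task might look like the following ([Employees] is just a placeholder – use your actual destination table name):

      -- Clear out the previous day's rows so the import starts from an empty table
      DELETE FROM [Employees]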

      To make sure this statement completes successfully BEFORE the import begins, select the newly created SQL task icon, then CTRL+Select the first step of the CSV import process (most likely a Text File Source icon). Right-click the CSV icon and select Workflow | On Success.

      For more information see the following SQL-Books-Online topics:
      -Execute SQL Task
      -Creating DTS Package Workflow and Tasks

      Hope this helps

      • #917775

        Mark,

        I am trying to do it with DTS. The data in the CSV file will have the Last Name, First Name, Title, Phone Number, Extension, and Address. Each has its own column. The last name and first name need to be used as the primary key. If the name changes, then it will be a new row. However, if the title, phone number, extension, or address change, I need to replace the old data with the updated data. I have been trying to do it by inserting the CSV file into a temp table and then searching the data in the temp table to find the new data, but I am not having any luck. Any help is appreciated.

        Thanks,
        Nick

        • #917781

          So it sounds like your CSV file does not contain the complete table each time – just changes and additions, right?

          Your approach sounds fine – import the CSV data to a temp table then process. You’ll probably want to enlist the help of a Stored Procedure for the processing. I would normally avoid using a Cursor in T-SQL when possible, but it seems that this task would be appropriate for a cursor. Here’s the pseudo-code:

          Declare a cursor with all fields
          Open the cursor
          Move to the first record
          While there are still records in the cursor
              Search for an existing record with the first/last names
              If it exists, update it with the new data
              Otherwise insert a new record
              Move to the next record
          End While
          Deallocate the cursor

          You’ll want to double-check the exact process in BooksOnline (see “DECLARE CURSOR”), but that should get the job done for you.

          Post back if you need any further assistance.

          • #917786

            How would you do it if you didn't use a cursor? Through some research, I have learned that a cursor is not the easiest way to go about things. Maybe I am wrong and I do need to use a cursor?

            • #917788

              I wouldn’t say that cursors aren’t the “easiest” but they’re certainly not the best in terms of performance. Running a cursor through a large table (100,000+ records) in a web application would cause some less-than-desired performance. However, using cursors for an import process (where extra seconds or even minutes won’t make a difference) is perfectly appropriate.

              Now that you mention it, you could also use some non-cursor code for inserting the new records, along these lines (PermanentTable and the column names are placeholders – substitute your destination table and actual column names):

              INSERT INTO PermanentTable (FirstName, LastName, Title, PhoneNumber, Extension, Address)
              SELECT t.FirstName, t.LastName, t.Title, t.PhoneNumber, t.Extension, t.Address
              FROM PermanentTable p
              RIGHT JOIN temp t ON p.FirstName = t.FirstName AND p.LastName = t.LastName
              WHERE p.FirstName IS NULL AND p.LastName IS NULL

              However, you still need a way to iterate all (existing) records to perform an update. At the moment, a cursor is the best method that comes to mind.

              As an afterthought, you could also use a Trigger on the temp table, although I try to avoid triggers since the logic is more difficult to track and maintain than stored procedures. (I'm a developer – not a DBA)

            • #917792

              If I were to use a cursor, can you give me an example of the code used to declare the cursor? I see the pseudo-code listed above, but I am not certain how to declare the cursor. Would it be something like this:

              DECLARE temp_cursor CURSOR
              FOR SELECT * FROM temp
              OPEN temp_cursor
              FETCH NEXT FROM temp_cursor

              If this is correct, I am not sure where to go from here. Thanks again

            • #917807

              I use cursors so infrequently that I have to look at SQL BooksOnline for help each time! From the syntax you included, I can see that you found either the BOL help topic or some other resource that discusses this process. The only change you’ll need to make is adding a variable for each field – like this:

              DECLARE @FirstName varchar(50)
              DECLARE @LastName varchar(50)
              DECLARE @Title varchar(50)
              --etc…

              DECLARE temp_cursor CURSOR
              FOR SELECT * FROM temp

              OPEN temp_cursor

              FETCH NEXT FROM temp_cursor
              INTO @FirstName, @LastName, @Title --, etc…

              WHILE @@FETCH_STATUS = 0
              BEGIN
                  --Declare variable for record count
                  DECLARE @Count int
                  --Get count of whether record exists in permanent table or not
                  --(<PermanentTable> is a placeholder for your destination table)
                  SELECT @Count = COUNT(*) FROM <PermanentTable> WHERE FirstName = @FirstName AND LastName = @LastName
                  --If count > 0, record already exists
                  IF @Count > 0 --Record already exists, perform update
                      UPDATE <PermanentTable>
                      SET Title = @Title --, etc…
                      WHERE LastName = @LastName AND FirstName = @FirstName
                  ELSE --Record does not exist, perform insert
                      INSERT INTO <PermanentTable> (FirstName, LastName, Title) --, etc…
                      VALUES (@FirstName, @LastName, @Title) --, etc…

                  --Get next record
                  FETCH NEXT FROM temp_cursor
                  INTO @FirstName, @LastName, @Title --, etc…

              END

              CLOSE temp_cursor
              DEALLOCATE temp_cursor

              That's basically it… You'll need to complete the various statements and add the appropriate variables for the remaining field names.

              Post back if that doesn’t do the trick.

            • #917819

              Stupid question…. Do you use this as a stored procedure or a local package?

            • #917824

              I would create a stored procedure, then execute the stored procedure from the DTS package – but that’s simply a suggestion. I find it easier to maintain the logic if it’s easy to find and reuse. If the code is entered directly into a SQL Task in the DTS package, you can’t reuse it for anything. But if it’s in a Stored Procedure it can be used by multiple DTS packages, etc.
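
              As a rough sketch (usp_ImportEmployees is just a made-up name – use whatever fits your naming conventions):

              CREATE PROCEDURE usp_ImportEmployees
              AS
              BEGIN
                  -- paste the cursor-based update/insert logic from the earlier post here
                  SELECT 1 -- placeholder so the procedure compiles until the real logic is added
              END
              GO

              The Execute SQL Task in the DTS package then only needs a single line:

              EXEC usp_ImportEmployees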

            • #917826

              Here are my steps:

              Import the CSV file to the temp table
              Then parse through the data with the above code
              Then delete the temp table when I am done.

              Does this sound correct?
              Thanks

            • #917836

              That sounds like it will work. The only thing I would do differently is purge your temp table at the beginning rather than the end of the process. This will allow you to keep the temp records after the import in case you need to reference them for any reason.

              Also, you may consider adding a date/time stamp field to the import table, if one is not already there. This will allow you to determine when the record was updated.
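
              For example (LastUpdated is just a suggested column name; substitute your actual table name for the placeholder):

              -- Add a date/time stamp column that defaults to the current time when a row is inserted
              ALTER TABLE <PermanentTable> ADD LastUpdated datetime DEFAULT GETDATE()

              In the cursor's UPDATE branch you would also set LastUpdated = GETDATE() explicitly, since the default only applies to newly inserted rows.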

              It looks like you're on the right track!

            • #924336

              My name’s Chris, I am the engineer that is actually installing this SQL program – dubbs11 was doing the preliminary groundwork since I’m short on time.

              Everything seems to be working except for the most simple (I think) of issues, but I can’t find an answer. HOW DO I REFERENCE ONE DATABASE FROM THE OTHER?

              We have the DTS working properly – dropping and re-importing the CSV file each time. It’s coming into database TEMP, table cardemp. The database that it’s going into is WINPAK2, table cardholder. Here’s the code for cursoring and sorting:

              DECLARE @FirstName varchar(30)
              DECLARE @LastName varchar(30)
              DECLARE @Note1 varchar(64) --Title
              DECLARE @Note2 varchar(64) --JobCode
              DECLARE @Note3 varchar(64) --WorkArea
              DECLARE @Note4 varchar(64) --Department
              DECLARE @Note5 varchar(64) --HireDate
              DECLARE @Note6 varchar(64) --EmployeeID

              DECLARE Temp_Cursor cursor
              FOR SELECT * FROM [temp].[cardemp]

              OPEN Temp_Cursor

              FETCH NEXT FROM Temp_Cursor
              INTO @FirstName, @LastName, @Note1, @Note2, @Note3, @Note4, @Note5, @Note6

              WHILE @@FETCH_STATUS = 0
              BEGIN

              DECLARE @Count int

              SELECT @Count = Count(*) FROM [WINPAK2].[CardHolder] --FIRST ERROR HAPPENS HERE
              WHERE FirstName = @FirstName AND Note5 = @Note5 AND Note6 = @Note6

              IF @Count > 0
              UPDATE [WINPAK2].[CardHolder]
              SET LastName = @LastName, Note1 = @Note1, Note2 = @Note2, Note3 = @Note3, Note4 = @Note4
              WHERE FirstName = @FirstName AND Note5 = @Note5 AND Note6 = @Note6
              ELSE
              INSERT INTO [WINPAK2].[CardHolder] (FirstName, LastName, Note1, Note2, Note3, Note4, Note5, Note6)
              Values (@FirstName, @LastName, @Note1, @Note2, @Note3, @Note4, @Note5, @Note6)

              FETCH NEXT FROM Temp_Cursor
              INTO @FirstName, @LastName, @Note1, @Note2, @Note3, @Note4, @Note5, @Note6
              END

              CLOSE Temp_Cursor
              DEALLOCATE Temp_Cursor

              Running this code from the DTS gives errors for each reference of [WINPAK2].[CardHolder], but running it from a Stored Procedure in WINPAK2 gives an error for referencing [temp].[cardemp]. I can’t seem to find anywhere what the correct syntax is for referencing other database tables! Please Help.

            • #924343

              You’re close…

              To reference another database on the same SQL Server instance, use 3-part naming: [database].[owner].[table]
              (e.g. [WINPAK2].[dbo].[CardHolder])
              To reference a database on another SQL Server instance, use 4-part naming: [sqlserver].[database].[owner].[table]
              (e.g. [SQLENT1].[Pubs].[dbo].[Authors])

              As a shortcut you can leave out the owner and keep only the dots – e.g. [WINPAK2]..[CardHolder]. This will use the default owner (I think).

              I did a quick search in SQL Books Online and did not find this spelled out, but I'm sure it's in there…
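
              Applied to your code above, and assuming both tables are owned by dbo (adjust the owner if yours differ), only these four lines change – the rest of each statement stays the same:

              FOR SELECT * FROM [TEMP].[dbo].[cardemp]
              SELECT @Count = Count(*) FROM [WINPAK2].[dbo].[CardHolder]
              UPDATE [WINPAK2].[dbo].[CardHolder]
              INSERT INTO [WINPAK2].[dbo].[CardHolder] (FirstName, LastName, Note1, Note2, Note3, Note4, Note5, Note6)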

            • #924878

              Mark,

              That was the trick. Sorry I didn’t reply sooner, but I actually found an example that showed me the same answer, then I was offline testing and installing it. It works really sweet – one line at a time, updating or adding each record individually. They’re exporting the entire database each night across the network to a CSV file, then my DTS is running every morning at 1am to update the other database. There’s about 4,500 employees right now, so there’s a big time savings in typing!

            • #969449

              Before the import ran one night, a relationship got changed in the compare portion of the local package. The CSV file was imported as normal, but the FirstName was imported as the LastName and the LastName was imported as the FirstName. I am trying to write some code to search through the database and find the records where the LastName = FirstName, the FirstName = LastName, Emp# = Emp#, and the date is null, and delete those records. Here is the code I have so far:

              DECLARE @Emp# varchar(64)
              DECLARE @Date varchar(8)
              DECLARE @LastName varchar(30)
              DECLARE @FirstName varchar(30)

              DECLARE Temp_Cursor cursor
              FOR SELECT * FROM [Test].[dbo].[TestTable]

              OPEN Temp_Cursor

              FETCH NEXT FROM Temp_Cursor
              INTO @Date, @LastName, @FirstName, @Emp#

              WHILE @@FETCH_STATUS = 0
              BEGIN

              DECLARE @Count int

              SELECT @Count = Count(*) FROM [Test].[dbo].[TestTable]
              WHERE Emp# = @Emp# And FirstName = @LastName AND LastName = @FirstName And Date = null

              IF @Count > 0
              DELETE [Test].[dbo].[TestTable]
              WHERE Emp# = @Emp# And FirstName = @LastName AND LastName = @FirstName And Date = null
              ELSE

              FETCH NEXT FROM Temp_Cursor
              INTO @Date, @LastName, @FirstName, @Emp#
              END

              CLOSE Temp_Cursor
              DEALLOCATE Temp_Cursor
              GO

              The file runs without any errors, but does not delete the duplicate entry I have in my table. Any help is appreciated.

              Thanks,
              Nick

            • #969482

              Hi,
              Try changing:
              Date = null
              to:
              [Date] IS NULL

              I wouldn’t recommend using Date as a column name, incidentally.
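
              For example, the count query in your loop would become something like this (assuming the column really is named Date – the same change applies to the WHERE clause of the DELETE statement):

              SELECT @Count = Count(*) FROM [Test].[dbo].[TestTable]
              WHERE Emp# = @Emp# AND FirstName = @LastName AND LastName = @FirstName AND [Date] IS NULL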
