I am having 2 database in my sql server with each database containing 1 single table as of now.
I have 2 database like below :
1) Db1(Mysql)
2) Db2(Oracle)
Now what i want to do is Fill my database table of sql server db1 with data from Db1 from Mysql like below :
Insert into Table1 select * from Table1
Select * from Table1(Mysql Db1) - Data coming from Mysql database
Insert into Table1(Sql server Db1) - Insert data coming from Mysql database considering same schema
I dont want to use sqlbulk copy as i dont want to insert chunk by chunk data.I want to insert all data in 1 go considering millions of data as my operation is just not limited to insert records in database.So user have to sit wait for a long like first millions of data inserting chunk by chunk in database and then again for my further operation which is also long running operation.
So if i have this process speed up then i can have my second operation also speed up considering all records are in my 1 local sql server instance.
Can anybody please show me if this is possible then how to achieve this in c# application?
Update: I researched about Linked server as @GorDon Linoff suggested me that linked server can be use to achieve this scenario but based on my research it seems like i cannot create linked server through code.
I want to do this with the help of ado.net.
This is what i am trying to do exactly :
Consider i have 2 different client RDBMS with 2 database and some tables in client premises.
So database is like this :
Sql Server : Db1 Order Id Amount 1 100 2 200 3 300 4 400 Mysql or Oracle : Db1: Order Id Amount 1 1000 2 2000 3 3000 4 400
Now i want to compare Amount column from source(SQl server) to destination database (Mysql or oracle).
Id will be use to join this 2 different RDBMS databases tables to compare Amount columns.
In c# what i can do is like fetch chunk by chunk records in my datatable(in memory) then compare this records with the help of code but this will take so much time considering millions of records.
So i want to do something better than this.
Hence i was thinking that i bring out this 2 RDBMS records in my local sql server instance in 2 databases and then create join query joining this 2 tables based on Id and then take advantage of DBMS processing capability which can compare this millions of records efficiently.
Query like this compares millions of records efficiently :
select SqlServer.Id,Mysql.Id,SqlServer.Amount,Mysql.Amount from SqlServerDb.dbo.Order as SqlServer Left join MysqlDb.dbo.Order as Mysql on SqlServer.Id=Mysql.Id where SqlServer.Amount != Mysql.Amount
Above query works when i have this 2 different RDBMS data in my local server instance with database : SqlServerDb and MysqlDb and this will fetch below records whose amount is not matching :
So I am trying to get those records from source(Sql server Db) to Mysql whose Amount column value is not matching.
Expected Output :
Id Amount 1 1000 2 2000 3 3000
So there is no way to achieve this scenario?
6 Answers
Answers 1
On the SELECT
side, create a .csv
file (tab-delimited) using SELECT ... INTO OUTFILE ...
On the INSERT
side, use LOAD DATA INFILE ...
(or whatever the target machine syntax is).
Doing it all at once may be easier to code than chunking, and may (or may not) be faster running.
Answers 2
What about adding a changed date in the remote database.
Then you could get all rows that have changed since the last sync and just compare those?
Answers 3
First of all do not use linked server. It is tempting but it will more trouble than it is bringing on the table. Like updates and inserts will fetch all of the target db to source db and do insert/update and post all data to target back.
As far as I understand you are trying to copy changed data to target system for some stuff.
I recommend using a timestamp column on source table. When anything changes on source table timestamp column is updated by sql server.
On target, get max ID and max timestamp. two queries at max.
On source, rows where source.ID <= target.MaxID && source.timestamp >= target.MaxTimeTamp
is true, are the rows that changed after last sync (need update). And rows where source.ID > target.MaxID
is true, are the rows that are inserted after last sync.
Now you do not have to compare two worlds, and you just got all updates and inserts.
Answers 4
You need to create a linked server connection using ODBC and the proper driver, after that you can execute the queries using openquery.
Take a look at openquery:
https://msdn.microsoft.com/en-us/library/ms188427(v=sql.120).aspx
Answers 5
SqlBulkCopy
can accept either a DataTable
or a System.Data.IDataReader
as its input.
Using your query to read the source DB, set up a ADO.Net DataReader
on the source MySQL or Oracle DB and pass the reader to the WriteToServer()
method of the SqlBulkCopy
.
This can copy almost any number of rows without limit. I have copied hundreds of millions of rows using the data reader approach.
Answers 6
Yes, SQL Server is very efficient when it's working with sets so let's keep that in play.
In a nutshell, what I'm pitching is
Load data from the source to a staging table on the target database (staging table = table to temporarily hold raw data from the source table, same structure as the source table... add tracking columns to taste). This will be done by your C# code... select from source_table into DataTable then SqlBulkCopy to the staging table.
Have a stored proc on the target database to reconcile the data between your target table and the staging table. Your C# code calls the stored proc.
Given that you're talking about millions of rows, another thing that can make things faster is dropping indices on the staging table before inserting to it and recreating those after the inserts and before any select is performed.
0 comments:
Post a Comment