Reputation: 129
Problem: My team is currently undergoing an ERP migration from an ECC system to a new S/4HANA system. As part of go-live, our team needs to replicate all of the tables out of the S/4 system and into our SLT schema, which will host the data. The majority of the tables will be handled by SLT replication out of SAP. But, due to tight timelines, we have identified 4 tables that would require multiple days of replication. The idea is to copy the existing data out of the remote source (ABAP/SDA) and place it in our SLT schema. Once that is accomplished, we can activate point-forward replication and allow all new or modified records to be updated via SLT replication.
Approaches Attempted: Our current approach is to establish an SDA connection to the backend S/4 database and break the data down by year, inserting it into our local table using a stored procedure. A number of issues have arisen with this approach, but it is currently working. It's just extremely slow.
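For reference, a minimal sketch of the kind of SDA setup this assumes (the remote source name "S4_REMOTE" and the virtual table name are placeholders; the actual SDA connection already exists in our landscape):

-- Hypothetical virtual table in the local SLT schema pointing at the remote A_TBL.
-- "S4_REMOTE" stands in for the existing SDA remote source to the S/4 database.
CREATE VIRTUAL TABLE B_TBL_SCHEMA.VT_A_TBL
    AT "S4_REMOTE"."<NULL>"."A_TBL_SCHEMA"."A_TBL";
-- The procedure below reads from A_TBL_SCHEMA.A_TBL, which in this setup
-- would be exposed through such virtual tables, and inserts into the local target.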
Questions for the forum:
Example: Let's pretend we have a source table named: A_tbl
Then we will have our target table: B_tbl
Current Procedure:
CREATE OR REPLACE procedure LOAD_B_TBL_FROM_A_TBL ()
as
begin
declare v_offset_nbr integer;
declare v_record_count integer;
declare v_commit_count integer;
declare i integer;
declare v_year nvarchar(4);
declare v_record_per_commit_count CONSTANT INT = 1000000;
declare v_table_name CONSTANT NVARCHAR(30) = 'A_TBL';
declare v_start_year CONSTANT INT = 2011;
declare v_end_year CONSTANT INT = 2022;
declare year_nbr integer;
for year_nbr in v_start_year..v_end_year do
select IfNull(max(offset_nbr),0) into v_offset_nbr from B_TBL_SCHEMA.bulk_load_log where table_name = :v_table_name AND year_nbr = to_varchar(year_nbr); -- Get offset number of records
select count(*) into v_record_count from A_TBL_SCHEMA.A_TBL A_TBL WHERE A_TBL.YEAR = to_varchar(year_nbr); -- Count the source records.
v_record_count = v_record_count - v_offset_nbr; -- Subtract out the records already committed for the current year. Failsafe if procedure fails
v_commit_count = v_record_count / v_record_per_commit_count; -- Number of times we need to loop
IF v_record_count < v_record_per_commit_count THEN -- Don't enter the loop if it's not necessary
INSERT INTO B_TBL_SCHEMA.B_TBL (
SELECT * FROM A_TBL_SCHEMA.A_TBL
WHERE A_TBL.YEAR = to_varchar(year_nbr)
); -- Insert into our target table
COMMIT;
insert into B_TBL_SCHEMA.bulk_load_log values(
v_table_name,
to_varchar(year_nbr),
:v_offset_nbr,
now()
); -- Insert into a logging table to keep up with offset
ELSE
for i in 0..v_commit_count do -- Loop number of commit times. (500 million / 1 million) = 500 commits necessary to process entire table
INSERT INTO B_TBL_SCHEMA.B_TBL (
SELECT * FROM A_TBL_SCHEMA.A_TBL
WHERE A_TBL.YEAR = to_varchar(year_nbr)
LIMIT :v_record_per_commit_count OFFSET :v_offset_nbr
); -- Insert into our target table
COMMIT;
v_offset_nbr = v_offset_nbr + v_record_per_commit_count; -- Update the offset before logging so we know where to begin if procedure fails
insert into B_TBL_SCHEMA.bulk_load_log values(
v_table_name,
to_varchar(year_nbr),
:v_offset_nbr,
now()
); -- Insert into logging table to keep up with offset
COMMIT;
end for;
end if;
end for;
end;
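For reference, the offset-logging table the procedure writes to would look roughly like this (column names and types are inferred from the SELECT and INSERT statements above; the timestamp column name is a placeholder):

-- Assumed structure of the logging table referenced by the procedure.
CREATE COLUMN TABLE B_TBL_SCHEMA.BULK_LOAD_LOG (
    TABLE_NAME  NVARCHAR(30),
    YEAR_NBR    NVARCHAR(4),
    OFFSET_NBR  INTEGER,
    LOAD_TS     TIMESTAMP   -- populated with now() by the procedure
);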
Upvotes: 0
Views: 1626
Reputation: 6751
I think the easiest and fastest way to transfer the tables without any additional administration is the EXPORT statement with the BINARY format option.
This can also be done via the HANA Studio context menu on a schema, or via File -> Export... -> SAP HANA -> Catalog Objects and File -> Import....
With this approach you can manually set the number of threads for export and import, with no additional tricky code. After the import, the target system has the same table with the same structure in the same schema as the source system. So, to move the data to a new name or schema, you can copy the table first in the source system, insert ... select ... into the target table after the import, create a copy table with the desired partitioning in the source system, or repartition the imported table in the target and use it as the target table.
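A minimal sketch of what that could look like, assuming file-system access on both systems (the path and thread count are placeholders):

-- On the source system: export the table in binary format with several threads.
EXPORT A_TBL_SCHEMA.A_TBL AS BINARY INTO '/hana/work/a_tbl_export' WITH REPLACE THREADS 8;

-- On the target system: import the exported files, again in parallel.
IMPORT A_TBL_SCHEMA.A_TBL FROM '/hana/work/a_tbl_export' WITH REPLACE THREADS 8;

-- If the data needs to end up in a different schema/table, copy it afterwards:
INSERT INTO B_TBL_SCHEMA.B_TBL SELECT * FROM A_TBL_SCHEMA.A_TBL;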
The advantages: parallel export and import with a configurable number of threads, no custom code to write or maintain, and an identical table structure on the target.
And finally, I've tested this in my system via HANA Studio (import to a local machine): a table with 130M records and 57 columns, about 5 GB in size, was exported in 8 threads within 6 minutes.
As for your original approach: for truly bulk operations you should always disable or drop indexes and constraints, and rebuild/re-enable them at the very end, so you don't pay for index maintenance or constraint checks during the insert.
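For example (the index name here is hypothetical), you would drop secondary indexes on the target before the bulk insert and recreate them afterwards:

-- Hypothetical secondary index on the target table.
DROP INDEX B_TBL_SCHEMA.IDX_B_TBL_YEAR;

-- ... run the bulk insert / import here ...

CREATE INDEX IDX_B_TBL_YEAR ON B_TBL_SCHEMA.B_TBL (YEAR);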
Upvotes: 2
Reputation: 164
The performance bottleneck is probably the transfer over SDA, which uses only one thread.
A workaround could be to split the source data into logical partitions (here, by year) and let HANA fetch those partitions in parallel via a table function combined with MAP_MERGE.
Here's some pseudo code.
-- virtual table to source A over SDA: VT_TBL_A
-- target table B: TBL_B
CREATE OR REPLACE FUNCTION F_TBL_A_YEAR ( IN YEAR INT)
RETURNS TABLE ( ... )
as begin
return select * from VT_TBL_A where year(MY_DT_COLUMN) = :YEAR;
end;
DO
BEGIN
DECLARE t_var LIKE TBL_B;
--list of all 'partitions'
years=select distinct year(MY_DT_COLUMN) as MY_YEAR from VT_TBL_A;
--call the function for each year
t_var = MAP_MERGE(:years, F_TBL_A_YEAR( :years.MY_YEAR));
insert into TBL_B select * from :t_var;
commit;
END;
While this runs, I suggest that you check the system view M_REMOTE_STATEMENTS to confirm that records are transferred over multiple connections.
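For example, run something like this in a second session while the load is executing (M_REMOTE_STATEMENTS is a standard monitoring view; filter the result as needed):

-- Shows statements currently running against remote sources; several rows
-- at the same time indicate the transfer really is parallelized.
SELECT * FROM M_REMOTE_STATEMENTS;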
Upvotes: 0