mail@pastecode.io avatarunknown
16 days ago
1.9 kB
hi,can you write me an python code for below requirement.first there is an primary query you need to execute that using an pd_read_sql
primary_query=f"SELECT * FROM {schema_ppm}.etl_ppm_replication_master WHERE eprm_catalog='SC' AND eprm_enabled_flg='Y'"

after executing above one   iterate through the dataframe and get column 'eprm_table_name' which is there in etl_ppm_replication_master table now after getting that execute below source and target by placing table_name in the query which we got that from 'eprm_table_name'
source_query = f"SELECT * FROM {schema_source}.{table_name}"
source_df = pd.read_sql(source_query, connection_source)
target_query = f"SELECT * FROM {schema_ext}.{table_name}"  
target_df = pd.read_sql(target_query, connection_ext)

now in the same loop you have find the difference in source and target dataframes row by row starting you have to keep an track on that you can use some optimizations like instead of looping every row you can use some in built methods like compare after comparing if you find any differences in the row you have to take that in dataframe for example there are 30 rows in source and target in row 12 in source we have 1122 in column seq_nbr but same column in target has 2233 then that is considered as differenced like that you have check for every row and everycolumn for example i just you told you above after finding outthe difference you have to create an dynamic update statment and update the differnce record in the target.you have fetch all the column dynamically and construct them from the source and update them coming to the where condition it should like where pk='pk_value'
where pk = primary_df['eprm_table_col_pk']
pk_value=you have the value of the current row column pk like for exmaple if pk value is seq_nbr then from the source df you have to get seq_nbr column for the current row value