Reputation: 4679
I have some text that's piped from another command that looks like so:
source:project_dbt.common.business_days
source:project_dbt.common.cms_compare
source:project_dbt.common.cms_provider_national
source:project_dbt.common.gov_cms_data_Medicare-Claims_Reassignment-Sub-File_rta9-bts3
source:project_dbt.common.gov_cms_data_Medicare-Enrollment_Address-Sub-File_je57-c47h
source:project_dbt.common.gov_cms_data_Medicare-Enrollment_Base-Provider-Enrollment-File_ykfi-ffzq
source:project_dbt.common.gov_cms_data_Medicare-Enrollment_Secondary-Specialty_n48j-8qtj
source:project_dbt.common.gov_cms_data_Medicare-Part-D_Medicare-Provider-Utilization-and-Payment-Data-Par_icvy-hptt
source:project_dbt.common.gov_cms_data_Medicare-Physician-Supplier_Medicare-Physician-and-Other-Supplier-National-Pro_5fr6-cch3
source:project_dbt.common.gov_cms_data_provider-data_dataset_mj5m-pzi6_National_Downloadable_File
source:project_dbt.common.gov_cms_download_nppes_NPPES_Data_Dissemination
source:project_dbt.common.gov_cms_openpaymentsdata_dataset_General-Payment-Data-Detailed-Dataset-2019-Reporti_qsys-b88w
source:project_dbt.common.gov_cms_openpaymentsdata_dataset_General-Payment-Data-Detailed-Dataset-2020-Reporti_txng-a8vj
source:project_dbt.common.medicare_provider_utilization_and_payment_part_b
source:project_dbt.common.medicare_provider_utilization_and_payment_part_d
source:project_dbt.common.medicare_snf_claims_quality_measures
source:project_dbt.common.medicare_snf_health_deficiencies
source:project_dbt.common.medicare_snf_mds_quality_measures
source:project_dbt.common.medicare_snf_provider_information
source:project_dbt.common.medicare_snf_quality_reporting_program_provider_data
source:project_dbt.common.offices_v3
source:project_dbt.common.open_payments_providers_supplementary
source:project_dbt.common.provider_per_zip_v6
source:project_dbt.common.providers_per_team_v1
source:project_dbt.common.tam_predictors_v1
source:project_dbt.common.tam_results_per_provider_all_v2
source:project_dbt.common.tam_results_per_provider_latest
source:project_dbt.common.tam_territory_components_v1
source:project_dbt.common.territories_per_zip_v2
source:project_dbt.common.territories_v2
source:project_dbt.common.usa_census_data_20201022
source:project_dbt.geo.simplemaps_uszips
source:project_dbt.geo.zip
And I want to modify it for input to another command to run with xargs. I tried to implement this with perl like so:
cat /tmp/myfile.txt \
| perl -pe 's{^(source:project_dbt\.)(.+)$}{mycompany_$2 sav_$2}m'
so that I can get the desired output (i.e. I just want to repeat the second capture group with 2 different prefixes):
mycompany_common.business_days sav_common.business_days
mycompany_common.cms_compare sav_common.cms_compare
mycompany_common.cms_provider_national sav_common.cms_provider_national
mycompany_common.gov_cms_data_Medicare-Claims_Reassignment-Sub-File_rta9-bts3 sav_common.gov_cms_data_Medicare-Claims_Reassignment-Sub-File_rta9-bts3
mycompany_common.gov_cms_data_Medicare-Enrollment_Address-Sub-File_je57-c47h sav_common.gov_cms_data_Medicare-Enrollment_Address-Sub-File_je57-c47h
mycompany_common.gov_cms_data_Medicare-Enrollment_Base-Provider-Enrollment-File_ykfi-ffzq sav_common.gov_cms_data_Medicare-Enrollment_Base-Provider-Enrollment-File_ykfi-ffzq
mycompany_common.gov_cms_data_Medicare-Enrollment_Secondary-Specialty_n48j-8qtj sav_common.gov_cms_data_Medicare-Enrollment_Secondary-Specialty_n48j-8qtj
mycompany_common.gov_cms_data_Medicare-Part-D_Medicare-Provider-Utilization-and-Payment-Data-Par_icvy-hptt sav_common.gov_cms_data_Medicare-Part-D_Medicare-Provider-Utilization-and-Payment-Data-Par_icvy-hptt
mycompany_common.gov_cms_data_Medicare-Physician-Supplier_Medicare-Physician-and-Other-Supplier-National-Pro_5fr6-cch3 sav_common.gov_cms_data_Medicare-Physician-Supplier_Medicare-Physician-and-Other-Supplier-National-Pro_5fr6-cch3
mycompany_common.gov_cms_data_provider-data_dataset_mj5m-pzi6_National_Downloadable_File sav_common.gov_cms_data_provider-data_dataset_mj5m-pzi6_National_Downloadable_File
mycompany_common.gov_cms_download_nppes_NPPES_Data_Dissemination sav_common.gov_cms_download_nppes_NPPES_Data_Dissemination
mycompany_common.gov_cms_openpaymentsdata_dataset_General-Payment-Data-Detailed-Dataset-2019-Reporti_qsys-b88w sav_common.gov_cms_openpaymentsdata_dataset_General-Payment-Data-Detailed-Dataset-2019-Reporti_qsys-b88w
mycompany_common.gov_cms_openpaymentsdata_dataset_General-Payment-Data-Detailed-Dataset-2020-Reporti_txng-a8vj sav_common.gov_cms_openpaymentsdata_dataset_General-Payment-Data-Detailed-Dataset-2020-Reporti_txng-a8vj
mycompany_common.medicare_provider_utilization_and_payment_part_b sav_common.medicare_provider_utilization_and_payment_part_b
mycompany_common.medicare_provider_utilization_and_payment_part_d sav_common.medicare_provider_utilization_and_payment_part_d
mycompany_common.medicare_snf_claims_quality_measures sav_common.medicare_snf_claims_quality_measures
mycompany_common.medicare_snf_health_deficiencies sav_common.medicare_snf_health_deficiencies
mycompany_common.medicare_snf_mds_quality_measures sav_common.medicare_snf_mds_quality_measures
mycompany_common.medicare_snf_provider_information sav_common.medicare_snf_provider_information
mycompany_common.medicare_snf_quality_reporting_program_provider_data sav_common.medicare_snf_quality_reporting_program_provider_data
mycompany_common.offices_v3 sav_common.offices_v3
mycompany_common.open_payments_providers_supplementary sav_common.open_payments_providers_supplementary
mycompany_common.provider_per_zip_v6 sav_common.provider_per_zip_v6
mycompany_common.providers_per_team_v1 sav_common.providers_per_team_v1
mycompany_common.tam_predictors_v1 sav_common.tam_predictors_v1
mycompany_common.tam_results_per_provider_all_v2 sav_common.tam_results_per_provider_all_v2
mycompany_common.tam_results_per_provider_latest sav_common.tam_results_per_provider_latest
mycompany_common.tam_territory_components_v1 sav_common.tam_territory_components_v1
mycompany_common.territories_per_zip_v2 sav_common.territories_per_zip_v2
mycompany_common.territories_v2 sav_common.territories_v2
mycompany_common.usa_census_data_20201022 sav_common.usa_census_data_20201022
mycompany_geo.simplemaps_uszips sav_geo.simplemaps_uszips
mycompany_geo.zip sav_geo.zip
But when running this, the results are:
sav_common.business_daysays
sav_common.cms_compareare
sav_common.cms_provider_nationalnal
sav_common.gov_cms_data_Medicare-Claims_Reassignment-Sub-File_rta9-bts3ts3
sav_common.gov_cms_data_Medicare-Enrollment_Address-Sub-File_je57-c47h47h
sav_common.gov_cms_data_Medicare-Enrollment_Base-Provider-Enrollment-File_ykfi-ffzqfzq
sav_common.gov_cms_data_Medicare-Enrollment_Secondary-Specialty_n48j-8qtjqtj
sav_common.gov_cms_data_Medicare-Part-D_Medicare-Provider-Utilization-and-Payment-Data-Par_icvy-hpttptt
sav_common.gov_cms_data_Medicare-Physician-Supplier_Medicare-Physician-and-Other-Supplier-National-Pro_5fr6-cch3ch3
sav_common.gov_cms_data_provider-data_dataset_mj5m-pzi6_National_Downloadable_Fileile
sav_common.gov_cms_download_nppes_NPPES_Data_Disseminationion
sav_common.gov_cms_openpaymentsdata_dataset_General-Payment-Data-Detailed-Dataset-2019-Reporti_qsys-b88w88w
sav_common.gov_cms_openpaymentsdata_dataset_General-Payment-Data-Detailed-Dataset-2020-Reporti_txng-a8vj8vj
sav_common.medicare_provider_utilization_and_payment_part_bt_b
sav_common.medicare_provider_utilization_and_payment_part_dt_d
sav_common.medicare_snf_claims_quality_measuresres
sav_common.medicare_snf_health_deficienciesies
sav_common.medicare_snf_mds_quality_measuresres
sav_common.medicare_snf_provider_informationion
sav_common.medicare_snf_quality_reporting_program_provider_dataata
sav_common.offices_v3_v3
sav_common.open_payments_providers_supplementaryary
sav_common.provider_per_zip_v6_v6
sav_common.providers_per_team_v1_v1
sav_common.tam_predictors_v1_v1
sav_common.tam_results_per_provider_all_v2_v2
sav_common.tam_results_per_provider_latestest
sav_common.tam_territory_components_v1_v1
sav_common.territories_per_zip_v2_v2
sav_common.territories_v2_v2
sav_common.usa_census_data_20201022022
sav_geo.simplemaps_uszipsips
sav_geo.zipzip
I.e. the second capture group is not repeated; instead, the last 3 chars of each line are appended to each line...
I have tried a bunch of variations, but I can't work out exactly what I am doing wrong.
Thanks in advance for any help!
Upvotes: 1
Views: 74
Reputation: 385657
You're passing a file with Windows line endings (CRLF) to a build of Perl expecting Unix line endings (LF).
Given the input
source:project_dbt.common.business_days␍␊
                   \___________________/|
                            $2          Removed by -l
the program emits
mycompany_common.business_days␍ sav_common.business_days␍␊
          \___________________/     \___________________/|
                   $2                        $2          Added by -l
which your terminal displays as
 sav_common.business_days_days
\_______________________/\___/
       Overwritten       From original
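Convert the file to a Unix file (e.g. using dos2unix) or strip the trailing CR before the substitution (s/\s+\z//;). Because \s also matches the newline, that statement is meant to run under -l, which removes the newline on input and adds it back on output.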
Upvotes: 2