Reputation: 451
I'm using awk and sed to get a list of partitions in a table with their sizes, which I want to use to calculate the daily increment for individual tables. This is the output I have; now I'm struggling to convert all of the sizes to MB.
What would be the best bash way to match the number in the second field and multiply it depending on the "MB" or "GB" suffix?
2017061505,482.46MB,hdfs://user/hive/warehouse/cz_prd_ntw_op.db/diameter__24_/pr_comp_ver=0/pr_start_time=2017061505,
2017061505,722.58MB,hdfs://user/hive/warehouse/cz_prd_ntw_op.db/diameter__24_/pr_comp_ver=0/pr_start_time=2017061506,
2017061507,1.03GB,hdfs://user/hive/warehouse/cz_prd_ntw_op.db/diameter__24_/pr_comp_ver=0/pr_start_time=2017061507,
2017061507,1.25GB,hdfs://user/hive/warehouse/cz_prd_ntw_op.db/diameter__24_/pr_comp_ver=0/pr_start_time=2017061508,
The desired output would be:
2017061505,482.46MB,hdfs://MORPHEUS/user/hive/warehouse/cz_prd_ntw_op.db/diameter__24_/pr_comp_ver=0/pr_start_time=2017061505,
2017061506,722.58MB,hdfs://MORPHEUS/user/hive/warehouse/cz_prd_ntw_op.db/diameter__24_/pr_comp_ver=0/pr_start_time=2017061506,
2017061507,1030MB,hdfs://MORPHEUS/user/hive/warehouse/cz_prd_ntw_op.db/diameter__24_/pr_comp_ver=0/pr_start_time=2017061507,
2017061508,1250MB,hdfs://MORPHEUS/user/hive/warehouse/cz_prd_ntw_op.db/diameter__24_/pr_comp_ver=0/pr_start_time=2017061508,
Upvotes: 4
Views: 7775
Reputation: 236
Simple solution in awk:
awk '$2 ~ /[0-9\.]+GB/ { $2 = int($2 * 1024) "MB" } 1' FS="," OFS="," table.txt
Feel free to add another rule for kB conversion (just divide by 1024).
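A minimal sketch of that extension, assuming sizes can also carry a `KB` suffix (the sample data only shows MB and GB, so the KB rule is a guess at the format). The function name `to_mb` is hypothetical; it reads the comma-separated lines from stdin and, unlike the one-liner above, keeps the fractional part instead of truncating with `int()`.

```shell
# to_mb: hypothetical helper, not part of the original answer.
# Reads comma-separated lines on stdin and rewrites field 2 in MB,
# using binary units (1 GB = 1024 MB, 1 MB = 1024 KB).
to_mb() {
  awk 'BEGIN { FS = OFS = "," }
       # awk converts "1.03GB" to 1.03 in arithmetic (leading numeric prefix)
       $2 ~ /^[0-9.]+GB$/ { $2 = ($2 * 1024) "MB" }
       # assumed suffix: KB does not appear in the sample data
       $2 ~ /^[0-9.]+KB$/ { $2 = ($2 / 1024) "MB" }
       1'
}
```

Usage would look like `to_mb < table.txt`; lines already in MB pass through unchanged.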
Upvotes: 4
Reputation: 92904
awk solution:
awk 'BEGIN{ FS=OFS="," }{ s=substr($2,1,length($2)-2); u=substr($2,length($2)-1);
if(u=="KB") $2=(s/1024)"MB"; else if(u=="GB") $2=(s*1024)"MB" }1' yourfile
The output:
2017061505,482.46MB,hdfs://user/hive/warehouse/cz_prd_ntw_op.db/diameter__24_/pr_comp_ver=0/pr_start_time=2017061505,
2017061505,722.58MB,hdfs://user/hive/warehouse/cz_prd_ntw_op.db/diameter__24_/pr_comp_ver=0/pr_start_time=2017061506,
2017061507,1054.72MB,hdfs://user/hive/warehouse/cz_prd_ntw_op.db/diameter__24_/pr_comp_ver=0/pr_start_time=2017061507,
2017061507,1280MB,hdfs://user/hive/warehouse/cz_prd_ntw_op.db/diameter__24_/pr_comp_ver=0/pr_start_time=2017061508,
Note that this uses binary units, i.e. 1 GB = 1024 MB.
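The desired output in the question (1030MB for 1.03GB) corresponds to a factor of 1000 rather than 1024. As a rough sketch, the factor can be passed in as a parameter so either convention can be chosen; the function name `gb_to_mb` and the awk variable `factor` are hypothetical names introduced here.

```shell
# gb_to_mb [FACTOR]: hypothetical helper, not part of the original answer.
# FACTOR defaults to 1024 (binary units); pass 1000 for decimal units.
gb_to_mb() {
  awk -v factor="${1:-1024}" 'BEGIN { FS = OFS = "," }
       # only fields ending in GB are rewritten; MB lines pass through
       $2 ~ /^[0-9.]+GB$/ { $2 = ($2 * factor) "MB" }
       1'
}
```

For example, `gb_to_mb 1000 < yourfile` would turn `1.03GB` into `1030MB`, while `gb_to_mb < yourfile` gives `1054.72MB` as in the output above.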
Upvotes: 1