Mike West
Mike West

Reputation: 35

How to remove quotes when importing specific columns from a CSV into mysql

I'm trying to import the contents of over 100 CSV files (but for simplicity I'll only ask about a single one for this question).

The CSVs are from a geolocation database, mapping UK Postcodes to map grid references, lat/long coordinates, and a ton of other data that I don't need. e.g.:

pcd,pcd2,pcds,dointr,doterm,oscty,ced,oslaua,osward,parish,usertype,oseast1m,osnrth1m,osgrdind,oshlthau,nhser,ctry,rgn,streg,pcon,eer,teclec,ttwa,pct,itl,statsward,oa01,casward,npark,lsoa01,msoa01,ur01ind,oac01,oa11,lsoa11,msoa11,wz11,sicbl,bua11,buasd11,ru11ind,oac11,lat,long,lep1,lep2,pfa,imd,calncv,icb,oa21,lsoa21,msoa21
"AB1 0AA","AB1  0AA","AB1 0AA","198001","199606","S99999999","S99999999","S12000033","S13002843","S99999999","0","385386","0801193","1","S08000020","S99999999","S92000003","S99999999","0","S14000002","S15000001","S09000001","S22000047","S03000012","S30000026","99ZZ00","S00001364","01C30","S99999999","S01000011","S02000007","6","3C2","S00090303","S01006514","S02001237","S34002990","S03000012","S99999999","S99999999","3","1C3",57.101474,-2.242851,"S99999999","","S23000009",6715,"S99999999","S99999999","","",""
"AB1 0AB","AB1  0AB","AB1 0AB","198001","199606","S99999999","S99999999","S12000033","S13002843","S99999999","0","385177","0801314","1","S08000020","S99999999","S92000003","S99999999","0","S14000002","S15000001","S09000001","S22000047","S03000012","S30000026","99ZZ00","S00001270","01C31","S99999999","S01000011","S02000007","6","4B3","S00090303","S01006514","S02001237","S34002990","S03000012","S99999999","S99999999","3","1C3",57.102554,-2.246308,"S99999999","","S23000009",6715,"S99999999","S99999999","","",""
"AB1 0AD","AB1  0AD","AB1 0AD","198001","199606","S99999999","S99999999","S12000033","S13002843","S99999999","0","385053","0801092","1","S08000020","S99999999","S92000003","S99999999","0","S14000002","S15000001","S09000001","S22000047","S03000012","S30000026","99ZZ00","S00001364","01C30","S99999999","S01000011","S02000007","6","3C2","S00090399","S01006514","S02001237","S34003015","S03000012","S99999999","S99999999","3","6A1",57.100556,-2.248342,"S99999999","","S23000009",6715,"S99999999","S99999999","","",""

I only need the pcds, osnrth1m, oseast1m, lat & lng columns, so I'm using this query:

LOAD DATA INFILE 'C:/xampp/htdocs/postcode_updates/csv/ONSPD_MAY_2024_UK_AB.csv'
INTO TABLE geopostcodes
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 LINES 
(@pcds, @osnrth1m, @oseast1m, @lat, @lng)
SET pcode = @pcds, grid_n = @osnrth1m, grid_e = @oseast1m, lat = @lat, lng = @lng

It pulls in the right data, but the postcodes are added with the quotes, and as the grid reference rows in my db are expecting integers, and the lat/long as decimals, the quotes mean they are imported as zero values.

I've tried various versions of trim and string replace functions, but can't find one that works without breaking the query. How do I remove the quotes?

Upvotes: 1

Views: 34

Answers (1)

Mike West
Mike West

Reputation: 35

Thanks to @Barmar - their suggestion removed the quotes:

Add OPTIONALLY ENCLOSED BY '"' to the FIELDS clause.

I then realised that my code mapping CSV columns to MYSQL columns didn't work either!

Some studying of the MYSQL manual led me to this somewhat clunky-looking, but working query:

LOAD DATA INFILE 'C:/xampp/htdocs/postcode_updates/csv/ONSPD_MAY_2024_UK_SA.csv' 
INTO TABLE geopostcodes 
FIELDS TERMINATED BY ',' 
OPTIONALLY ENCLOSED BY '"' 
LINES TERMINATED BY '\n' 
IGNORE 1 LINES 
(@dummy, @dummy, pcode, @dummy, @dummy, @dummy, @dummy, @dummy, @dummy, @dummy,
@dummy, grid_n, grid_e, @dummy, @dummy, @dummy, @dummy, @dummy, @dummy, @dummy,
@dummy, @dummy, @dummy, @dummy, @dummy, @dummy, @dummy, @dummy, @dummy, @dummy,
@dummy, @dummy, @dummy, @dummy, @dummy, @dummy, @dummy, @dummy, @dummy, @dummy,
@dummy, @dummy, lat, lng)

Upvotes: 0

Related Questions