Reputation: 129
We downloaded an .osm file from OpenStreetMap GIS data and converted it into a .csv file with osmconvert.exe. The CSV file is 3.5 GB in size. We tried importing it into the database through HeidiSQL, and also tried to import the file into the database using the PHP script below:
$path = "../../indiacountry.csv";
$row = 0;
if (($handle = fopen($path, "r")) !== FALSE) {
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        $row++;
        $data_entries[] = $data;
    }
    fclose($handle);
}
// this you'll have to expand
foreach ($data_entries as $line) {
    $ts++;
    if ($ts > 0) {
        $ft++;
        if (mysql_query("insert into mbrace_resources.street_unit_number_india (id1) values ('" . str_replace("'", "", $line[0]) . "')") or die("the eror " . mysql_error()));
    }
    // $db->execute($line);
}
When we first tried this script, we got a memory_limit error and a timeout. We changed memory_limit to 4000 MB and set the time limit to 0. When we tried the script again, the page stayed blank and kept executing, but not a single row was inserted into the table.
After going through all of this, we feel the only way forward is to split the CSV file into multiple files.
How should we do it?
Thanks in advance
Upvotes: 1
Views: 7826
Reputation: 94642
The script you show reads the WHOLE .csv file into an in-memory array. It's not surprising it won't run: that will require at least 3.5 GB+ of memory.
Instead, read one line from the file and apply it directly to the database.
I am going to ignore the fact that you are using the old, dangerous and deprecated mysql_ database extension for now. If you tell me you have access to mysqli_ or PDO, I will willingly rewrite this for either of those APIs.
$path = "../../indiacountry.csv";
$row = 0;
if (($handle = fopen($path, "r")) !== FALSE) {
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        $row++;
        $id = str_replace("'", "", $data[0]);
        mysql_query("insert into mbrace_resources.street_unit_number_india (id1) values ('$id')")
            or die("the error " . mysql_error());
    }
    fclose($handle);
}
echo "Finished: Added $row rows";
Upvotes: 2
Reputation: 791
If you are looking for a PHP-specific solution, here is a simple one you can tweak to your needs. This solution assumes you do not need to duplicate the header row in each file; you can modify it to add the header row to every part file if needed:
$outputFile = 'indiacountry-part-';
$splitSize = 50000; // 50k records in one file
$in = fopen('indiacountry.csv', 'r');
$rows = 0;
$fileCount = 0;
$out = null;

while (!feof($in)) {
    if (($rows % $splitSize) == 0) {
        if ($rows > 0) {
            fclose($out);
        }
        $fileCount++;
        // for filenames like indiacountry-part-0001.csv, indiacountry-part-0002.csv etc.
        $fileCounterDisplay = sprintf("%04d", $fileCount);
        $fileName = "$outputFile$fileCounterDisplay.csv";
        $out = fopen($fileName, 'w');
    }
    $data = fgetcsv($in);
    if ($data) {
        fputcsv($out, $data);
    }
    $rows++;
}
fclose($out);
fclose($in);
Now you can programmatically parse every part file ('indiacountry-part-xxxx.csv') and insert it into your table in batches. Read each line and insert it as you go, not as one huge CLOB.
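As a rough sketch of that batching step, assuming mysqli_ is available (the host, credentials, database and column names below are placeholders based on the question), you could walk the part files and build multi-row INSERT statements like this:
// Loop over the generated part files and insert rows in batches.
$mysqli = new mysqli('localhost', 'user', 'password', 'mbrace_resources');

$batchSize = 1000; // rows per INSERT statement
foreach (glob('indiacountry-part-*.csv') as $file) {
    $in = fopen($file, 'r');
    $values = [];
    while (($data = fgetcsv($in)) !== FALSE) {
        $values[] = "('" . $mysqli->real_escape_string($data[0]) . "')";
        if (count($values) >= $batchSize) {
            $mysqli->query('INSERT INTO street_unit_number_india (id1) VALUES ' . implode(',', $values));
            $values = [];
        }
    }
    // Flush any remaining rows for this part file.
    if ($values) {
        $mysqli->query('INSERT INTO street_unit_number_india (id1) VALUES ' . implode(',', $values));
    }
    fclose($in);
}
$mysqli->close();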
Upvotes: 3