Reputation: 4375
I am trying to take a rather large CSV file and insert it into a MySQL database for referencing in a project. I would like to use the first line of the file to create the table using proper data types and not varchar for each column. The ultimate goal is to automate this process as I have several similar files but the each has different data and a different amount of "columns" in CSV files. The problem that I am having is gettype() is returning 'string' for each column instead of int, float and string as I would like it to.
Platform is PHP 5, OS is ubuntu 8.04
here is my code so far:
<?php
// GENERATE TABLE FROM FIRST LINE OF CSV FILE
$inputFile = 'file.csv';
$tableName = 'file_csv';
$fh = fopen($inputFile, 'r');
$contents = fread($fh, 5120); // 5KB
fclose($fh);
$fileLines = explode("\n", $contents); // explode to make sure we are only using the first line.
$fieldList = explode(',', $fileLines[0]); // separate columns, put into array
echo 'CREATE TABLE IF NOT EXISTS `'.$tableName.'` ('."<br/>\n";
for($i = 0; $i <= count($fieldList); $i++)
{
switch(gettype($fieldList[$i])) {
case 'integer':
$typeInfo = 'int(11)';
break;
case 'float':
$typeInfo = 'float';
break;
case 'string':
$typeInfo = 'varchar(80)';
break;
default:
$typeInfo = 'varchar(80)';
break;
}
if(gettype($fieldList[$i]) != NULL) echo "\t".'`'.$i.'` '.$typeInfo.' NOT NULL, --'.gettype($fieldList[$i]).' '.$fieldList[$i]."<br/>\n";
}
echo ' PRIMARY KEY (`0`)'."<br/>\n";
echo ') ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1 ;';
Example First line: 1,0,0,0,0,0,0,0,0,0,0,0,0.000000,0.000000,0,0,0,,0,0,1,0,50,'Word of Recall (OLD)',
Upvotes: 2
Views: 7764
Reputation: 10886
Building on Nouveau's code you could do this
for($i = 0; $i <= count($fieldList); $i++)
{
if (is_numeric($fieldList[$i]))
{
if (strpos($fieldList[$i],'.') !== false){
$fieldList[$i] = (int)$fieldList[$i];
}else{
$fieldList[$i] = (float)$fieldList[$i];
}
}
switch(gettype($fieldList[$i])) {
case 'integer':
$typeInfo = 'int(11)';
break;
case 'float':
case 'double':
$typeInfo = 'float';
break;
case 'string':
$typeInfo = 'varchar(80)';
break;
default:
$typeInfo = 'varchar(80)';
break;
}
if(gettype($fieldList[$i]) != NULL) echo "\t".'`'.$i.'` '.$typeInfo.' NOT NULL, --'.gettype($fieldList[$i]).' '.$fieldList[$i]."<br/>\n";
}
That works, note the addition of "case 'double':" in the switch, but there may be a better way to do the int/float check as they would only work with standard uk/us numbers.
Upvotes: 4
Reputation: 4375
<?php
// GENERATE TABLE FROM FIRST LINE OF CSV FILE
$inputFile = 'file.csv';
$tableName = 'file_csv';
$fh = fopen($inputFile, 'r');
$contents = fread($fh, 5120); // 5KB
fclose($fh);
$fileLines = explode("\n", $contents);
$fieldList = explode(',', $fileLines[0]);
echo 'CREATE TABLE IF NOT EXISTS `'.$tableName.'` ('."<br/>\n";
for($i = 0; $i <= count($fieldList); $i++)
{
if(strlen($fieldList[$i]) == 0) $typeInfo = 'varchar(80)';
if(preg_match('/[0-9]/', $fieldList[$i])) $typeInfo = 'int(11)';
if(preg_match('/[\.]/', $fieldList[$i])) $typeInfo = 'float';
if(preg_match('/[a-z\\\']/i', $fieldList[$i])) $typeInfo = 'varchar(80)';
echo "\t".'`'.$i.'` '.$typeInfo.' NOT NULL, -- '.gettype($fieldList[$i]).' '.$fieldList[$i]."<br/>\n";
}
echo ' PRIMARY KEY (`0`)'."<br/>\n";
echo ') ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1 ;';
Upvotes: 2
Reputation: 20950
Using regular expressions will give you a flexible solution. If you need to detect date fields then this is the way to go.
Upvotes: 1
Reputation: 10420
Try casting the value and comparing it with the original one:
define('DECIMAL_SEPARATOR', '.');
switch ($fieldList[$i])
{
case (string)(int)$fieldList[$i]:
$typeInfo = (strpos($fieldList[$i], DECIMAL_SEPARATOR) === false) ? 'int(11)' : 'float';
break;
case (string)(float)$fieldList[$i]:
$typeInfo = 'float';
break;
default:
$typeInfo = 'varchar(80)';
break;
}
Additionaly, check for the presence of decimal separator in the first case for numbers that are round, yet they have the decimal fraction part.
Upvotes: 2