This will take a CSV file, create a table, and insert the data. The CSV file needs to have a header row naming the fields. Once the table is created, the data is imported with a simple LOAD DATA INFILE call.
This is a command-line script. Usage:
php csv_import.php csv_file.csv table_name
The table_name parameter is optional; if it is omitted, the name of the CSV file is used.
<?php
$host = 'localhost';
$user = 'root';
$pass = '';
$database = 'database';
$db = mysql_connect($host, $user, $pass);
mysql_query("use $database", $db);
/********************************************************************************/
// Parameters: filename.csv table_name
$argv = $_SERVER['argv'];
if($argv[1]) { $file = $argv[1]; }
else {
echo "Please provide a file name\n"; exit;
}
if($argv[2]) { $table = $argv[2]; }
else {
$table = pathinfo($file);
$table = $table['filename'];
}
/********************************************************************************/
// Get the first row to create the column headings
$fp = fopen($file, 'r');
$frow = fgetcsv($fp);
fclose($fp);
$columns = '';
foreach($frow as $column) {
if($columns) $columns .= ', ';
$columns .= "`$column` varchar(250)";
}
$create = "create table if not exists $table ($columns);";
mysql_query($create, $db);
/********************************************************************************/
// Import the data into the newly created table.
$file = $_SERVER['PWD'].'/'.$file;
$q = "load data infile '$file' into table $table fields terminated by ',' ignore 1 lines";
mysql_query($q, $db);
?>
Hi,
I keep getting the error on "Please provide a file name\n"; exit;
Does the file name that the user uploads have to be "filename.csv"? Also, I am getting NULL for $argv.
I am first uploading the file to the server, which loads fine, and then running this.
if(empty($errors)==true){ // start upload
$host = 'localhost';
$user = 'root';
$pass = 'root';
$database = 'db_name';
$db = mysql_connect($host, $user, $pass);
mysql_query("use $database", $db);
/********************************************************************************/
// Parameters: filename.csv table_name
$argv = $_SERVER[argv];
if($argv[1]) { $file = $argv[1]; }
else {
echo "Please provide a file name\n"; exit;
}
if($argv[2]) { $table = $argv[2]; }
else {
$table = pathinfo($file);
$table = $table['filename'];
}
/********************************************************************************/
// Get the first row to create the column headings
$fp = fopen($file, 'r');
$frow = fgetcsv($fp);
foreach($frow as $column) {
if($columns) $columns .= ', ';
$columns .= "`$column` varchar(250)";
}
$create = "create table if not exists $table ($columns);";
mysql_query($create, $db);
/********************************************************************************/
// Import the data into the newly created table.
$file = $_SERVER['PWD'].'/'.$file;
$q = "load data infile '$file' into table $table fields terminated by ',' ignore 1 lines";
mysql_query($q, $db);
} // end upload
Thanks for all your help in advance.
Hi Hawkee, I want to grab a csv from a geo server once a week and load it into a mySQL instance on AWS.
Would your script do this? I've tried to achieve this directly from mySQL with a LOAD DATA INFILE and always get security errors. Even though I can enter the location of the csv file into the browser url and download it fine.
This will create the table the first time it runs, using the first row as the field names. After that it should continue to work as long as the file format does not change. If you run it more than once on the same file it'll create duplicates. Another thing to note is that the field names will all be varchar(250), so you may want to go back and change those to more meaningful types.
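Since every column comes out as varchar(250), a follow-up ALTER TABLE can tighten the types once you know what the data looks like. A minimal sketch (the table and column names here are hypothetical, not from the snippet):

```php
<?php
// Hypothetical follow-up: tighten column types after the initial import.
// Table name "prices" and its columns are placeholders.
$db = mysqli_connect('localhost', 'root', '', 'database');

// Convert the generic varchar(250) columns to more meaningful types.
mysqli_query($db, "ALTER TABLE prices
    MODIFY `id` INT UNSIGNED NOT NULL,
    MODIFY `price` DECIMAL(10,2),
    MODIFY `updated` DATETIME");

// Optionally de-duplicate after an accidental double import.
mysqli_query($db, "CREATE TABLE prices_dedup AS SELECT DISTINCT * FROM prices");
```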
Using PHP isn't beyond my abilities at all, but I'm concerned about using minimal methods to move the data around. I've got data generated by Server A that I want to get to the AWS mySQL instance on Server C. I can't run script on Server A, although I do have a web server (Server B) that can run PHP. It seems to me a better solution (fewer links, less PHP, more secure) if I get Server C to load the CSV directly from Server A and then tidy it up.
Besides, I've already developed some PHP script on Server B to do this and it keeps timing out at the 2-minute mark. I realise I could probably increase the timeout limit, but what if the CSV file I'm working with one day is so big that increasing a timeout value no longer works?
Cheers,
John.
I'll just add to this as I am enjoying this conversation and others may benefit.
It turns out that I can't use the LOAD DATA INFILE statement in mySQL where the CSV/data file is addressed with a URL. Also, to make matters more painful, I can't put a LOAD statement into a mySQL stored procedure - which is what I would want to do to automate the process.
So it looks like I DO need to re-consider PHP after all.
It's the time-outs in PHP that concern me the most. It would be good if I could chunk the file, but that really does look complicated. Besides, the first file I'm testing with is not that big - around 650k.
Are you running the script on Server B through your web browser? This sort of script should be run from the command line as web browsers aren't really designed to handle processing tasks like this. Ideally you'll upload the CSV with a cron script running every so often looking for a new file. If one exists then the import can proceed.
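A minimal sketch of that cron approach (the directory names, paths, and schedule are assumptions, not from the thread): a crontab entry runs a small PHP watcher that imports any new CSV and moves it out of the way.

```php
<?php
// watch_csv.php - run from cron, e.g. every 10 minutes:
//   */10 * * * * /usr/bin/php /path/to/watch_csv.php
// Directory names below are hypothetical.
$incoming = '/home/user/incoming';
$done     = '/home/user/imported';

foreach (glob("$incoming/*.csv") as $csv) {
    // Reuse the import script for each new file found.
    $table = pathinfo($csv, PATHINFO_FILENAME);
    passthru('/usr/bin/php /path/to/csv_import.php '
        . escapeshellarg($csv) . ' ' . escapeshellarg($table));

    // Move the file aside so it isn't imported twice.
    rename($csv, $done . '/' . basename($csv));
}
```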
I create the php file which resides on the web server (in a separate folder) then execute it from the URL.
This has been a great way to expand my understanding of PHP (and its interaction with mySQL) as the host environment supports all of these.
Are you saying the job might perform better if I were to schedule it with CRON?
OK. I've used PuTTY to access my shared host folder via SSH. I have two filesets to upload. The first contains a list of fish species and is approx 5,000 rows. This completed reasonably quickly. That's good news! It was timing out before when run from the URL in the browser. (Thanks Mr Hawkee!).
The next file has 8 columns and 250,000 rows. It's progressing at roughly 5,000 rows every 10 minutes, so I figure it will take 8 hours to complete. When I used a LOAD DATA INFILE command from mySQL Workbench this same file transfer completed in 10 minutes. So I know it can go quicker.
Any thoughts on how to speed this part up?
Thanks again,
John.
Not a SELECT statement as such, but the process does work line by line, which means it's writing line by line to the database. I understand that block processing would be faster.
Here is a subset of the code. You can see the top section grabs the current row of the CSV, then below it INSERTs.
if (($handle = fopen($csv_file, "r")) !== FALSE) {
fgetcsv($handle); // skip the header row
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
$num = count($data);
for ($c=0; $c < $num; $c++) {
$col[$c] = $data[$c];
}
$col1 = $col[1]; //SpeciesID
$col2 = $col[2]; //SurveyID
$col3 = str_replace("'","",$col[3]); //Longitude
$col4 = str_replace("'","",$col[4]); //Latitude
$col5 = str_replace("'","",$col[5]); //Realm
$col6 = str_replace("'","",$col[6]); //Country
$col7 = str_replace("'","",$col[7]); //State
$col8 = str_replace("'","",$col[8]); //Geometry
/*
$col9 = str_replace("'","",$col[9]); //frequency
$col10 = str_replace("'","",$col[10]); //abundance
$col11 = str_replace("'","",$col[11]); //commonfamilyname
$col12 = str_replace("'","",$col[12]); //commonclassname
$col13 = str_replace("'","",$col[13]); //commonphylumname
$col14 = $col[14]; //lmax
*/
// SQL Query to insert data into DataBase
$sql = "INSERT INTO locations(speciesid,locationid,longitude,latitude,realm,country,state) VALUES(".$col1.",'".$col2."','".$col3."','".$col4."','".$col5."','".$col6."','".$col7."')";
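Line-by-line INSERTs are slow mainly because of the per-statement round-trip to the server. A hedged sketch of the batching idea (the column names follow the code above; the batch size is an arbitrary choice, and an open mysqli connection in $db is assumed): collect rows and send one multi-row INSERT every few hundred rows.

```php
<?php
// Sketch: batch CSV rows into multi-row INSERTs instead of one query per row.
// Assumes an open mysqli connection in $db and the locations table above.
$batch = array();
$batchSize = 500; // arbitrary; tune to your max_allowed_packet

if (($handle = fopen($csv_file, "r")) !== FALSE) {
    fgetcsv($handle); // skip header
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        // Escape each value rather than stripping quotes.
        $vals = array_map(function ($v) use ($db) {
            return "'" . mysqli_real_escape_string($db, $v) . "'";
        }, array_slice($data, 1, 7));
        $batch[] = '(' . implode(',', $vals) . ')';

        if (count($batch) >= $batchSize) {
            mysqli_query($db, "INSERT INTO locations
                (speciesid,locationid,longitude,latitude,realm,country,state)
                VALUES " . implode(',', $batch));
            $batch = array();
        }
    }
    if ($batch) { // flush the remainder
        mysqli_query($db, "INSERT INTO locations
            (speciesid,locationid,longitude,latitude,realm,country,state)
            VALUES " . implode(',', $batch));
    }
    fclose($handle);
}
```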
Oh wow. I couldn't get the LOAD DATA LOCAL INFILE to work when running in the mySQL Workbench environment because the file being imported was addressed in a URL. But it never occurred to me to run this LOAD from the host server where the CSV file is residing.
results: 250,000 row CSV estimated to take 8 hours to load took 15 seconds!
Nice one.... Now I just need to get the CSV files from the geo-server via a URL to my host server where this new script can pick them up. Then CRON it and I'm away! :)
Thanks for the tip.
OK. While North America sleeps, the real work gets done! {joke}.
I was able to use file_get_contents to save a local CSV file from a geo-server-generated file.
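That fetch step can be as small as this (the URL and local path are placeholders):

```php
<?php
// Sketch: pull the remote CSV down to a local file the importer can read.
// Both the URL and the destination path are hypothetical.
$url   = 'http://geo.example.com/export/species.csv';
$local = '/home/user/incoming/species.csv';

$data = file_get_contents($url);
if ($data === false) {
    die("Download failed: $url\n");
}
file_put_contents($local, $data);
echo "Saved " . strlen($data) . " bytes to $local\n";
```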
Everything is now running in a series of php scripts that are included in a single "master" script and all I have to do is schedule this "master" from cron.
The only thing that concerns me is error handling. I've got some try/catches, but admit I probably should build more in.
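One way to get more out of those try/catches is to switch mysqli into exception mode, so a failed query throws instead of failing silently. A sketch, assuming the mysqli extension and placeholder connection details:

```php
<?php
// Sketch: make mysqli throw exceptions so a failed LOAD or INSERT
// cannot pass unnoticed. Connection details and paths are placeholders.
mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

try {
    $db = mysqli_connect('localhost', 'root', '', 'database');
    mysqli_query($db, "load data local infile '/tmp/data.csv'
        into table my_table fields terminated by ',' ignore 1 lines");
} catch (mysqli_sql_exception $e) {
    // Log and surface the real cause instead of a silent no-op.
    error_log('Import failed: ' . $e->getMessage());
    exit(1);
}
```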
Anyway, it's looking very good. All thanks to you!
cheers.
John.
The issue I hit with this script is that it assumes all data in your CSV will fit into a varchar field, capped at 250 characters. But some of my data needs to go into TEXT fields of unlimited length. Unfortunately there's no way for the script to know in advance what kinds of fields to create in the db. And because the final line of the script, which does the loading, does not report errors, it fails silently (very mysterious).
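A small change makes that final call report errors instead of failing silently (shown with the old mysql_* functions the snippet uses):

```php
<?php
// Sketch: check the result of the LOAD DATA query and print the
// MySQL error rather than discarding it. $file, $table, and $db
// come from the snippet above.
$q = "load data infile '$file' into table $table fields terminated by ',' ignore 1 lines";
if (!mysql_query($q, $db)) {
    // mysql_error() returns the text of the last MySQL error.
    die('LOAD DATA failed: ' . mysql_error($db) . "\n");
}
echo mysql_affected_rows($db) . " rows imported\n";
```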
Thanks!
PHP Notice: Use of undefined constant argv - assumed 'argv' in import.php on line 14
PHP Notice: Undefined variable: columns in import.php on line 35
PHP Notice: Undefined variable: columns in import.php on line 36
To remove the last two warnings, change the foreach to:
foreach($frow as $column) {
if(isset($columns)) $columns .= ', ';
else $columns = "";
$columns .= "`$column` bigint(20)";
}
Can't the table name be the name of the csv file? Or a temp file in the code? I have seen it done before but can't find the instance where I did it. I will look harder to find it.
Ultimately I have a csv file that a user will upload, then I can run queries on it. There are many different csv files, slightly different in format, so would I then have to create a script tailored to each scenario? I think that means a new table for each file template?
How do I add a browse button for this?
Let's say I need to select the file. Can I use something like:
//get the csv file
$file = $_FILES['csv']['tmp_name'];
$handle = fopen($file,"r");
With a form like this:
<form method="post" enctype="multipart/form-data">
Choose your file: <input type="file" name="csv" />
<input type="submit" value="Upload" />
</form>
I tried using the script as is and I always get the error:
'Please provide a file name'
I would really like to get this working, so any additional information would be greatly appreciated.
Well, the script kind'a makes a few glitches on windoze... To fix it,
remove the notices by:
Line 14: replace $_SERVER[argv];
with $_SERVER['argv'];
After line 30, $frow = fgetcsv($fp);
add $columns=false;
And at line 42, fix the path (Windows only though) like this:
$file = addslashes(realpath(dirname(__FILE__)).'\\'.$file);
For Linodoze just use / instead of \ ...
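A portable alternative that avoids the Windows/Linux switch is PHP's DIRECTORY_SEPARATOR constant (a sketch; the file name is a placeholder):

```php
<?php
// Sketch: build the absolute path in an OS-independent way so the
// same line works on both Windows and Linux.
$file = 'data.csv'; // placeholder file name
$file = realpath(dirname(__FILE__)) . DIRECTORY_SEPARATOR . $file;
echo $file . "\n";
```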
THANKS!
I made some modification to improve your code, this is my version
<?php
/**
* http://hawkee.com/snippet/8320/
*
* Installation
* ------------
* People who are using mysql version 5 and above.
*
* Before loading the file into mysql you must ensure that the two lines below are added inside /etc/mysql/my.cnf
*
* to edit my.cnf command is
* sudo vi /etc/mysql/my.cnf
*
* [mysqld]
* local-infile
*
* [mysql]
* local-infile
*
* Example manual run from console
* mysql -u root -proot
* mysql use base;
* load data local infile '/home/gustavo/www/maps-importer/hotel_all_active_09-10-14.csv' into table hotel_data fields terminated by ';' ignore 1 lines;
*/
error_reporting(E_ALL);
$host = 'localhost';
$user = 'root';
$pass = 'root';
$database = 'mapshotels';
$delimiter = ';'; //By default ,
$db = mysqli_connect($host, $user, $pass,$database);
if (!$db) {
echo "Error: Unable to connect to MySQL." . PHP_EOL;
echo "Debugging errno: " . mysqli_connect_errno() . PHP_EOL;
echo "Debugging error: " . mysqli_connect_error() . PHP_EOL;
exit;
}
/********************************************************************************/
// Parameters: filename.csv table_name
$argv = $_SERVER['argv'];
if(isset($argv[1])) { $file = $argv[1]; }
else {
echo "Please provide a file name\n"; exit;
}
if(isset($argv[2])) { $table = $argv[2]; }
else {
$table = pathinfo($file);
$table = $table['filename'];
}
/********************************************************************************/
// Get the first row to create the column headings
$fp = fopen($file, 'r');
$frow = fgetcsv($fp,0,$delimiter);
fclose($fp);
$columns = '';
foreach($frow as $column) {
if($columns) $columns .= ', ';
$columns .= "`$column` varchar(191)"; //Max varchar for utf8mb4!!
}
$create = "create table if not exists $table ($columns);";
//Debug
echo $create."\n";
mysqli_query($db,$create);
/********************************************************************************/
// Import the data into the newly created table.
$file = $_SERVER['PWD'].DIRECTORY_SEPARATOR.$file;
$q = "load data local infile '$file' into table $table fields terminated by '$delimiter' ignore 1 lines";
//Debug
echo $q."\n";
mysqli_query($db,$q);
Are you running this from your browser or the command line? This is a command line script, so the file name comes in as a second argument.