CSV to mySQL - Create Table and Insert Data

By Hawkee on Jan 19, 2011

This will take a CSV file, create a table, and insert the data. The file needs to have a header row indicating the field names. Once the table is created, the data is imported with a simple LOAD DATA INFILE call.

This is a command-line script. Usage:

php csv_import.php csv_file.csv table_name

The table_name parameter is optional. If it is omitted, the name of the CSV file (without its extension) is used as the table name.
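For example (the file and table names here are arbitrary):

php csv_import.php sales.csv monthly_sales    # imports into the table monthly_sales
php csv_import.php sales.csv                  # table name defaults to "sales"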

<?php

$host = 'localhost';
$user = 'root';
$pass = '';
$database = 'database';

$db = mysql_connect($host, $user, $pass);
mysql_query("use $database", $db);

/********************************************************************************/
// Parameters: filename.csv table_name

$argv = $_SERVER['argv'];

if(isset($argv[1])) { $file = $argv[1]; }
else {
    echo "Please provide a file name\n"; exit;
}
if(isset($argv[2])) { $table = $argv[2]; }
else {
    // Default the table name to the CSV file name without its extension
    $table = pathinfo($file);
    $table = $table['filename'];
}

/********************************************************************************/
// Get the first row to create the column headings

$fp = fopen($file, 'r');
$frow = fgetcsv($fp);

// Build the column definitions from the header row; every column is created as varchar(250)
$columns = '';
foreach($frow as $column) {
    if($columns) $columns .= ', ';
    $columns .= "`$column` varchar(250)";
}

$create = "create table if not exists $table ($columns);";
mysql_query($create, $db);

/********************************************************************************/
// Import the data into the newly created table.

$file = $_SERVER['PWD'].'/'.$file;
$q = "load data infile '$file' into table $table fields terminated by ',' ignore 1 lines";
mysql_query($q, $db);

?>
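To illustrate what the script builds, suppose it is run against a hypothetical people.csv whose header row is id,name,email. The generated statement would be:

create table if not exists people (`id` varchar(250), `name` varchar(250), `email` varchar(250));

The rows are then pulled in with LOAD DATA INFILE using the file's absolute path (the current working directory plus the file name), skipping the header line.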

Comments

gzahid   -  Nov 04, 2015

Hi,

I keep getting the "Please provide a file name" error.

Does the file name that the user uploads have to be "filename.csv"? Also, I am getting null for $argv.

I am first uploading the file to the server, which works fine, and then running this:

if(empty($errors)==true){ // start upload

            $host = 'localhost';
            $user = 'root';
            $pass = 'root';
            $database = 'db_name';

            $db = mysql_connect($host, $user, $pass);
            mysql_query("use $database", $db);

            /********************************************************************************/
            // Parameters: filename.csv table_name

            $argv = $_SERVER[argv];

            if($argv[1]) { $file = $argv[1]; }
            else {
                echo "Please provide a file name\n"; exit; 
            }
            if($argv[2]) { $table = $argv[2]; }
            else {
                $table = pathinfo($file);
                $table = $table['filename'];
            }

            /********************************************************************************/
            // Get the first row to create the column headings

            $fp = fopen($file, 'r');
            $frow = fgetcsv($fp);

            foreach($frow as $column) {
                if($columns) $columns .= ', ';
                $columns .= "`$column` varchar(250)";
            }

            $create = "create table if not exists $table ($columns);";
            mysql_query($create, $db);

            /********************************************************************************/
            // Import the data into the newly created table.

            $file = $_SERVER['PWD'].'/'.$file;
            $q = "load data infile '$file' into table $table fields terminated by ',' ignore 1 lines";
            mysql_query($q, $db);

        } // end upload

Thanks for all your help in advance.

Hawkee  -  Nov 05, 2015

Are you running this from your browser or the command line? This is a command-line script, so the file name comes in as a command-line argument.

gzahid  -  Nov 08, 2015

I am running it from the browser, not the command line.

Thanks for responding quickly.

trustedsamurai   -  Oct 21, 2015

Hi Hawkee, I want to grab a csv from a geo server once a week and load it into a mySQL instance on AWS.

Would your script do this? I've tried to achieve this directly from mySQL with a LOAD DATA INFILE and always get security errors, even though I can enter the location of the CSV file into the browser URL and download it fine.

Hawkee  -  Oct 21, 2015

This will create the table the first time it runs, based on the first row as the field names. After that it should continue to work as long as the file format doesn't change. If you run it more than once on the same file it'll create duplicate rows. Another thing to note is that the fields will all be created as varchar(250), so you may want to go back and change those to more meaningful types.
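For instance, once the data is in you could tighten the types by hand with something like this (the table and column names here are just placeholders):

ALTER TABLE my_table
    MODIFY `price` DECIMAL(10,2),
    MODIFY `created_at` DATETIME;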

trustedsamurai  -  Oct 21, 2015

Thanks for writing back so promptly. I'm wondering if this is the best way to do this or whether I should try to use LOAD DATA INFILE from mySQL to bring the data in directly - without PHP.

Hawkee  -  Oct 21, 2015

Ideally you'd want to create a custom script that handles your particular format and generates insert statements, but if that's beyond your scope you could look for mySQL import software solutions.

trustedsamurai  -  Oct 21, 2015

Using PHP isn't beyond my abilities at all, but I'd like to keep the number of steps moving the data around to a minimum. I've got data generated by Server A that I want to get to the AWS mySQL instance on Server C. I can't run script on Server A, although I do have a web server (Server B) that can run PHP. It seems to me a better solution (fewer links, less PHP, more secure) if I get Server C to load the CSV directly from Server A and then tidy it up.

Besides, I've already developed some PHP script on Server B to do this and it keeps timing out at the 2 minute mark. I realise I could probably increase the timeout limit but what if the CSV file I'm working with one day is so big that increasing a timeout value no longer works?

Cheers,

John.

trustedsamurai  -  Oct 22, 2015

I'll just add to this as I am enjoying this conversation and others may benefit.
It turns out that I can't use the LOAD DATA INFILE statement in mySQL where the CSV/data file is addressed with a URL. Also, to make matters more painful, I can't put a LOAD statement inside a mySQL stored procedure - which is what I would want to do to automate the process.

So it looks like I DO need to re-consider PHP after all.

It's the time-outs on PHP that concern me the most. It would be good if I could chunk the file but that really does look complicated. Besides, the first file I'm testing with is not that big - around 650k.

Hawkee  -  Oct 22, 2015

Are you running the script on Server B through your web browser? This sort of script should be run from the command line, as web browsers aren't really designed to handle processing tasks like this. Ideally you'll upload the CSV and have a cron script running every so often looking for a new file. If one exists, the import can proceed.

trustedsamurai  -  Oct 22, 2015

I create the php file which resides on the web server (in a separate folder) then execute it from the URL.

This has been a great way to expand my understanding of PHP (and its interaction with mySQL) as the host environment supports all of these.

Are you saying the job might perform better if I were to schedule it with CRON?

Hawkee  -  Oct 22, 2015

You definitely want to run imports from the command line, so yes, cron is ideal, but you can also do it manually via SSH every time you want to import.
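As a rough illustration only (the schedule, paths, and table name are placeholders, not anything from this thread), a crontab entry for a nightly import could look like:

# Run the CSV importer every night at 2am; adjust the paths to your environment
0 2 * * * /usr/bin/php /home/user/scripts/csv_import.php /home/user/data/latest.csv staging >> /home/user/logs/csv_import.log 2>&1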

trustedsamurai  -  Oct 22, 2015

I had better go and work out what I need to do to connect to my shared host with SSH.

Thanks for your help.

Hawkee  -  Oct 22, 2015

I always choose my shared hosts based on whether or not they offer SSH. Some don't and those that do will sometimes require a copy of your driver's license to make it available.

trustedsamurai  -  Oct 23, 2015

OK. I've used PuTTY to access my shared host folder via SSH. I have two filesets to upload. The first contains a list of fish species and is approx 5,000 rows. This completed reasonably quickly. That's good news! It was timing out before when run from the URL in the browser. (Thanks Mr Hawkee!)

The next file has 8 columns and 250,000 rows. It's progressing at roughly 5,000 rows every 10 minutes, so I figure it will take 8 hours to complete. When I used a LOAD DATA INFILE command from mySQL Workbench this same file transfer completed in 10 minutes. So I know it can go quicker.

Any thoughts on how to speed this part up?

Thanks again,

John.

Hawkee  -  Oct 23, 2015

It depends on what your script is doing. Are you querying for existing rows before inserting? If so, are you querying against indexed fields?

trustedsamurai  -  Oct 23, 2015

I have a staging table on the database that is truncated before I start the transfer. The staging table has an id column that auto-generates a unique number, and this is the primary key. But there is no other indexing on this table.

My script inserts row by row into the table.

Hawkee  -  Oct 23, 2015

Are there any select queries? That's the only thing I can imagine that would cause it to take so long to insert. Unless you're experiencing latency between the database server and your shared server.

trustedsamurai  -  Oct 23, 2015

Not a select statement as such, but the process does work line by line, which means it's writing line by line to the database. I understand that block processing would be faster.

Here is a subset of the code. You can see the top section grabs the current row of the CSV, then below it INSERTs.

if (($handle = fopen($csv_file, "r")) !== FALSE) {
    fgetcsv($handle);   // skip the header row
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        $num = count($data);
        for ($c = 0; $c < $num; $c++) {
            $col[$c] = $data[$c];
        }

        $col1 = $col[1];                       //SpeciesID
        $col2 = $col[2];                       //SurveyID
        $col3 = str_replace("'","",$col[3]);   //Longitude
        $col4 = str_replace("'","",$col[4]);   //Latitude
        $col5 = str_replace("'","",$col[5]);   //Realm
        $col6 = str_replace("'","",$col[6]);   //Country
        $col7 = str_replace("'","",$col[7]);   //State
        $col8 = str_replace("'","",$col[8]);   //Geometry

        /*
        $col9 = str_replace("'","",$col[9]);   //frequency
        $col10 = str_replace("'","",$col[10]); //abundance
        $col11 = str_replace("'","",$col[11]); //commonfamilyname
        $col12 = str_replace("'","",$col[12]); //commonclassname
        $col13 = str_replace("'","",$col[13]); //commonphylumname
        $col14 = $col[14];                     //lmax
        */

        // SQL query to insert one row into the database
        $sql = "INSERT INTO locations(speciesid,locationid,longitude,latitude,realm,country,state,geometry) VALUES(".$col1.",'".$col2."','".$col3."','".$col4."','".$col5."','".$col6."','".$col7."','".$col8."')";

Hawkee  -  Oct 23, 2015

Seems simple enough. Could it be a latency issue between servers causing the delay? You might just want to generate an import file and use the LOAD commands to bring it in. That'll be much faster.

trustedsamurai  -  Oct 23, 2015

Would you explain the context of the 'import file' and the 'LOAD' commands? I have tried a LOAD DATA INFILE command from mySQL Workbench and this only works with local files, not with URL addresses.

Hawkee  -  Oct 23, 2015

If you can connect to the remote mySQL server from your PHP server then you can use "LOAD DATA LOCAL INFILE" when the file is on your PHP server.
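For anyone following along, here's a minimal sketch of that approach using mysqli (the host, credentials, file path, and table name are all placeholders, and the MySQL server has to allow local_infile):

<?php
// Connect to the remote MySQL server and enable LOCAL INFILE on the client side
$db = mysqli_init();
mysqli_options($db, MYSQLI_OPT_LOCAL_INFILE, true);
if (!mysqli_real_connect($db, 'remote-db.example.com', 'user', 'pass', 'database')) {
    die('Connect failed: ' . mysqli_connect_error() . "\n");
}

// The CSV sits on the PHP server, so LOCAL tells MySQL to read it from the client
$file = '/home/user/data/species.csv';
$q = "load data local infile '" . mysqli_real_escape_string($db, $file) . "'
      into table staging fields terminated by ',' ignore 1 lines";

if (!mysqli_query($db, $q)) {
    echo 'Import failed: ' . mysqli_error($db) . "\n";
}
?>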

trustedsamurai  -  Oct 23, 2015

Oh wow. I couldn't get the LOAD DATA LOCAL INFILE to work when running in the mySQL Workbench environment because the file being imported was addressed in a URL. But it never occurred to me to run this LOAD from the host server where the CSV file is residing.

Results: the 250,000-row CSV that was estimated to take 8 hours to load took 15 seconds!

Nice one.... Now I just need to get the CSV files from the geo-server via a URL to my host server where this new script can pick them up. Then CRON it and I'm away! :)

Thanks for the tip.

trustedsamurai  -  Oct 23, 2015

OK. While North America sleeps the real work gets done! {joke}.

I was able to use file_get_contents to save a local copy of the CSV file generated by the geo server.

Everything is now running in a series of php scripts that are included in a single "master" script and all I have to do is schedule this "master" from cron.

The only thing that concerns me is error handling. I've got some try/catches, but admit I probably should build more in.
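For reference, a bare-bones version of that fetch step with a basic error check looks something like this (the URL and path are placeholders):

<?php
// Pull the CSV from the geo server and save it locally for the import script to pick up
$url  = 'http://geo.example.com/export/species.csv';
$dest = '/home/user/data/species.csv';

$csv = file_get_contents($url);
if ($csv === false) {
    echo "Download failed: $url\n";
    exit(1);
}
file_put_contents($dest, $csv);
?>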

Anyway, it's looking very good. All thanks to you!

cheers.

John.

Hawkee  -  Oct 24, 2015

Awesome! I'm happy to have helped.

dustale08   -  May 06, 2015

So should I just pop this into my terminal?

Hawkee  -  May 07, 2015

Save this as a .php file and run it from the command line.

angelxmoreno   -  Nov 23, 2014

What does the $ccount variable do in the script?

Hawkee  -  Nov 23, 2014

Good question. I think that was for debugging purposes, but apparently I forgot to remove it before posting. I removed it.

shacker   -  Oct 27, 2014

The issue I hit with this script is that it assumes all data in your CSV will fit into a varchar field limited to 250 characters. But some of my data needs to go into TEXT fields with unlimited length. Unfortunately there's no way for the script to know in advance what kinds of fields to create in the db. And because the final line of the script, which does the loading, does not report errors, it fails silently (very mysterious).

Hawkee  -  Oct 27, 2014

Yes, this is a very simple script. You can modify it to create each field as a text field then go back later and adjust the smaller fields to be varchars.
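If you go that route, the only change needed is the column definition inside the loop, for example:

$columns .= "`$column` text";   // instead of varchar(250)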

CyberMage   -  Jun 09, 2014

Love it - saved me some time. I had someone send me a 250 MB file with 57 tilde delimited fields that I needed to import, and PHPMyAdmin of course doesn't like big files. A couple of tweaks to your script and it worked like a charm.

radius314   -  Mar 29, 2014

Slight flaw in the script if a column name contains a reserved keyword, say for example the word "DELETE". In this case the column name should be wrapped in backticks (`) to escape it.

Hawkee  -  Mar 29, 2014

Good catch, I've corrected the flaw.

radius314  -  Mar 29, 2014

cool

Hawkee  -  Mar 29, 2014

BTW, welcome to the site! How did you find us?

radius314  -  Mar 31, 2014

Google Search

Neo139   -  Aug 18, 2013

Thanks!
PHP Notice: Use of undefined constant argv - assumed 'argv' in import.php on line 14
PHP Notice: Undefined variable: columns in import.php on line 35
PHP Notice: Undefined variable: columns in import.php on line 36
To remove the last two notices, change the foreach to:

foreach($frow as $column) {
    $ccount++;
    if(isset($columns)) $columns .= ', ';
    else $columns = "";
    $columns .= "$column bigint(20)";
}

Hawkee  -  Aug 27, 2013

@Neo139 Thank you, but why bigint rather than varchar?

Hawkee   -  Nov 11, 2012

@a6april You might want to try StackOverflow. It's a much better-suited site for questions and answers. You can post about the parts where you need help, but I wouldn't ask people to write it for you.

a6april   -  Nov 11, 2012

Can you point me in the right direction to find out what those changes would be? I don't mind starting small and moving from there, I can email you the zip of everything I have done so far.

Hawkee   -  Nov 10, 2012

@a6april You'll probably need to make some changes to this to meet your needs.

a6april   -  Nov 10, 2012

Can't the table name be the name of the csv file? Or a temp file in the code? I have seen it done before but can't find the instance in which I had done this. I will look harder to find it.

Ultimately I have a csv file that a user will upload and then I can run queries on it. There are many different csv files, slightly different in format, so would I then have to create a script tailored to each scenario? I think that means a new table for each file template?

Hawkee   -  Nov 08, 2012

@a6april This is a command line script, so you'll have to run it from your shell. It will not work through a browser. It takes two parameters: csv file and the table name to be created.

a6april   -  Nov 08, 2012

How do I add a browse button for this?

Let's say I need to select the file, can I use something like:

//get the csv file
$file = $_FILES['csv']['tmp_name'];
$handle = fopen($file, "r");

With a form like this:

Choose your file:

I tried using the script as is and I always get the error:
'Please provide a file name'

I would really like to get this working so any additional information would be greatly appreciated

tekagami  -  Oct 31, 2014

How about:

$uploadfile = basename($_FILES['csv']['name']);

if ($uploadfile != "") {
    $file = $uploadfile;
    move_uploaded_file($_FILES['csv']['tmp_name'], "/path/to/".$file);
}

Hawkee  -  Oct 31, 2014

@tekagami Be sure to wrap your code with ``` to ensure it shows up properly.

acosonic  -  Aug 26, 2015

Well the script kind'a makes a few glitches on windoze... To fix it and
remove the notices:

On line 14 replace $_SERVER[argv]; with $_SERVER['argv'];
After line 30 ($frow = fgetcsv($fp);) add $columns = false;
And on line 42 fix the path (Windows only though) like this:

$file = addslashes(realpath(dirname(__FILE__)).'\\'.$file);

For Linodoze just use / instead of \ ...

THANKS!

gnovaro  -  Nov 04, 2015

I made some modifications to improve your code; this is my version:

<?php
/**
 * http://hawkee.com/snippet/8320/
 * 
 * Installation
 * ------------
 * For people who are using MySQL version 5 and above:
 *
 * Before loading the file into MySQL, make sure the two lines below are added inside /etc/mysql/my.cnf:
 *
 * To edit my.cnf the command is
 * sudo vi /etc/mysql/my.cnf
 *
 * [mysqld]  
 * local-infile
 *
 * [mysql]  
 * local-infile 
 * 
 * Example manual run from the console:
 * mysql -u root -proot
 * mysql> use base;
 * mysql> load data local infile '/home/gustavo/www/maps-importer/hotel_all_active_09-10-14.csv' into table hotel_data fields terminated by ';' ignore 1 lines;
 */
error_reporting(E_ALL);
$host = 'localhost';
$user = 'root';
$pass = 'root';
$database = 'mapshotels';
$delimiter = ';'; //By default ,

$db = mysqli_connect($host, $user, $pass,$database);

if (!$db) {
    echo "Error: Unable to connect to MySQL." . PHP_EOL;
    echo "Debugging errno: " . mysqli_connect_errno() . PHP_EOL;
    echo "Debugging error: " . mysqli_connect_error() . PHP_EOL;
    exit;
}

/********************************************************************************/
// Parameters: filename.csv table_name

$argv = $_SERVER['argv'];

if(isset($argv[1])) { $file = $argv[1]; }
else {
    echo "Please provide a file name\n"; exit;
}
if(isset($argv[2])) { $table = $argv[2]; }
else {
    $table = pathinfo($file);
    $table = $table['filename'];
}

/********************************************************************************/
// Get the first row to create the column headings

$fp = fopen($file, 'r');
$frow = fgetcsv($fp,0,$delimiter);
$columns = '';
foreach($frow as $column) {
    if($columns) $columns .= ', ';
    $columns .= "`$column` varchar(191)"; //Max varchar for utf8mb4!!
}

$create = "create table if not exists $table ($columns);";
//Debug
echo $create."\n";
mysqli_query($db,$create);

/********************************************************************************/
// Import the data into the newly created table.

$file = $_SERVER['PWD'].DIRECTORY_SEPARATOR.$file;
$q = "load data local infile '$file' into table $table fields terminated by '$delimiter' ignore 1 lines";
//Debug
echo $q."\n";
mysqli_query($db,$q);