Skip to main content

CSV to SQL: Generate Type-Inferred INSERT Statements Safely

A practical guide to converting CSV to SQL — type inference, single-quote escaping against injection, batched vs per-row INSERTs, and bulk imports into MySQL, PostgreSQL, and SQLite.

Published By 李雷
#csv #sql #database #developer-tools

CSV to SQL: Generating INSERT Statements That Actually Load

Every developer hits this the same way. Someone hands you a spreadsheet, says "load this into the database," and now you're staring at 400 rows wondering whether to hand-write a CREATE TABLE, fight your client's import wizard, or write a throwaway script. Converting CSV to SQL by hand is the kind of chore that looks trivial until a single apostrophe in a customer name breaks the whole batch.

This guide walks through what a real CSV-to-SQL conversion has to get right: inferring column types from the data, escaping values so the SQL parses, choosing between batched and per-row INSERTs, and importing the result without a server round-trip. The examples below use the CSV to SQL Converter, which runs entirely in the browser.

From a CSV row to an INSERT statement

Start with the smallest possible example. Here's a three-row CSV with a header:

name,price,in_stock,added
Widget,9.99,true,2026-01-05
O'Brien Mug,14.50,false,2026-02-11
Bolt,0.30,true,2026-03-02

Paste that, set the table name to products, pick PostgreSQL, and you get:

CREATE TABLE "products" (
  "name" VARCHAR(16),
  "price" DECIMAL,
  "in_stock" BOOLEAN,
  "added" DATE
);

INSERT INTO "products" ("name", "price", "in_stock", "added") VALUES
  ('Widget', 9.99, TRUE, '2026-01-05'),
  ('O''Brien Mug', 14.50, FALSE, '2026-02-11'),
  ('Bolt', 0.30, TRUE, '2026-03-02');

Notice three things the converter did without being told. The price column became DECIMAL because every value has a fractional part. O'Brien became 'O''Brien' — the apostrophe doubled so the string closes cleanly. And true/false rendered as TRUE/FALSE because PostgreSQL has a real boolean type. Switch the dialect to MySQL and the same booleans come out as 1/0 with backtick-quoted identifiers instead.

How type inference works (and where it bites)

The converter scans each column across all rows before deciding a type. If every non-empty cell is an integer, the column becomes INT; if they're all numeric with decimals, DECIMAL; true/false/1/0 patterns become BOOLEAN; anything matching YYYY-MM-DD becomes DATE; and everything else falls back to VARCHAR(n), where n is the next power of two above your longest value. One non-conforming cell in a column drops the whole column to VARCHAR, so a single stray value never produces an INSERT that fails to load.

That fallback is a safety feature, but inference still has a famous trap worth calling out. A column of US ZIP codes like 02134 is all digits, so it infers as INT — and INT silently drops the leading zero, turning 02134 into 2134. The same happens to phone numbers, SKUs with numeric prefixes, and any ID where formatting carries meaning. Always skim the generated CREATE TABLE and force VARCHAR on code-style columns. Inference is a starting point for a schema, not a substitute for reading it.

Escaping, NULLs, and the SQL-injection question

Value escaping is where generated SQL most often goes wrong, so it's worth being precise. The standard SQL escape for a single quote inside a string is to double it — ' becomes '' — and then wrap the whole value in single quotes. That's exactly what a parameterized database driver does internally, so O'Brien'O''Brien' is correct and safe to paste. Numbers and booleans stay unquoted; empty cells become NULL by default, which you can toggle off if a NOT NULL column genuinely wants empty strings instead.

One honest caveat about injection. According to the OWASP injection guidance, the durable defense against SQL injection is bound parameters, not careful string escaping. Generating literals from a CSV you control — a spreadsheet export, a test fixture, a seed file — is perfectly fine because you trust the source. But if your CSV originates from untrusted user input, don't paste generated literals into a production query path; route that data through prepared statements. The converter does correct escaping; it does not turn hostile input into a safe query.

Batched vs. per-row INSERTs

The converter offers two output shapes, and the right one depends on what you're doing with the file.

Batched mode emits a single statement — INSERT INTO t (...) VALUES (...),(...),(...); — which is the fastest way to bulk-load thousands of rows because the database parses and plans one statement instead of hundreds. Reach for this when you're seeding a dev database or importing a large export.

Per-row mode emits one statement per data row. It's slower to load but far friendlier to version control: when someone edits a fixture, the diff shows exactly which row changed, and a reviewer can comment out a single line to isolate a flaky test case. Keep a CSV under version control, convert with "One INSERT per row" on, and commit the generated .sql next to the test that consumes it.

Loading the result

Because there's no server round-trip, you copy the SQL or download a .sql file and pipe it straight into your database CLI:

# PostgreSQL
psql mydb < products.sql

# MySQL
mysql mydb < products.sql

# SQLite
sqlite3 app.db < products.sql

I leaned on this recently while moving a small lookup table out of an old MySQL box into SQLite for an offline build. I dumped the source table to CSV, pasted it, flipped the dialect to SQLite, and the identifier quoting changed from backticks to double quotes and the booleans rerendered as 1/0 automatically. What would have been twenty minutes of find-and-replace was one paste and a sqlite3 app.db < migrate.sql. The generated CREATE TABLE wasn't my final schema, but it was a correct skeleton I could tweak in thirty seconds rather than type from scratch.

If your CSV needs cleanup before conversion — different delimiters, a tabular preview, or a JSON intermediate — the CSV to JSON converter is a useful companion for inspecting structure first. And once the SQL is generated, the SQL formatter will pretty-print a dense batched INSERT into something readable before you commit it.

Quick checklist before you run it

  • Confirm your CSV has a real header row — the first line becomes your column names, and a missing header yields col_1, col_2.
  • Skim the inferred types and force VARCHAR on any ZIP, phone, or ID column with leading zeros.
  • Decide NULL-on-empty vs. empty-string based on your column constraints.
  • Pick batched for speed, per-row for reviewable fixtures.
  • Don't paste a share URL with real customer data into a public channel — the input syncs into the link.

Get those right and CSV-to-SQL stops being a chore and becomes a one-paste step that loads the first time, apostrophes and all.


Made by Toolora · Updated 2026-06-13