Convert CSV Files to SQL Scripts (INSERT, UPDATE, DELETE) and Export to a .SQL File in Java
When working with large datasets stored in CSV files, sometimes with up to a million records, it’s common to need to migrate or update that data in a relational database. Depending on the context, you may need to insert new records, update existing ones, or even delete some records from the database.
In this tutorial, we’ll walk through the process of converting a CSV file into different SQL scripts (INSERT, UPDATE, and DELETE) using Java, and then export those scripts into a .sql file. We’ll keep the program efficient for large datasets by handling the file writing carefully. Let’s go!
Prerequisites
Before we start, make sure you have the following:
- Java installed on your machine.
- A CSV file containing your data.
- A basic understanding of Java programming and SQL.
Here’s an example of the structure of the CSV file:
id,name,email,level
1,Fredy Watt,fredy.watt@example.com,30
2,Jade Smith,jade.smith@example.com,25
3,Bob sponge,bob.sponge@example.com,40
Step 1: Reading the CSV File
First, we’ll write a method to read the CSV file. In Java, we can use a BufferedReader to process the file line by line.
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class CSVReader {

    public static List<String[]> readCSV(String filePath) {
        List<String[]> records = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new FileReader(filePath))) {
            String line;
            while ((line = br.readLine()) != null) {
                // Split each line on commas (assumes no commas inside field values)
                String[] values = line.split(",");
                records.add(values);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return records;
    }
}
This method reads each line of the CSV file, splits it by commas, and stores the resulting data in a list of string arrays.
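Note that `line.split(",")` assumes no field value itself contains a comma. If your CSV quotes fields such as `"Doe, Jane"`, a common workaround is to split only on commas that fall outside double quotes. Here is a minimal sketch of that idea (the class name and sample data are just for illustration):

```java
public class QuotedCsvSplit {

    // Split on commas that are followed by an even number of double quotes,
    // i.e. commas that sit outside any quoted field.
    public static String[] splitLine(String line) {
        return line.split(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)", -1);
    }

    public static void main(String[] args) {
        String[] fields = splitLine("2,\"Doe, Jane\",jane.doe@example.com,25");
        for (String field : fields) {
            System.out.println(field);
        }
    }
}
```

This keeps the surrounding quotes in the field value, so you may still want to strip them before generating SQL. For fully robust parsing, a dedicated CSV library is usually the safer choice.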
Step 2: Generating SQL INSERT, UPDATE, and DELETE Statements
We’ll now create methods to generate SQL scripts for INSERT, UPDATE, and DELETE queries. A SELECT generator could be built the same way, but it’s omitted in this example.
public class SQLGenerator {

    public static String generateInsert(String tableName, String[] headers, String[] values) {
        StringBuilder query = new StringBuilder("INSERT INTO ");
        query.append(tableName).append(" (");
        for (int i = 0; i < headers.length; i++) {
            query.append(headers[i]);
            if (i < headers.length - 1) query.append(", ");
        }
        query.append(") VALUES (");
        for (int i = 0; i < values.length; i++) {
            query.append("'").append(values[i]).append("'");
            if (i < values.length - 1) query.append(", ");
        }
        query.append(");");
        return query.toString();
    }

    public static String generateUpdate(String tableName, String[] headers, String[] values, String idColumn, String idValue) {
        StringBuilder query = new StringBuilder("UPDATE ");
        query.append(tableName).append(" SET ");
        for (int i = 0; i < headers.length; i++) {
            query.append(headers[i]).append(" = '").append(values[i]).append("'");
            if (i < headers.length - 1) query.append(", ");
        }
        query.append(" WHERE ").append(idColumn).append(" = '").append(idValue).append("';");
        return query.toString();
    }

    /**
     * @param tableName name of the table the script targets.
     * @param idColumn  name of the identifying column.
     * @param idValue   value of the identifying column.
     * @return the DELETE statement for that record.
     */
    public static String generateDelete(String tableName, String idColumn, String idValue) {
        return "DELETE FROM " + tableName + " WHERE " + idColumn + " = '" + idValue + "';";
    }
}
These methods generate SQL queries from the CSV data: the INSERT method builds an INSERT INTO statement, the UPDATE method creates an UPDATE query with a WHERE clause, and the DELETE method produces a DELETE statement keyed on the record ID. This can be very useful when migrating data from a spreadsheet into a SQL database.
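To see the shape of the output, here is a small standalone demo whose generator mirrors the INSERT logic above (the class name is just for this example), fed with the first row of our sample CSV:

```java
public class SQLGeneratorDemo {

    // Mirrors the article's generateInsert logic in a compact form.
    static String generateInsert(String tableName, String[] headers, String[] values) {
        return "INSERT INTO " + tableName
                + " (" + String.join(", ", headers) + ")"
                + " VALUES ('" + String.join("', '", values) + "');";
    }

    public static void main(String[] args) {
        String[] headers = {"id", "name", "email", "level"};
        String[] row = {"1", "Fredy Watt", "fredy.watt@example.com", "30"};
        System.out.println(generateInsert("users", headers, row));
        // INSERT INTO users (id, name, email, level) VALUES ('1', 'Fredy Watt', 'fredy.watt@example.com', '30');
    }
}
```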
Step 3: Writing SQL Queries to a .sql File
Now, instead of printing the queries to the console, we’ll write them to a .sql file. Since the dataset can be large (up to a million records), we’ll use a BufferedWriter to handle the file writing efficiently.
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.List;

/**
 * Writes the generated SQL statements to a .sql file.
 */
public class SQLFileWriter {

    public static void writeSQLFile(String filePathResult, List<String[]> records, String tableName, String idColumn) {
        File directory = new File("src" + File.separator + "main" + File.separator + "resources");
        if (!directory.exists() && !directory.mkdirs()) {
            return; // Could not create the output directory
        }
        // Build the complete path for the output file
        String filePath = directory.getPath() + File.separator + filePathResult;
        try (BufferedWriter writer = new BufferedWriter(new FileWriter(filePath))) {
            String[] headers = records.get(0); // First row holds the column names
            for (int i = 1; i < records.size(); i++) {
                String[] values = records.get(i);
                String idValue = values[0]; // Assuming the first column is the 'id'
                // Generate INSERT query
                writer.write(SQLGenerator.generateInsert(tableName, headers, values));
                writer.newLine();
                // Generate UPDATE query
                writer.write(SQLGenerator.generateUpdate(tableName, headers, values, idColumn, idValue));
                writer.newLine();
                // Generate DELETE query
                writer.write(SQLGenerator.generateDelete(tableName, idColumn, idValue));
                writer.newLine();
                // Flush periodically and report progress; printing per record
                // would slow things down considerably for a million rows
                if (i % 10000 == 0) {
                    writer.flush();
                    System.out.println("Processed " + i + " records...");
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Step 4: Main Class to Run the Process
Finally, we’ll create the main class that ties everything together. It reads the CSV file and writes the SQL queries to the output .sql file.
import java.util.List;

public class CSVtoSQLConverter {

    public static void main(String[] args) {
        String csvFilePath = "data.csv";   // Path to your CSV file
        String sqlFilePath = "output.sql"; // Output SQL file
        String tableName = "users";        // Your table name
        String idColumn = "id";            // Column used to identify each record

        // Read CSV file
        List<String[]> records = CSVReader.readCSV(csvFilePath);

        // Write SQL file
        SQLFileWriter.writeSQLFile(sqlFilePath, records, tableName, idColumn);
        System.out.println("SQL file generated successfully!");
    }
}
Step 5: Running the Code
When you run the program, it will create an output.sql file containing all the INSERT, UPDATE, and DELETE queries based on the CSV data.
Here’s an example of the content of the .sql file for the first record:
INSERT INTO users (id, name, email, level) VALUES ('1', 'Fredy Watt', 'fredy.watt@example.com', '30');
UPDATE users SET id = '1', name = 'Fredy Watt', email = 'fredy.watt@example.com', level = '30' WHERE id = '1';
DELETE FROM users WHERE id = '1';
Optimizing for Large Datasets
Handling a CSV file with up to a million records can cause memory issues if not optimized. Here are some strategies we implemented to optimize the process:
- BufferedReader/BufferedWriter: We use BufferedReader for reading the CSV file and BufferedWriter for writing the .sql file to minimize memory usage.
- Flushing the Writer: For large datasets, we periodically flush the BufferedWriter every 10,000 records to prevent the buffer from growing too large.
- Efficient Memory Handling: Since we process the records sequentially and write them to a file, the memory footprint is kept low.
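If even the record list itself is too large to hold in memory, the reading and writing steps can be fused into a single pass so that only one line lives in memory at a time. Here is a sketch of that variant for the INSERT case only; the class name and paths are hypothetical, and it reuses the same naive comma split as the tutorial:

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class StreamingConverter {

    // One-pass variant: each CSV line becomes an INSERT immediately,
    // so memory use stays constant regardless of file size.
    public static long stream(String csvPath, String sqlPath, String tableName) throws IOException {
        long count = 0;
        try (BufferedReader reader = new BufferedReader(new FileReader(csvPath));
             BufferedWriter writer = new BufferedWriter(new FileWriter(sqlPath))) {
            String headerLine = reader.readLine(); // First line holds the column names
            String columns = String.join(", ", headerLine.split(","));
            String line;
            while ((line = reader.readLine()) != null) {
                String values = String.join("', '", line.split(","));
                writer.write("INSERT INTO " + tableName + " (" + columns + ") VALUES ('" + values + "');");
                writer.newLine();
                count++;
            }
        }
        return count;
    }
}
```

The BufferedWriter flushes automatically when its internal buffer fills and again on close, so no explicit flush is needed here.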
Conclusion
By following this approach, you can efficiently convert a CSV file into SQL scripts for INSERT, UPDATE, and DELETE operations, and write them to a .sql file. This solution scales well even for large datasets of up to a million records.
With these optimizations in place, your application can handle large volumes of data while maintaining good performance and low memory usage. You can further enhance the logic to meet your specific requirements, such as adding data validation or dealing with edge cases in the CSV file.
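One edge case worth handling right away: the generators wrap every value in single quotes without escaping, so a value like O'Brien would break the generated script. A minimal helper, following the standard SQL convention of doubling single quotes, might look like this (the class name is just for illustration):

```java
public class SqlEscape {

    // Standard SQL escapes a single quote inside a string literal by doubling it.
    static String escape(String value) {
        return value.replace("'", "''");
    }

    public static void main(String[] args) {
        System.out.println("INSERT INTO users (name) VALUES ('" + escape("O'Brien") + "');");
        // INSERT INTO users (name) VALUES ('O''Brien');
    }
}
```

You could call such a helper on every value inside the generator methods. For data headed into a live database rather than a script file, prepared statements are the more robust answer.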
Happy coding!