IBepyProgrammer

Convert Excel Files to CSV

3 min read
Convert Excel Files to CSV

Suppose you are working on a project with large Excel files which contain a lot of data. Then suppose you are working with Power BI and connect the data from the Excel spreadsheet to Power Query and the data from the spreadsheet is large and with large data transformations, the performance is greatly impacted. One way to solve this problem is to convert the Excel spreadsheet into CSV files. This process will improve the performance.

In this article, we will build a simple Python script that will automate the conversion of Excel spreadsheets into CSV files.

The process for writing this script involves:

  • Navigate to the directory containing the files.

  • Once in the directory the Python script will only take the Excel files in the directory.

  • The script will then loop through every single Excel file and within the file convert every sheet into a CSV file. The script will work in converting a single sheet or multiple sheets within a single Excel file.

  • To avoid any name conflict, we will then concatenate the name of the Excel file to the name of the sheets in the Excel file. This will give us the names of sheets that are unique to the primary Excel file.

To start we will open up our code editor and for this article, we will be using Visual Studio code.

We will also be using the OS and Pandas libraries in Python. OS module is a standard utility that comes bundled in Python and provides functions that allow interaction with the operating system. In our case, we will be using the module to navigate directories.

Pandas on the other hand is a module that provides functions that are useful in cleaning, exploring, manipulating, and analyzing data sets. To install Pandas, we can open up the terminal and type in, pip install pandas.

import os
import pandas as pd

Let's add the path to where all the files are located.

path = "D:\Projects\Excel to CSV"

We can then check the contents of the folder specified in the path above.

files = os.listdir(path)
print(files)

We can then loop through the contents of the folder and only select the Excel files.

The code below checks if the file extension ends with the ".xlsx". If so then it will print out all files with that extension.

We can remove the extension of the Excel file in the final CSV file name.

Note that we only have the names of the files.

We then convert the names into Excel files using the Pandas library.

We then display the individual sheets present in the Excel files and print them out.

We can then take the data from each sheet in the Excel files, convert it to CSV files, and store it in the current folder.

This involves looping through each sheet in each Excel file and converting the data into a CSV file.

After we parse through the data in the sheets, we then grab the data and write it back into a CSV file.

Note that in the parenthesis above, we have written the created CSV file name to be dynamic.

This is because we know the variable "clean_name" contains the file name and "ind_sheet" contains the sheet name.

We then concatenate the file name and the sheet name to create unique names for each CSV file created.

for excelfile in files:
    if excelfile.endswith(".xlsx"):
        clean_name = excelfile.replace(".xlsx", "")
        print(excelfile)

        xlsx_file = pd.ExcelFile(excelfile)
        sheets = xlsx_file.sheet_names
        print(sheets)

        for ind_sheet in sheets:
            sheet_data = xlsx_file.parse(ind_sheet)
            csv_file_name = clean_name + "-" + ind_sheet + ".csv"
            sheet_data.to_csv(csv_file_name, index = False)

Conclusion

In this article, we learn how we can convert Excel files and corresponding Excel sheets into CSV files by writing a simple Python script. You can also find the code on my GitHub repository https://github.com/IBepyProgrammer/Convert-Excel-to-CSV.

If you found this article helpful consider subscribing and sharing.

Thank you.

Sign up for our newsletter

Don't miss anything. Get all the latest posts delivered to your inbox. No spam!