[python] BadZipFile: File is not a zip file in read_excel

Asked 2 weeks ago, Updated 2 weeks ago, 1 views

I want to load 12 CSV files and combine them into one. All 12 files have a .csv extension, but reading them as CSV gave an encoding error with every encoding I tried (utf-8, cp949, euc-kr, and ANSI), so I opened them with read_excel instead. I then put read_excel inside a function and got an engine error, so I added engine='openpyxl'. Now I get the error shown under the code.

import glob
import pandas as pd

def open_file(year):
    files = glob.glob('../data/10_Procurement history/{}/*.csv'.format(year))
    print(files)

    total_data =[]

    for i in files:       
        file = pd.read_excel(i , engine='openpyxl') 
        total_data.append(file)
        final = pd.concat(total_data)
    return final
data_2020 = open_file(2020)
data_2020.head()

BadZipFile: File is not a zip file

python

2022-09-20 11:13

2 Answers

If it really is a CSV file, open it with open() and print the first line to see what it actually contains.

I ran into this recently. A file with a .csv extension is not always laid out in comma-separated cells the way a normal Excel export would be. Sometimes it is a plain text file whose fields are separated by tabs, like this:

A1\tB1\tC1\t..
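
For reference, here is a minimal sketch of that check. BadZipFile comes up because engine='openpyxl' expects an .xlsx file, which is a zip archive, while a CSV is plain text. The helper names sniff_file and guess_delimiter are made up for illustration, and counting tabs against commas is only a rough heuristic, not a reliable format detector.

```python
# Every .xlsx file is a zip archive, and zip archives start with these bytes.
ZIP_MAGIC = b"PK\x03\x04"


def sniff_file(path, n_bytes=200):
    """Return (looks_like_xlsx, first_line) for a quick manual inspection.

    The file is read in binary mode so that an unknown encoding cannot
    raise a decoding error at this stage.
    """
    with open(path, "rb") as fh:
        head = fh.read(n_bytes)
    looks_like_xlsx = head.startswith(ZIP_MAGIC)
    lines = head.splitlines()
    first_line = lines[0] if lines else b""
    return looks_like_xlsx, first_line


def guess_delimiter(first_line):
    """Rough heuristic (an assumption): pick tab if it outnumbers commas."""
    if first_line.count(b"\t") > first_line.count(b","):
        return "\t"
    return ","
```

If sniff_file reports that the file does not look like .xlsx, read_excel with openpyxl cannot open it, and the guessed delimiter can be passed to pd.read_csv as sep.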


2022-09-20 11:13


    for i in files:       
        file = pd.read_excel(i , engine='openpyxl') 
        total_data.append(file)
    final = pd.concat(total_data) # <------ The indentation was wrong here: concat belongs outside the loop.
    return final
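
Note that fixing the indentation alone will probably not clear the BadZipFile error as long as read_excel is pointed at plain-text .csv files. Below is a sketch of the whole function using pd.read_csv instead. It assumes the files are delimited text. The encoding list, the helper name read_csv_any_encoding, and the pattern and sep parameters are assumptions added for illustration, so adjust them to the actual files.

```python
import glob

import pandas as pd

# Encodings to try in order. This list is an assumption; extend it if the
# files use something else.
ENCODINGS = ("utf-8-sig", "cp949", "utf-16")


def read_csv_any_encoding(path, sep=","):
    """Try each candidate encoding until one decodes the file."""
    last_error = None
    for enc in ENCODINGS:
        try:
            return pd.read_csv(path, encoding=enc, sep=sep)
        except UnicodeError as exc:  # includes UnicodeDecodeError
            last_error = exc
    raise last_error


def open_file(year, pattern="../data/10_Procurement history/{}/*.csv", sep=","):
    """Read every CSV file for the given year and stack them into one frame."""
    files = sorted(glob.glob(pattern.format(year)))
    if not files:
        # pd.concat raises a less helpful ValueError on an empty list.
        raise FileNotFoundError("no files match " + pattern.format(year))
    frames = [read_csv_any_encoding(f, sep=sep) for f in files]
    # Concatenate once, after the loop, instead of on every iteration.
    return pd.concat(frames, ignore_index=True)
```

For tab-separated files, call it as open_file(2020, sep="\t").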


2022-09-20 11:13



© 2022 pinfo. All rights reserved.