Tuesday, December 29, 2020

Count Numbers of Rows from bulk excel files

Here I got many excel files in a folder and I wanted to know the number of rows in each file.

import glob
import pandas as pd
files = glob.glob("C:\\Users\\Yusuf_08039508010\\Desktop\\LinkedIn Profile Scrapping\\Old Project\\Second Project\\New folder\\*.xlsx")

# Get file names
file_names = [os.path.basename(x) for x in files]
# ---------------------

count_list = []
i = 0

for f in files:
    df = pd.read_excel(f)
    # count rows based on column index...
    rows_count = df.count()[0] # this will not count NaN

    data = file_names[i], rows_count # Combine filename and row count into a dataframe
    
    count_list.append(data)
    
    i = i+1

df_count = pd.DataFrame(count_list, columns=['File', 'Count'])

df_count

 This will return a table with each files name and corresponding number of rows in the file as seen below.


Enjoy!

No comments:

Post a Comment