Saturday, August 7, 2021

Convert dataframe column of strings to lists and vice versa

Dataframe columns commonly load as object/string type. However, if you want to spread the values in each row across individual columns using an expression such as df = df_melt['col_name'].apply(pd.Series), the column of strings first has to be converted to a column of lists.
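To see why the lists are needed, here is a small sketch of what apply(pd.Series) does to a column that already holds lists. The dataframe df_demo and its values are made up for illustration and are not the data used in this post.

```python
import pandas as pd

# Hypothetical sample data, only for illustration.
df_demo = pd.DataFrame({
    "Language_list": [
        ["English", "French"],
        ["Cantonese", "Mandarin", "English"],
    ]
})

# Each list is turned into a Series, so every list element gets its own column.
# Rows with fewer elements are padded with NaN.
expanded = df_demo["Language_list"].apply(pd.Series)
print(expanded)
```

The result has one column per list position (0, 1, 2), which is the "rows into individual columns" layout described above.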

Read the data into a dataframe. Here we have multiple languages under the Language column that we want to convert into a list for further analysis.


So, for each row/cell what we have is something like this ('English\nCantonese\nMandarin\nFrench') and what we want is this (['English', 'Cantonese', 'Mandarin', 'French']), or vice versa.

# Convert 'String' column to list...
# Another solution: df_melt = df.assign(Language_list_melt=df.Language.str.split("\n"))
def str_to_list(a_str, separator='\n'):
    # Split one cell's string into a list of its parts
    return a_str.split(separator)

df['Language_list'] = df['Language'].apply(str_to_list)
# -------------------------------

# Convert 'List' column to string...
def list_to_str(a_list):
    # Joins with ", " for readability; use "\n".join(a_list) to get the original format back
    return ", ".join(a_list)

df['Language_str'] = df['Language_list'].apply(list_to_str)
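
For a quick check of both directions, here is a self-contained sketch that uses the built-in pandas string methods (str.split and str.join) instead of the helper functions. The dataframe df_demo and its two rows are assumptions made up for this example.

```python
import pandas as pd

# Hypothetical sample data shaped like the Language column described in this post.
df_demo = pd.DataFrame({
    "Language": ["English\nCantonese\nMandarin\nFrench", "English"]
})

# String -> list: split each cell on the newline character.
df_demo["Language_list"] = df_demo["Language"].str.split("\n")

# List -> string: join each list with ", ".
df_demo["Language_str"] = df_demo["Language_list"].str.join(", ")

print(df_demo[["Language_list", "Language_str"]])
```

One difference worth knowing: .str.split leaves missing values (NaN) as they are, while calling .split inside apply raises an AttributeError on a NaN cell, so the vectorized version may be the safer choice when the column has gaps.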

That is it!
