Friday, November 20, 2020

Pandas Dataframe to JSON

 Pandas has a function (.to_json()) that takes a series/dataframe and converts it into JSON object.

It has an interesting parameter (orient) that allow different orientations of the JSON string format. Allowed values are: {'split','records','index','columns','values','table'}. Let see how it works.

Assuming we have the table below in out dataframe and now we want it as a JSON data for use on another web platform (JSON work better on web than tabular data).



As usual, we read the data table into a dataframe variable (here it is called: json_df).


Then the .to_json() function is called on the dataframe to convert it to JSON like this:-

json_df.to_json()

There are many parameters we can pass into the function such as path to save the JSON and orient to orient the JSON in different formats. Below, we will explore the various orient options available.


As you can see, each orient value has its own different JSON format or structure.

print(json_df.to_json(orient='columns'))
output = {"Question":{"0":"In what country is the city 'Tokyo'?","1":"In what country is the city 'Delhi'?","2":"In what country is the city 'Cairo'?","3":"In what country is the city 'Chongqing'?","4":"In what country is the city 'Orlando'?","5":"In what country is the city 'Abuja'?"},"Option 1":{"0":"Japan","1":"India","2":"Argentina","3":"Philippines","4":"Mexico","5":"Spain"},"Option 2":{"0":"India","1":"United States","2":"Bangladesh","3":"Pakistan","4":"Angola","5":"Nigeria"},"Option 3":{"0":"Brazil","1":"China","2":"Egypt","3":"China","4":"United States","5":"Canada"}}


print(json_df.to_json(orient='split'))
output = {"columns":["Question","Option 1","Option 2","Option 3"],"index":[0,1,2,3,4,5],"data":[["In what country is the city 'Tokyo'?","Japan","India","Brazil"],["In what country is the city 'Delhi'?","India","United States","China"],["In what country is the city 'Cairo'?","Argentina","Bangladesh","Egypt"],["In what country is the city 'Chongqing'?","Philippines","Pakistan","China"],["In what country is the city 'Orlando'?","Mexico","Angola","United States"],["In what country is the city 'Abuja'?","Spain","Nigeria","Canada"]]}


print(json_df.to_json(orient='records'))
output = [{"Question":"In what country is the city 'Tokyo'?","Option 1":"Japan","Option 2":"India","Option 3":"Brazil"},{"Question":"In what country is the city 'Delhi'?","Option 1":"India","Option 2":"United States","Option 3":"China"},{"Question":"In what country is the city 'Cairo'?","Option 1":"Argentina","Option 2":"Bangladesh","Option 3":"Egypt"},{"Question":"In what country is the city 'Chongqing'?","Option 1":"Philippines","Option 2":"Pakistan","Option 3":"China"},{"Question":"In what country is the city 'Orlando'?","Option 1":"Mexico","Option 2":"Angola","Option 3":"United States"},{"Question":"In what country is the city 'Abuja'?","Option 1":"Spain","Option 2":"Nigeria","Option 3":"Canada"}]


print(json_df.to_json(orient='values'))
output = [["In what country is the city 'Tokyo'?","Japan","India","Brazil"],["In what country is the city 'Delhi'?","India","United States","China"],["In what country is the city 'Cairo'?","Argentina","Bangladesh","Egypt"],["In what country is the city 'Chongqing'?","Philippines","Pakistan","China"],["In what country is the city 'Orlando'?","Mexico","Angola","United States"],["In what country is the city 'Abuja'?","Spain","Nigeria","Canada"]]


print(json_df.to_json(orient='table'))
output = {"schema": {"fields":[{"name":"index","type":"integer"},{"name":"Question","type":"string"},{"name":"Option 1","type":"string"},{"name":"Option 2","type":"string"},{"name":"Option 3","type":"string"}],"primaryKey":["index"],"pandas_version":"0.20.0"}, "data": [{"index":0,"Question":"In what country is the city 'Tokyo'?","Option 1":"Japan","Option 2":"India","Option 3":"Brazil"},{"index":1,"Question":"In what country is the city 'Delhi'?","Option 1":"India","Option 2":"United States","Option 3":"China"},{"index":2,"Question":"In what country is the city 'Cairo'?","Option 1":"Argentina","Option 2":"Bangladesh","Option 3":"Egypt"},{"index":3,"Question":"In what country is the city 'Chongqing'?","Option 1":"Philippines","Option 2":"Pakistan","Option 3":"China"},{"index":4,"Question":"In what country is the city 'Orlando'?","Option 1":"Mexico","Option 2":"Angola","Option 3":"United States"},{"index":5,"Question":"In what country is the city 'Abuja'?","Option 1":"Spain","Option 2":"Nigeria","Option 3":"Canada"}]}


print(json_df.to_json(orient='index'))
output = {"0":{"Question":"In what country is the city 'Tokyo'?","Option 1":"Japan","Option 2":"India","Option 3":"Brazil"},"1":{"Question":"In what country is the city 'Delhi'?","Option 1":"India","Option 2":"United States","Option 3":"China"},"2":{"Question":"In what country is the city 'Cairo'?","Option 1":"Argentina","Option 2":"Bangladesh","Option 3":"Egypt"},"3":{"Question":"In what country is the city 'Chongqing'?","Option 1":"Philippines","Option 2":"Pakistan","Option 3":"China"},"4":{"Question":"In what country is the city 'Orlando'?","Option 1":"Mexico","Option 2":"Angola","Option 3":"United States"},"5":{"Question":"In what country is the city 'Abuja'?","Option 1":"Spain","Option 2":"Nigeria","Option 3":"Canada"}}


The structure you will adopt will depend on where you want to use it. Usually, you would find sample JSON structure from the platform you intend to use. Compare the option and adopt the one that suit your project.

That is it!

No comments:

Post a Comment