Script to copy csv data to a "2d hash as json"



I have a habit of running into ill-formatted data and using Python to get what I actually want. Python makes it super easy to read a file, CSV or otherwise, and create new files from it.

I was recently given a nasty xls file, which I was luckily able to copy the data out of into a CSV file.
But I needed this data for a front-end JS widget. What I wanted is what I would call a 2D dictionary/hash in Python. In JS that is basically just an object of objects, so I want a JSON file.

Take a look at, the input file and output file are there as well.

It works pretty well for my use, but it isn’t clean and should be broken up more.

Let me know if you have any thoughts on my Python, or any better solutions.
I just figured someone might get value from seeing something like this.
If anyone is interested we can clean it up together.


Try to avoid string concatenation (use of the + operator) like that. Either use string formatting or, better yet, use the json module to format your output.
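To illustrate why, here is a small sketch. The field name and value are made up for the example and are not from your script. It shows how a value containing quotes breaks hand-built output, while json.dumps escapes it correctly.

```python
import json

# Hypothetical value that contains double quotes.
name = 'Widget "A"'

# Hand-built output: the inner quotes are not escaped, so this is not valid JSON.
by_hand = '{"name": "' + name + '"}'

# json.dumps takes care of quoting and escaping.
by_json = json.dumps({"name": name})

print(by_hand)
print(by_json)
```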

My recommendation: If the CSV file is small, you could load it all into memory as a Python dict where each value is also a dict. You would then just use json.dump to write the top-level dict to the output. Given that you intend the result to be used by a JS widget, the data is probably small enough to use this approach. This approach is also the most readable.
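A minimal sketch of that in-memory approach, assuming the CSV has a header row and one column that can serve as the outer key. The function name, the column names, and the sample data are all invented for illustration, since I am guessing at the shape of your file.

```python
import csv
import io
import json


def csv_to_nested_dict(csv_file, key_column):
    """Read a CSV with a header row into a dict of dicts keyed by key_column."""
    table = {}
    for row in csv.DictReader(csv_file):
        key = row.pop(key_column)  # remove the key column from the inner dict
        table[key] = row           # remaining columns become the inner dict
    return table


# Stand-in for open("input.csv", newline=""); the data is made up.
sample = io.StringIO("id,name,price\na1,Widget,9.99\nb2,Gadget,19.50\n")

table = csv_to_nested_dict(sample, "id")
json_text = json.dumps(table, indent=2)
print(json_text)
```

Note that csv.DictReader returns every value as a string, so numeric columns would need converting before the dump if the widget expects numbers. To write straight to a file, json.dump(table, out_file) does the same thing as json.dumps but writes to an open file object.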

If the CSV file is too large, you might need to generate the output one record at a time. In this case, generate a Python dict for the input row and then use the json module to convert that to a str for output.
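Here is a rough sketch of the record-at-a-time variant, under the same assumptions as before (header row, one key column, invented names and data). Only the outer braces and commas are written by hand. Each key and each row still goes through json.dumps, so the escaping stays correct.

```python
import csv
import io
import json


def stream_csv_to_json(csv_file, out_file, key_column):
    """Write one JSON object of objects, holding only one CSV row in memory."""
    out_file.write("{")
    for i, row in enumerate(csv.DictReader(csv_file)):
        key = row.pop(key_column)
        if i > 0:
            out_file.write(",")  # separator between records
        # json.dumps handles quoting and escaping of the key and the row
        out_file.write("\n  %s: %s" % (json.dumps(key), json.dumps(row)))
    out_file.write("\n}\n")


# Stand-ins for real input and output files; the data is made up.
src = io.StringIO("id,name,price\na1,Widget,9.99\nb2,Gadget,19.50\n")
out = io.StringIO()
stream_csv_to_json(src, out, "id")
print(out.getvalue())
```

If a single top-level object is not required, writing one json.dumps(row) per line is even simpler, but then the consumer has to parse the file line by line.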


Ah yeah the json module, I can’t believe I forgot about that!

Otherwise yeah I would like to clean up all that string building.
That reminds me of a relevant link that I like: