The JSON trick 25% of Python devs don’t know about
Reading and writing JSON files is a common chore in Python. In fact our analysis of 888 codebases found that 17% of repositories either read from or wrote to JSON files. Of those 150 codebases 25% were making their lives harder than necessary by not using a built-in JSON feature. A key part of our job as devs is to reduce the complexity of the code we write, so this is a good area of improvement for one in four Python devs.
At the end of this article is a link to the stats. You may want to check the stats. You never know, your repo might be one of the 888 we checked!
Many of the occurrences were in big open source projects, so this is not just a rookie mistake. For fun we listed some notable examples near the end of the article.
Reading JSON files
Have you ever wondered why json.loads
ends with a “s”? The reason is because that method is specifically for working on a string, and there is another related method for working on files: json.load
.
json.loads
is for deserialising a string containing JSON, while json.load
(no "s") is for deserialising a file containing JSON.
The methods are very related. In fact, json.load
is a wrapper around json.loads
. It's a method provided out of the box by Python to simplify the task of reading JSON from file-like objects. For example, these result in the same outcome:
# bad
with open(‘some/path.json’, ‘r’) as f:
content = json.loads(f.read())# good
with open(‘some/path.json’, ‘r’) as f:
content = json.load(f)
So why not choose the easiest to read and write?
Writing JSON files
Similarly to json.loads
, json.dumps
also ends with an "s" because it's specifically for working on a string and there is another related method:json.dump
json.dumps
is for JSON serialising an object to a string, while json.dump
(no "s") is for JSON serialising an object to a file.
Again similar to the relationship between json.load
and json.loads
, json.dump
is a wrapper around json.dumps
. It's a method provided out of the box by Python to simplify the task of writing JSON to a file-like object.
For example, these result in the same outcome:
# bad
with open('some/path.json', 'w') as f:
f.write(json.dumps({'foo': 'bar'}))# good
with open('some/path.json', 'w') as f:
json.dump({'foo': 'bar'}, f)
So why not choose the easiest to read and write?
Reduce complexity
Python core developers are very careful about what features are included in the Python builtin modules because once they expose an API to devs it’s very hard for them to remove it. So we’ve been given some tools that simplify our handling of JSON files so it makes sense to utilise the convenience methods and not reinvent the wheel. For that reason when Code Review Doctor sees devs not using json.load
and json.dump
this advice is given:
Examples
Many of the occurrences were in big open source projects, so this is not just a mistake rookies make, but rather is a often overlooked feature of Python:
- Microsoft
- Ansible
- Unicef
- Sentry
- UK Department for International Trade
Stats
You can read the gists:
Reading JSON: json.load vs json.loads
Writing JSON: json.dump vs json.dumps
How did we find this statistic? At CodeReview.doctor we run static analysis checks on codebases to find bugs and mistakes most humans miss. During this process we collect stats on the public repos. We tracked every line we could find where developers interacted with JSON files. We were surprised by the results!
If you would like to check if you’re affected by issues like this one, check your codebase instantly for free at CodeReview.doctor.